Stratified random sampling

July 27, 2023

Researchers can use stratified random sampling to draw a sample that closely reflects the population under study. Sampling is the process of selecting a subset of a population in order to make statistical inferences about the whole. Stratified random sampling divides the whole population into homogeneous groups known as strata and then draws a random sample from each. Proportionate stratified random sampling entails randomly selecting samples from each stratum in proportion to its share of the population. In disproportionate sampling, the samples drawn from the strata are not proportionate to their share of the population. Stratified random sampling differs from simple random sampling, which selects data at random from the entire population so that every potential sample has an equal chance of occurring.

What is stratified random sampling?

Stratified sampling divides a diverse population into multiple homogeneous subgroups, or strata, and draws a sample from each. A "stratum" is a group; the plural form is strata. Thus, stratification is the method of categorizing or grouping things or data. Stratified sampling may be proportionate or disproportionate. Proportionate stratified random sampling occurs when samples are drawn from each category in the same percentage or ratio. Disproportionate stratified random sampling occurs when the sampling ratio differs from one category to another. Because each subgroup is appropriately represented, a stratified sample is more representative of the entire population.

The concept of stratified random sampling

The term stratified comes from the word "strata," meaning layers or groups. Accordingly, stratified random sampling begins by dividing the data into different categories. Each category contains elements with comparable characteristics. For evaluation, a sample is then chosen from each category.

Please keep in mind that the strata must be mutually exclusive and collectively exhaustive. In simpler terms, every element of the population belongs to exactly one stratum, and an element from one stratum cannot also appear in another. Duplicating data across strata may lead to incorrect findings. The primary goal of this strategy is to guarantee that the overall sample contains a diverse range of elements from the population, so that the sample closely mirrors the entire population.

Suppose a research team surveys consumer food tastes and preferences on behalf of an FMCG firm. The team divides the population into three groups: men, women, and children. The population of interest is nearly one million people. Talking to a million individuals and collecting their feedback is not practical. It is far easier and less time-consuming to divide the population into three groups and choose a subset of individuals from each, say 10% of each group.

The people chosen will represent their group in the sample. Their views are assumed to reflect those of most people in their group. Thus, stratified sampling ensures that each category or group is adequately represented in the sample. As a consequence, the survey findings will represent the entire population. Investment managers commonly use this probability sampling strategy to build client portfolios. It aims for the required returns by replicating an index, such as a stock or bond index.

It is also a common technique in auditing and vouching. An auditor, often a Certified Public Accountant (CPA), will use this technique for vouching and verification while inspecting a company's books. The method suits auditors well because they can build different groups or subgroups based on the amounts involved. It helps reduce the sample size while maintaining the reliability of the sample obtained.

In contrast to simple random sampling, which treats every member of the population as equal and gives each an equal chance of being sampled, stratified sampling is used to bring out differences across groups in a population.

Various stratified random sampling techniques

Stratified random sampling ensures that every subgroup of the population is appropriately represented within a research study's overall sample. Stratification can be proportionate or disproportionate.

i. Proportionate stratified sampling

In this case, the same proportion of items is chosen from each stratum, so each stratum's sample size is proportionate to its population. The overall sample is formed by combining the samples from all strata.

ii. Disproportionate stratified sampling

Each stratum's sample size is not proportional to the stratum's population size. With this strategy, the researcher does not draw samples from each group in the same proportion. As a result, the raw sample does not mirror the composition of the population, and some strata are over- or under-represented relative to their true share.
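
When strata are sampled at different rates, a population-level estimate is commonly formed by weighting each stratum's result by its share of the population. The following is a minimal Python sketch of that idea. The stratum names, population counts, and sample means are hypothetical values chosen only for illustration.

```python
def weighted_mean(stratum_sizes, stratum_sample_means):
    """Combine per-stratum sample means using population shares as weights."""
    population = sum(stratum_sizes.values())
    return sum(
        stratum_sizes[name] / population * stratum_sample_means[name]
        for name in stratum_sizes
    )

# Hypothetical strata: population counts and the mean observed in each sample.
sizes = {"urban": 8_000, "rural": 2_000}
sample_means = {"urban": 50.0, "rural": 30.0}

# Suppose both strata contributed the same number of respondents
# (a disproportionate design). A plain average treats the strata as equal.
unweighted = sum(sample_means.values()) / len(sample_means)

# The weighted estimate respects each stratum's share of the population.
weighted = weighted_mean(sizes, sample_means)

print(unweighted, weighted)
```

The plain average lets the small stratum count as much as the large one, while the weighted estimate restores each stratum's true share.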

The formula for stratified random sampling

There is no single formula for this type of sampling, since choices such as how to divide the population into subgroups or strata and what total sample size to use are left to the researcher's judgment.

However, under proportional sampling, the following method may be used to calculate the sample size for each subgroup:

Sample size of a stratum = (Total sample size / Entire population) * Population of the stratum
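
The formula above can be sketched in Python as follows. The function name, the stratum names, and the population counts are hypothetical. The rounding rule (round down, then give leftover units to the largest remainders) is one reasonable assumption for keeping the total fixed, not something the formula itself prescribes.

```python
def proportional_allocation(stratum_sizes, total_sample_size):
    """Return per-stratum sample sizes proportional to stratum population."""
    population = sum(stratum_sizes.values())
    # Exact (possibly fractional) share for each stratum:
    # (total sample size / entire population) * population of the stratum.
    exact = {
        name: total_sample_size * size / population
        for name, size in stratum_sizes.items()
    }
    # Round down first, then hand the leftover units to the strata with the
    # largest fractional remainders so the total sample size is preserved.
    allocation = {name: int(share) for name, share in exact.items()}
    leftover = total_sample_size - sum(allocation.values())
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - allocation[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Hypothetical population of 1,000,000 people split into three strata.
strata = {"men": 400_000, "women": 450_000, "children": 150_000}
allocation = proportional_allocation(strata, 1_000)
print(allocation)  # {'men': 400, 'women': 450, 'children': 150}
```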

Real-world illustration

A study published on medRxiv looks at the feasibility of stratified random sampling to estimate COVID-19 incidence in the U.S. state of Maryland. In this study, the population of Maryland was stratified, or split, by county. Then, from each county, people representing their stratum were chosen. According to the research, the stratified sampling approach is adequate for determining COVID-19 prevalence. However, the stratified sample must be adjusted for the risk of misclassification to avoid under- or overestimating the number of COVID-19 cases.

How to carry out stratified random sampling in eight steps

The steps for selecting a stratified random sample are as follows:

i. Identify and define the target population.

ii. Choose the stratification variable or variables and determine the number of strata to use. These stratification variables should be consistent with the goal of the research. Any additional information available about the population can inform this choice. For example, if the study aims to understand every subgroup, the stratification variables should be the ones that define those subgroups.

iii. Use an existing sampling frame, or develop one, that includes the stratification variables for every element in the target population.

iv. Assess the sampling frame for undercoverage, overcoverage, or clustering, and make any adjustments needed.

v. Define the strata so that each one is distinct and together they cover every member of the population. Differences within a stratum should be minimal, while each stratum should differ clearly from the others. Each member of the population should belong to only one stratum.

vi. Give each element a random, unique number.

vii. Determine the sample size for each stratum based on your needs. How the elements are distributed across the strata will determine the type of sampling to use, which may be proportionate or disproportionate.

viii. Randomly choose elements from each stratum to assemble the sample. At least one element must be picked from each stratum so that every stratum is represented, but at least two elements per stratum are needed to compute the margin of error from the collected data.

This strategy ensures that the sample is representative of the population and lowers sampling error, resulting in more reliable results.
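
The steps above can be sketched in Python using only the standard library. This is an illustrative outline under stated assumptions, not a definitive implementation: the function name, the frame of 1,000 labelled people, and the 10% sampling fraction are hypothetical, and the sketch covers the proportionate case only.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=0):
    """Draw a proportionate stratified random sample from a sampling frame.

    frame      -- list of population elements (the sampling frame)
    stratum_of -- function mapping an element to its stratum label
    fraction   -- sampling fraction applied within every stratum
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Group the frame into strata; each element lands in exactly one stratum.
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)

    sample = []
    for members in strata.values():
        # Take at least two per stratum (when available) so that
        # within-stratum variability can be estimated afterwards.
        k = max(min(2, len(members)), round(fraction * len(members)))
        # random.sample draws without replacement, so no element repeats.
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical frame: 1,000 people, each tagged with a group label.
frame = (
    [(i, "men") for i in range(400)]
    + [(i, "women") for i in range(400, 850)]
    + [(i, "children") for i in range(850, 1000)]
)

sample = stratified_sample(frame, stratum_of=lambda person: person[1], fraction=0.10)

counts = defaultdict(int)
for _, group in sample:
    counts[group] += 1
print(dict(counts))  # {'men': 40, 'women': 45, 'children': 15}
```

Each group appears in the sample in the same proportion as in the frame, which is the defining property of the proportionate design.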

Applications of stratified random sampling

Stratified random sampling should be used when the population under investigation contains various subgroups and the researcher aims to draw conclusions about every category or about the population as a whole. This approach is practical in the following cases:

i. Stratified random sampling is a very productive type of sampling when the researcher wants to focus on particular strata within the available population data. The necessary features of each stratum can then be reflected in the survey sample.

ii. Researchers use this sampling strategy when they want to compare two or more distinct strata. The target groups are less likely to be evenly represented if the comparison is carried out through simple random sampling.

iii. Samples from a group that is difficult to access or contact can be deliberately included in the research process by employing the stratified random sampling approach.

iv. The reliability of the statistical results is higher than with simple random sampling because the sample elements are drawn from relevant strata. Variability within each stratum will be substantially lower than variability in the target population as a whole. Because of this gain in precision, the required sample size will be smaller, saving researchers time and effort.

v. Public opinion surveys frequently utilize stratified random sampling to reflect the population in terms of demographic variables such as area, political beliefs, or age.

Stratified random sampling is practical when the population is divided into separate subgroups and the purpose is to draw conclusions about each subgroup or about the population as a whole.
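
The precision benefit mentioned in the list above can be illustrated with a small simulation. This is a sketch under artificial assumptions: the population is synthetic, consisting of two equally sized hypothetical strata whose values differ sharply from each other but vary little internally. The sample size of 20 and the 500 repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed for reproducibility

# Hypothetical population: two strata with very different typical values.
low = [rng.gauss(20, 2) for _ in range(5_000)]
high = [rng.gauss(80, 2) for _ in range(5_000)]
population = low + high

def srs_mean(n):
    """Mean of a simple random sample of size n from the whole population."""
    return statistics.fmean(rng.sample(population, n))

def stratified_mean(n):
    """Mean of a proportionate stratified sample (the strata are equal-sized)."""
    half = n // 2
    return statistics.fmean(rng.sample(low, half) + rng.sample(high, half))

# Repeat each design many times and compare how much the estimates vary.
srs_estimates = [srs_mean(20) for _ in range(500)]
strat_estimates = [stratified_mean(20) for _ in range(500)]

srs_spread = statistics.stdev(srs_estimates)
strat_spread = statistics.stdev(strat_estimates)
print(f"simple random: {srs_spread:.2f}, stratified: {strat_spread:.2f}")
```

In this setup the stratified estimates cluster far more tightly around the true mean, because every sample is forced to contain the same mix of the two strata. The gain shrinks when strata are less distinct from one another.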

Contrasting simple and stratified random samples

Both simple and stratified random sampling are statistical sampling methods. A simple random sample is drawn from the entire population to represent it as a whole. A stratified random sample first splits the population into smaller categories, or strata, based on shared traits. Stratified sampling, however, is more sophisticated, more time-consuming, and possibly more expensive to implement than simple random sampling.

A simple random sample is frequently employed when little is known about the population, when the population varies in too many ways to divide into a manageable number of subsets, or when the population has just one distinguishing attribute.

Cluster sampling versus stratified sampling

Cluster sampling and stratified sampling are both methods of random sampling. In stratified random sampling, the population is divided into subgroups whose elements share the same properties. Following this division, samples are drawn from each stratum to reflect the population mix.

Cluster sampling, on the other hand, also separates the total population into subgroups. Here, however, the groups are heterogeneous, with each cluster comprising various elements with varying qualities. In this procedure, one or more clusters are picked at random, and the members of those clusters constitute the final sample. A cluster can serve as a sample because it mirrors the entire diverse population.

Stratified sampling refers to the groups as strata, whereas cluster sampling refers to them as clusters. Furthermore, in stratified sampling the sample consists of elements drawn from every stratum, whereas in cluster sampling whole clusters are selected as the sample. In stratified random sampling, the researcher creates strata that differ from one another but are homogeneous internally. In cluster sampling, the researcher creates clusters that resemble one another but are heterogeneous internally.
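
The difference in how the two methods select elements can be sketched in Python. The population of six schools with 50 students each is hypothetical, and the cluster version assumes the simple one-stage design described above, in which every member of a chosen cluster is kept.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical population: 6 schools (groups) with 50 students each.
groups = {f"school_{g}": [f"s{g}_{i}" for i in range(50)] for g in range(6)}

def stratified(groups, per_group):
    """Treat every group as a stratum: sample a few members from each one."""
    return {name: rng.sample(members, per_group) for name, members in groups.items()}

def cluster(groups, n_clusters):
    """Treat every group as a cluster: pick whole groups at random, keep all members."""
    chosen = rng.sample(sorted(groups), n_clusters)
    return {name: list(groups[name]) for name in chosen}

strat = stratified(groups, per_group=10)
clus = cluster(groups, n_clusters=2)

print(len(strat), "groups represented,", sum(map(len, strat.values())), "students")
print(len(clus), "groups represented,", sum(map(len, clus.values())), "students")
```

The stratified sample touches every group but takes only part of each, while the cluster sample takes entire groups but leaves most groups out.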

The benefits of stratified random sampling

This sampling strategy is effective for the following reasons:

i. Produces a diverse sample

Creating subgroups is an effective way to pick a sample with diverse characteristics. As a result, the samples from each subgroup or stratum represent the total population.

ii. Fair evaluation

The sample draws on every category and gives each an appropriate weight, which supports impartial interpretation.

iii. Efficient and reliable outcomes

Samples drawn equitably from every class or group with diverse qualities produce efficient and valuable results.

iv. Conserves time and money

Analyzing the entire population is time-consuming and wasteful. By contrast, this approach helps select a closely matching sample of a much smaller size, saving the researcher time and money.

v. Allows for comparative study

The whole population is divided into separate strata according to its characteristics. As a result, the data from each of these groups may be compared and examined independently.

The drawbacks of stratified random sampling

Without a doubt, this random sampling approach streamlines the research or analysis procedure. It is, nevertheless, prone to mistakes and inaccuracies. Let us look at some of the constraints that limit its applicability:

i. Limited scope

This approach cannot be applied when information on the population's different traits and composition is lacking. As a result, it cannot be used in all types of research.

ii. Difficulty in choosing strata

Defining the categories or groupings is another significant challenge. The researcher must decide which traits to examine and which elements to include in or exclude from each group.

iii. Unsuitable for small population size

When the population size is small, sampling is unnecessary. Instead, the entire population can be studied.

iv. Bias

This approach is heavily influenced by the researcher's choice of groups, which may not always be impartial. Furthermore, people differ in outlook and ability, which may affect the sample.

Conclusion

Stratified sampling is a form of random sampling that divides the population into subcategories, or strata, and draws a random sample from each. Each stratum is made up of elements that share attributes. This sampling technique is commonly employed in research on human populations and in political polls. It differs from simple random sampling, which draws a sample from the whole population without regard to subgroups. It is a dependable approach because elements with varied characteristics are drawn from every stratum, which yields a sample that is more representative of the total population.