Data Collection & Sample Types [Statistics]

in STEMGeeks

Hi there. In this post, I provide a short overview of sample types when it comes to data collection and statistics.

A big part of data analysis, statistics, business intelligence and artificial intelligence is data collection. Data is collected for analysis and to find insights on items or people. Whatever is found from the data can be used for decision-making purposes.

A population is the whole group of interest that researchers want to know about. As populations can be really big, researchers take a sample from the population instead. Larger samples generally give more reliable results, but time constraints and budget concerns limit how large a sample can be.

 

Topics


  • Simple Random Sampling
  • Stratified Sampling
  • Cluster Sampling
  • Systematic Sampling
  • Convenience Sampling
  • Voluntary Sampling

 

There are more sampling techniques, but this post covers only these ones.

 

Simple Random Sampling


With simple random sampling, or SRS for short, each person or object in a population has an equal chance of being selected. The selected people or items are then studied to learn about the population.
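As a rough illustration, here is a minimal Python sketch of simple random sampling. The population of 1,000 numbered members, the sample size of 50 and the seed are all assumptions made up for this example.

```python
import random

# Hypothetical population: 1,000 member IDs (made up for illustration).
population = list(range(1, 1001))

# Fixed seed so the draw can be repeated; any seed would do.
rng = random.Random(42)

# random.sample draws without replacement, so each member has the same
# chance of being selected and nobody is picked twice.
sample = rng.sample(population, k=50)

print(len(sample))  # number of members selected
```

Drawing without replacement is the usual choice here, since surveying the same person twice adds no new information.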



 

Stratified Sampling


Stratified sampling involves separating the population into groups, or strata. Then you choose a certain number of people or items from each of the strata.

As an example, you select ten male athletes and then ten female athletes at a school. This is preferable to selecting 20 random athletes, where there may be an unequal number of males and females.

Another example is stratified sampling over regions. Five representatives would be chosen from each of thirty districts for a study.
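The athlete example could be sketched in Python along these lines. The roster, the group sizes (60 male, 40 female) and the helper name `stratified_sample` are hypothetical and only serve the illustration.

```python
import random

# Hypothetical roster of (name, group) pairs; sizes are made up.
athletes = [(f"M{i}", "male") for i in range(60)] + \
           [(f"F{i}", "female") for i in range(40)]

def stratified_sample(items, key, per_stratum, rng):
    """Split items into strata by key, then draw per_stratum from each."""
    strata = {}
    for item in items:
        strata.setdefault(key(item), []).append(item)
    # Simple random sampling is applied inside every stratum.
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)
picked = stratified_sample(athletes, key=lambda a: a[1],
                           per_stratum=10, rng=rng)

for group, members in picked.items():
    print(group, len(members))
```

Every stratum contributes the same number of people here, which matches the ten-and-ten example. Sampling in proportion to each stratum's size is another common option.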


 

Cluster Sampling


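Cluster sampling is somewhat similar to stratified sampling. With cluster sampling, the population of interest is divided into smaller groups called clusters. Then you randomly select a few clusters and study the people or items inside them. The difference from stratified sampling is that only the selected clusters are studied, while stratified sampling takes some members from every group.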
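Here is a minimal Python sketch of one-stage cluster sampling, where everyone in the chosen clusters is included. The 30 districts with 20 residents each and the choice of three clusters are assumptions for this example.

```python
import random

# Hypothetical population: 30 districts (clusters), 20 residents each.
clusters = {
    f"district_{d}": [f"d{d}_resident_{r}" for r in range(20)]
    for d in range(1, 31)
}

rng = random.Random(1)

# Randomly choose whole clusters, not individual residents.
chosen = rng.sample(sorted(clusters), k=3)

# One-stage cluster sampling: everyone in a chosen cluster is studied.
sample = [person for name in chosen for person in clusters[name]]

print(chosen)
print(len(sample))
```

A two-stage variant would draw a random sample inside each chosen cluster instead of taking every member.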



 

Systematic Sampling


Systematic sampling is a simple sampling method where participants are selected at fixed intervals. One example is selecting every fifth member from a list of 1800 people, which gives a sample of 360 people.
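The every-fifth-member example can be sketched in Python as below. Picking a random starting position within the first interval is a common practice that I am assuming here, and the numbered list of members is made up.

```python
import random

# Hypothetical list of 1800 numbered members.
population = list(range(1, 1801))
interval = 5  # take every fifth member

rng = random.Random(7)

# Assumption: start at a random position within the first interval,
# then step through the list at the fixed interval.
start = rng.randrange(interval)
sample = population[start::interval]

print(len(sample))  # 1800 / 5 = 360 members
```

Whatever the starting position, the sample has 360 members because 1800 divides evenly by 5.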



 

Convenience Sampling


Convenience sampling is a sampling method where researchers select participants that are easy to reach. If researchers want to survey people about the cost of food, they might ask people passing by outside a grocery store.

This sampling method is quick and cost-effective, but it is prone to bias. Surveying people outside a Walmart would lead to different answers than surveying people outside a Costco. The researcher's choice of where the survey takes place introduces selection bias.



 

Voluntary Sampling


Voluntary sampling is where participants choose to take part in a study or survey. Instead of researchers finding participants, the researchers invite people or advertise the study.

These studies can be fast and cost-effective. One big problem is that this method favours those who want to participate in the study. Some people may not even know of such a study.

Examples of voluntary sampling include online surveys for rewards and clinical trial studies that pay participants with certain attributes for completing tasks over a few days, weeks or months.



 

Closing Notes/Summary


There is no best sampling method for all scenarios. Some researchers would prefer one over another based on time constraints, financial budgets and the ability to recruit people for studies.

The goal of sampling is to gain as much information as possible. The number of people sampled is almost always nowhere near the size of the population, and you can rarely gather information from everyone. With that in mind, the data obtained will never be perfect. Some information is better than none when it comes to data-driven insights and decision making.

You also have ethics and privacy concerns when it comes to data collection. Some people may not want to participate in research studies. Some may get angry if information is gathered without their consent.

 

Thank you for reading.