How to Collect Data? Population or Sample?
You can take one of two approaches to the statistical analysis in your research: analyze the entire population or analyze a sample drawn from it. Remember that the word ‘population’ refers not only to people but to any elements you would like to study - objects, trends, species, countries, etc.
A population is the whole group of people or things you want to draw conclusions about, while a sample is the part of that population you actually collect data from. The sample size depends on the complexity of the research and the overall size of the population.
Examples of Population | Examples of Sample |
---|---|
✔️ All low-income high-school math students in Ontario. | ✔️ 300 low-income high-school math students from Ontario with experience of taking math exams. |
✔️ All countries of Asia. | ✔️ All developing countries of Asia with a favorable demographic situation. |
How to Collect Data from a Population
This method is used when the aim of the research requires data from every member of the population and you have good access to all of them. It is feasible only if the population is small and cooperative.
The educational authorities in Ontario have decided to collect math exam results from three high schools and to use the findings to draw up instructions for improving future exam procedures.
When the population is rather large or dispersed, you will not be able to collect data from every individual.
You want to research the effect of low income on the math exam results of high-school students in Ontario. However, many people from low-income groups will not be eager to answer your questions or participate in the survey. The result set you receive will be incomplete.
That is why it is more practical to survey volunteer students who agree to take part or to choose some students from every school to answer the questions. Choosing students at random from every school keeps the results less biased and more proportionate to the total number of low-income students who have difficulties passing their math exams, whereas relying on volunteers alone can skew the results toward those who are willing to respond.
How to Collect Data from a Sample
You can use sample data for statistical analysis when the population is too large or unavailable. You can make estimates and test your hypotheses adequately by using that method.
You need to check the preferences of the high school graduates in Arizona for their future careers. The overall number of graduates in 2022 is about 75,000 people. Collecting data from all of them is not practical because many individuals are not ready to answer your questions, and others are indifferent to their choice or to the survey itself. It makes sense here to form a group of 250 people who will volunteer to participate in your survey.
It is always better, of course, if the sample is selected on a random basis, using a probability sampling approach such as simple random or stratified sampling. In this way, you can reduce the risk of bias and improve the internal and external validity of the results.
However, non-probability sampling is used more often because it is more practical. The participants are chosen according to certain criteria, and recruiting them is cheaper and more convenient. The statistical inferences here can be weaker because only those individuals who want to participate in the research are involved.
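The difference between simple random and stratified sampling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of 75,000 graduates, the `region` field, and the helper names `simple_random_sample` and `stratified_sample` are all hypothetical, and the stratified version uses plain proportional allocation, so rounding can leave the total slightly below the requested size.

```python
import random
from collections import defaultdict


def simple_random_sample(population, n, seed=0):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_sample(population, key, n, seed=0):
    """Draw about n members, giving each stratum a share proportional to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[key(member)].append(member)
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding may make the total differ slightly from n.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Hypothetical sampling frame: 75,000 graduates spread evenly over three made-up regions.
regions = ("north", "central", "south")
graduates = [{"id": i, "region": regions[i % 3]} for i in range(75_000)]

srs = simple_random_sample(graduates, 250)
strat = stratified_sample(graduates, lambda g: g["region"], 250)

print(len(srs), len(strat))
```

Stratifying guarantees that each region appears in the sample in proportion to its size, while a simple random sample only achieves that on average.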
Why You Need Sampling
There are several reasons why you may need sampling. Among them are the following:
- Practicality. It is always easier and more convenient to collect data from a sample.
- Manageability. Smaller amounts of data are easier to store, process, and check for errors, which helps keep the outcomes reliable.
- Necessity. Very often, the whole population can be inaccessible, or it can be so large in size that doing surveys does not seem possible.
- Cost-effectiveness. Surveying a sample costs less and requires fewer laboratory facilities, helpers, participants, and pieces of equipment.
What to Consider - Sample Statistic or Population Parameter?
You use different measures when you collect data from the entire population and from a sample. A measure that describes the whole population is called a parameter, while a measure that describes the sample is called a statistic.
When you want to do hypothesis testing or estimation, you need to know how far the sample statistic may differ from the population parameter.
You ask the survey participants in your sample whether their level of income has ever influenced their math exam results. You offer a scale from 1 to 10 and ask them to rate the influence, where 1 is minimal and 10 is maximal. The mean rating you receive is 4.2, and it is your sample statistic.
Based on it, you can assume that the income level does not considerably affect the exam results in the entire population. It means that you have made an inference about the relevant population parameter - low income has a minor influence on the results of math exams in Ontario high-school graduates.
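As a rough illustration of how a sample statistic is turned into an estimate of a population parameter, the sketch below computes a sample mean, its standard error, and an approximate 95% confidence interval. The ten ratings are made up purely so that their mean equals 4.2; they are not survey data, and the normal approximation used here is an assumption that is crude for so few responses.

```python
import math
import statistics

# Made-up ratings on the 1-10 scale, chosen only so that their mean is 4.2.
ratings = [3, 4, 5, 4, 6, 2, 5, 4, 4, 5]

sample_mean = statistics.mean(ratings)    # the sample statistic
sample_sd = statistics.stdev(ratings)     # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / math.sqrt(len(ratings))

# Approximate 95% interval for the population mean (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.2f}, SE = {standard_error:.3f}")
print(f"approximate 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

With a sample this small, an interval based on the t-distribution would be somewhat wider. The interval is also only meaningful if the sample was drawn at random from the population.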
Sampling Error
However, you may face a substantial difference between the assumed population parameter and the sample statistic obtained from the survey. It may turn out that your 250 volunteers did not feel much pressure from their low income while preparing for and taking their math exam. The others, meanwhile, may have been completely frustrated with their situation: they could not concentrate, had to work part-time, did not have time to prepare properly, and experienced many other distracting factors.
The discrepancy arose because the sample was not randomly selected: it included only those students who wanted to take part in the survey, without any other criteria. You do not even know whether all of them had very low incomes. Such a volunteer sample can differ systematically from the population. Even a properly random sample is never identical to the whole population, so its numerical measures - means and standard deviations - will differ somewhat from the population values, and that difference is the sampling error.
Of course, while conducting the research, you need to reduce the sampling error. You can do it by increasing your sample size or by defining stricter criteria for how the sample is chosen.
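The effect of sample size on sampling error can be illustrated with a small simulation. Everything here is synthetic: the population of 75,000 ratings is generated at random, and the helper `mean_abs_sampling_error` is a hypothetical name introduced for this sketch. Because the data are simulated, the population parameter is known exactly, which is never the case in a real survey.

```python
import random
import statistics

rng = random.Random(42)

# Synthetic population: 75,000 made-up ratings on a 1-10 scale.
population = [rng.randint(1, 10) for _ in range(75_000)]
parameter = statistics.mean(population)  # known only because the data are simulated


def mean_abs_sampling_error(n, repeats=100):
    """Average |sample mean - population mean| over repeated simple random samples of size n."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        errors.append(abs(statistics.mean(sample) - parameter))
    return statistics.mean(errors)


errors_by_size = {n: mean_abs_sampling_error(n) for n in (25, 250, 2500)}
for n, err in errors_by_size.items():
    print(f"n = {n:>4}: average sampling error = {err:.3f}")
```

The average error shrinks as the sample grows, roughly in proportion to one over the square root of the sample size. A larger sample does not remove the bias of a non-random sample, though; it only reduces the random part of the error.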
Final Thoughts
Now you can collect the data for your analysis using either of the two approaches - from the entire population or from a sample of it. The choice of approach depends on the complexity of the research, the amount of data needed for hypothesis testing, and the overall goals and objectives of the research. You can choose the approach that fits your needs or, where both are feasible, use both and compare the results at the end.