Develop a Research Proposal


Data Collection - Sampling

Sampling can be a somewhat complicated concept.  Review this basic explanation first.  It explains the special role that sampling plays in qualitative and quantitative research design and also does an excellent job discussing variables.  Pay special attention to the descriptions that appear in the charts and the illustrative examples on the page below.  You will need to make informed decisions for your own methodology, taking into account your research questions or hypothesis, your procedures, and the feasibility of your ideas.

Collecting data is time consuming and expensive, even for relatively small amounts of data, so it is highly unlikely that you will investigate a complete population. Time and cost will limit both the amount of data you collect and the number of people or organizations you contact. You will, therefore, have to take a sample, and usually a small one.

Sampling theory says a correctly taken sample of an appropriate size will yield results that can be applied to the population as a whole. There is a lot in this statement but the two fundamental questions to ensure generalization are:

  1. How is a sample taken correctly?
  2. How big should the sample be?

A population is the group that you want to study for your investigation, and about which you will draw a conclusion. You cannot always interview or survey an entire population, because it is too large or because you cannot reach all of its members (to try out a new medicine, for example). Instead you pick a sample, a smaller representative group, from which you will make generalizations about the population. For example, you and four of your classmates can be a sample of your class, a sample of your high school's students, a sample of your country, and so on. You can define a sample as a more concrete portion of the population or populations that you choose it to represent. A sampling frame is the list of all members of the population that you can actually reach, and it is the list from which the sample is drawn.

Figure: sampling illustration

The term sampling in qualitative designs can be used in two different ways:

  1. First, it is the group of individuals (the sample) chosen to represent a larger group (the population) when investigating a research question.
  2. Second, after an investigation has been conducted and a hypothesis has been made, it is the additional individuals selected at random from the study's population to test that hypothesis.

Below is a brief overview of how to create samples and why you would choose each kind.  A more detailed list of sample types can supply more specific ones when you need them later.  As an investigation progresses, new types of samples often need to be created for specific purposes.  The link above will assist you as you move further into developing a hypothesis and need more refined methods for sampling.

As you read, consider which one(s) might be appropriate for your research investigation. 

Probability sampling:
Probability sampling is a technique used to ensure that every element in a sampling frame has an equal chance of being incorporated into the sample. How can we then ensure that every element in the sampling frame has a chance of being included in our sample? Not an easy task… ah! Below are some ways by which we can try to accomplish probability sampling. Each has its own advantages and limitations. You can, however, combine different probability sampling procedures to obtain the sample you want, as long as you limit yourself to random selection procedures.

Different types of random sample procedures: 


Simple random sampling  

In simple random sampling, you use an unsystematic random selection process (i.e. you list every member of the sampling frame, then choose sample members purely by chance, so that every member has the same opportunity of being selected). What if the sampling frame is too large to draw names by hand? You can use a random number table, which is simply a table containing random numbers. To use the table, you need to determine the size of your sampling frame, and the numbers you read from the table must have enough digits to cover the largest number in the frame. Look at the table below with three columns and nine rows.

 24356   17625   28937
 46724   78256   97628
 25641   71522   09152
 67514   98127   61723
 98257   09161   91873
 98165   01823   91723
 35678   91728   87542
 87192   56720   19782
 98173   09765   25637

Suppose you number the 45 students in a class from 1 to 45, and you want to get the average math score for seven students. How do you go about choosing the seven students using the random number table above?  The first step is to decide how to move through the columns and rows, either up or down; whichever you choose, apply it systematically. Second, choose any starting point to select a sample of seven from the sampling frame of 45. Assume that you are using the first two digits of each number. Take column two, row seven. The digits are 9 and 1. You already have a problem: 91 is larger than 45, so you skip it. Take column one, row three. The digits are 2 and 5, and 25 is within the range of 45. This will be the first number to use from the sampling frame of 45.  You then continue until you have the seven students you need, skipping any number that repeats.   There are more strategies for simple random sampling described here.
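The table walk described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table is read as three columns of nine rows, the walk moves down each column and then on to the next, repeated numbers are skipped, and the function name pick_from_table is hypothetical. The last line shows how the standard library's random.sample could replace the printed table altogether.

```python
import random

# The random number table from the text, read as three columns of nine rows.
TABLE_COLUMNS = [
    ["24356", "46724", "25641", "67514", "98257", "98165", "35678", "87192", "98173"],
    ["17625", "78256", "71522", "98127", "09161", "01823", "91728", "56720", "09765"],
    ["28937", "97628", "09152", "61723", "91873", "91723", "87542", "19782", "25637"],
]

def pick_from_table(columns, frame_size, sample_size, start_col=0, start_row=0):
    """Read down each column in turn and keep the usable two-digit numbers."""
    cells = [(c, r) for c in range(len(columns)) for r in range(len(columns[c]))]
    start = cells.index((start_col, start_row))
    chosen = []
    for c, r in cells[start:] + cells[:start]:  # wrap around at the end of the table
        value = int(columns[c][r][:2])          # first two digits only
        # Keep the number only if it is inside the frame and not already chosen.
        if 1 <= value <= frame_size and value not in chosen:
            chosen.append(value)
        if len(chosen) == sample_size:
            break
    return chosen

# Start at column one, row three (indices 0 and 2), as in the text.
students = pick_from_table(TABLE_COLUMNS, frame_size=45, sample_size=7,
                           start_col=0, start_row=2)
print(students)  # [25, 35, 17, 9, 1, 28, 19]

# Software alternative: draw seven distinct numbers from 1 to 45 directly.
software_pick = random.sample(range(1, 46), 7)
```

With only 27 entries, a table this small can run out of usable numbers for larger samples; in that case the function simply returns fewer numbers than requested.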


Systematic random sampling

Systematic random sampling selects elements at a fixed interval from a randomly arranged sampling frame. You choose every “nth” element in the frame, e.g. every 10th, 15th, or 20th. What are the procedures involved? First you have to decide on your sample size. Say your sample will be made up of 30 students from a sampling frame of 400 students; the sampling fraction is 30/400 = 0.075, so the sampling interval is 400/30 ≈ 13.3, which means selecting every 13th person. Here is more information about systematic sampling.
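As a rough sketch of the 30-out-of-400 example, the Python below computes the interval and then takes every 13th student. Two things here are assumptions rather than part of the text: the starting position is drawn at random from the first interval (a common convention), and the function name systematic_sample and the numbered student list are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Take every k-th element of the frame, starting at a random position."""
    rng = random.Random(seed)
    interval = len(frame) // sample_size  # 400 // 30 = 13
    start = rng.randrange(interval)       # random start within the first interval
    # Slicing with a step picks every 13th element; trim any extra element.
    return frame[start::interval][:sample_size]

# Hypothetical frame: students numbered 1 to 400, assumed already in random order.
students = list(range(1, 401))
sample = systematic_sample(students, 30, seed=1)
print(len(sample), sample[:3])
```

Rounding the interval down to 13 can yield 31 candidates for some starting points, which is why the result is trimmed to the requested size.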

Both simple random sampling and systematic sampling require you to generate a list from the sampling frame. Both also require that the elements within the sampling frame be homogeneous, that is, similar to one another.

There are also times when you may need elements of the sample to be different within the group.  Then you would use:


Stratified random sampling  

When dealing with a sampling frame that is not homogeneous and contains subgroups, such as freshmen, juniors, and so on in a listing of university students, you will need to represent those subgroups in your sample. To achieve this, you select randomly from each subgroup in the sampling frame, using the same procedures for each. The subgroups within the sampling frame are treated as though they were separate sampling frames themselves.
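The idea of treating each subgroup as its own sampling frame might look like the following Python sketch. The class sizes, the 10% sampling fraction, and the names stratified_sample and frame are all made up for illustration; applying the same fraction to every stratum is one possible choice, not the only one.

```python
import random

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from each subgroup (stratum)."""
    rng = random.Random(seed)
    strata = {}
    for element in frame:
        # Group the frame by subgroup so each group acts as its own frame.
        strata.setdefault(stratum_of(element), []).append(element)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = round(len(members) * fraction)     # number to draw from this stratum
        sample.extend(rng.sample(members, k))  # simple random sample within it
    return sample

# Hypothetical listing of 400 university students as (id, class year) pairs.
sizes = {"freshman": 160, "sophomore": 120, "junior": 80, "senior": 40}
frame = [(f"{year}-{i}", year) for year, n in sizes.items() for i in range(n)]

sample = stratified_sample(frame, stratum_of=lambda s: s[1], fraction=0.10, seed=3)
counts = {year: sum(1 for s in sample if s[1] == year) for year in sizes}
print(counts)  # {'freshman': 16, 'sophomore': 12, 'junior': 8, 'senior': 4}
```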


Cluster sampling

Have you ever imagined a situation in which a sampling frame list is unavailable and you, as a researcher, still have to continue with your work? For instance, how would you go about obtaining a sampling frame list of all college students in the United States? Hard! Would you abandon your research on this basis? If I were you I would say a big NO! Why? Through cluster sampling, you can randomly select hierarchical groups from the sampling frame by creating clusters, which can be further sampled into finer categories of clusters until you obtain a list of elements.

Here's a scenario that illustrates how cluster sampling is achieved.  To select a random sample of 600 college students in the United States, for instance, you create a sampling frame consisting of a list of colleges. Using simple random sampling, you select six colleges from the list. Then, within these clusters, you randomly select 100 students from each of the six colleges to obtain 600 students residing in the United States. See the layers of organization in the samples?
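The two layers in this scenario (colleges first, then students within each chosen college) can be sketched in Python as below. The 40 colleges, their 500-student rosters, and the name two_stage_cluster_sample are invented for the example; in practice you would only need student lists for the six colleges actually selected.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly pick clusters, then randomly pick elements inside each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick colleges
    sample = []
    for name in chosen:
        # Stage 2: simple random sample of students within the chosen college.
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return chosen, sample

# Hypothetical frame: 40 colleges, each with a roster of 500 student ids.
colleges = {
    f"College {i:02d}": [f"C{i:02d}-S{j:04d}" for j in range(1, 501)]
    for i in range(1, 41)
}

chosen_colleges, students = two_stage_cluster_sample(colleges, 6, 100, seed=7)
print(len(chosen_colleges), len(students))  # 6 600
```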

Non-probability sampling:
Non-probability sampling is any procedure in which elements do not have an equal opportunity of being included in a sample. In non-probability sampling, you set criteria for elements to be included in the sample, e.g. on the basis of region, appearance, and so forth, which limits the chances of representation in the sample. The simple question one is obliged to ask is: why bother with non-probability sampling when we already know that there is bias in sample selection? The answer is that non-probability sampling nevertheless plays a major role in exploratory research. Its major limitation is that it does not use random selection, and therefore you cannot estimate sampling error.

We will now look at some examples of non-probability sampling used in research.


Accidental sampling  

This is selection based on availability or ease of inclusion. Assume that you were walking down the street and an interviewer chose to videotape you for the evening local news broadcast. Can this be termed accidental or random sampling? I would argue strongly for accidental sampling because this was a mere selection based on your availability and willingness to talk. Accidental sampling can lead to misinterpretation of results. Do you recall the outcome of the 1936 U.S. presidential elections? OK, so you were not yet born by that time. Try to find this publication and see what results accidental sampling can produce.


Purposive sampling

In exploratory or pilot projects you may deliberately choose to obtain data from specific individuals. Such data may support the internal validity of your project, but not its external validity: you may not be able to generalize the findings to other places and people.  Your conclusions can only be made about the sample group you collected data from, and you cannot extend them to other groups or the larger population.


Quota sampling

In quota sampling, you select sampling elements on the basis of categories that are assumed to exist within a population. How is quota sampling different from the stratified random sampling discussed earlier? In stratified random sampling, elements are randomly selected from each subgroup, while in quota sampling a presumed subdivision is used as the basis of the selection procedure and elements within it are not chosen at random. Although the results of quota sampling may closely resemble the population, it is difficult to determine the amount of sampling error. Do you remember the famous 1948 photograph of Harry Truman holding a newspaper with the infamous headline…DEWEY DEFEATS TRUMAN? Find out more about this story and reflect on how quota sampling may lead to our inability to generalize to a population.
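To make the contrast with stratified sampling concrete, here is a small Python sketch of quota filling. It assumes the interviewer simply takes people in the order they become available until each category's quota is full; the people, categories, quotas, and the name quota_sample are all hypothetical. Note that nothing in it is random, which is why sampling error cannot be estimated.

```python
def quota_sample(arrivals, category_of, quotas):
    """Accept people in arrival order until every category quota is filled."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    sample = []
    for person in arrivals:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)  # quota still open: accept this person
            remaining[category] -= 1
        if not any(remaining.values()):
            break                  # every quota is full
    return sample

# Hypothetical passers-by as (name, category) pairs, in the order they appear.
arrivals = [("A", "urban"), ("B", "urban"), ("C", "rural"),
            ("D", "urban"), ("E", "rural"), ("F", "rural")]

sample = quota_sample(arrivals, category_of=lambda p: p[1],
                      quotas={"urban": 2, "rural": 2})
print([name for name, _ in sample])  # ['A', 'B', 'C', 'E']
```

Person D is turned away because the urban quota is already full, and F is never reached: who ends up in the sample depends on who shows up first, not on chance.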

Please keep in mind that a strong research design and analytical approach will:

  • Incorporate more than one of the sampling strategies described above.
  • Include a sampling approach in which the research team moves back and forth (iterating) between sampling and analyzing data, so that preliminary analytical findings shape subsequent sampling choices.  The process is sample, study, analyze, and reform ideas, repeated until a clear understanding is reached.

Here is an interesting sampling simulation for an illustrative example.

Remember that the tools discussed in the next step of the research design path will need sampling procedures to provide subjects for study.  The validity and reliability of your results will rely on your sampling methods.  Ethics are also a consideration for your sampling procedures.  The decisions that you make here will be part of the foundation of your methodology.

Here is a section from a research proposal that explains the sampling used and the rationale for it.  Use it as an example for explanations and phrasing.

Sample Proposal Description of a Sampling Strategy

Population and Sampling Plan

The adult home is not a health care facility, but rather a residential setting for independent older adults who require only minimal services, such as assistance with housekeeping, one or two daily meals, and transportation to meet medical, grocery shopping, and other needs. It has been determined that 271 people, 65 years of age or older, live in this setting. In addition, 346 people live in the 220 apartments contained within the senior citizen complex. Some tenants may receive special services based on income and all are eligible for various social and community activities, but all are independent in terms of transportation, meeting medical needs, and involvement with others throughout the community.

A random sample will be drawn from the list of residents obtained for both settings. Using a table of random numbers, the names of individuals will be selected from each setting until a minimum of 110 people in each setting is obtained. It is anticipated that if fewer than 95 people per setting initially respond to the instruments described below, names will continue to be drawn from the remaining individuals until at least 95 people from each setting have completed the two forms. It is hoped that at least 100 people from each site will complete the forms.

It is expected that obtaining a minimum of 190 people as described in the previous paragraph will result in a good cross section of subjects in terms of gender, age, and residential setting. In addition, the normal variations in life satisfaction and SDLRS scores among at least 190 people will enable statistical comparisons for the study’s hypotheses that provide new information about older adults.

Now go to your planning guide to make and record decisions about your own sampling for your quantitative investigation.
