6. Samples and Populations

We have used a sample of data to derive two simple linear regression models that estimate Green Leaf Area Index (GLAI) from Ratio Vegetation Index (RVI) data.

Winter Wheat: GLAI = 0.162576·RVI + 0.2882
Spring Barley: GLAI = 0.254203·RVI - 0.4335

GLAI can never be negative, yet the Spring Barley equation gives negative values whenever RVI falls below about 1.71 (that is, 0.4335 / 0.254203). The sample used to fit the model is not representative of populations that contain low values of RVI.
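The RVI value at which each fitted line reaches zero can be worked out directly from its coefficients. The following is a minimal sketch that uses only the two equations above; the names `MODELS`, `glai` and `zero_crossing` are illustrative and not part of the original analysis.

```python
# Slope and intercept of the two fitted lines, copied from the equations above.
MODELS = {
    "Winter Wheat": (0.162576, 0.2882),
    "Spring Barley": (0.254203, -0.4335),
}


def glai(crop, rvi):
    """Estimate GLAI from RVI using the fitted line for the given crop."""
    slope, intercept = MODELS[crop]
    return slope * rvi + intercept


def zero_crossing(crop):
    """Return the RVI value at which the fitted line predicts GLAI = 0."""
    slope, intercept = MODELS[crop]
    return -intercept / slope


for crop in MODELS:
    print(f"{crop}: predicted GLAI reaches 0 at RVI = {zero_crossing(crop):.3f}")

# An RVI below the Spring Barley crossing point gives an impossible estimate.
print(f"Spring Barley at RVI = 1.0: GLAI = {glai('Spring Barley', 1.0):.4f}")
```

The Spring Barley line crosses zero at a positive RVI, so any lower RVI produces a negative estimate. The Winter Wheat line crosses zero only at a negative RVI, so it stays positive for every non-negative RVI.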

A sample is a set of members from a population selected so as to represent that population.

A sample is smaller than the population, often much smaller. Its members must be chosen so that the sample is representative, that is, so that measurements taken from the sample can be used to estimate the corresponding characteristics of the population. Sampling is the process of taking a sample from a population, and statistics is often about analysing a sample so as to draw conclusions about the population. Sampling is usually conducted to reduce a problem to manageable proportions, to reduce cost, or to reduce the time needed.
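As a rough illustration of estimating a population characteristic from a sample, the sketch below draws a simple random sample and compares its mean with the population mean. The population here is made-up data generated for the example, and simple random sampling is only one possible way of choosing a sample.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up measurements (not real data).
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

# Simple random sample without replacement:
# every member of the population is equally likely to be chosen.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population size: {len(population)}, mean: {population_mean:.2f}")
print(f"sample size:     {len(sample)}, mean: {sample_mean:.2f}")
```

The sample is one hundredth of the size of the population, yet its mean should land close to the population mean, which is what makes the smaller and cheaper measurement useful.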

A population consists of all of those members who share specific attributes that define the boundaries of the population.

Population boundaries can be defined in many ways; a typical way is by geographic area. Thus the population of a country is all of the people who live within that country's borders. But populations can be defined in other ways: a population might be all female members of a species, for example, or all twins. A set of specifications sets the criteria by which a population is defined. How you define a population often depends on the question that you wish to ask.

Bacteria in a petri dish.
Photo: ???

The bacteria in this Petri dish make up the population of the Petri dish, but the dish itself may well be one sample from a much larger population. Depending on the question that you wish to ask, you may need to sample the bacteria within the dish. If the sample you take within the Petri dish does not represent the Petri dish population, then it is very unlikely to represent the larger population.

If sampling is designed to reduce the magnitude of a problem, then the sample should be as small as possible. But you cannot take a sample of one; with a sample of one the answer will be simply right or wrong, whereas the reality is much more subtle. There is a minimum size that a sample must reach to ensure that it is representative of the population.
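The effect of sample size can be seen by drawing many samples of each size and looking at how much their means vary. This is a sketch on made-up data, with an illustrative helper named `spread_of_sample_means`; it shows the trend but does not by itself tell you the minimum sample size for a given problem.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up measurements (not real data).
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]


def spread_of_sample_means(n, repeats=500):
    """Draw `repeats` random samples of size n and return the
    standard deviation of their means."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


spreads = {n: spread_of_sample_means(n) for n in (1, 10, 100)}

for n, spread in spreads.items():
    print(f"sample size {n:>3}: spread of sample means = {spread:.2f}")
```

With a sample of one, the estimate varies as much as the individual members of the population do. Larger samples give estimates that cluster more tightly around the population mean, at the price of more measurement effort.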

Questions

  1. What is the population if you want to select a site for a shop that is to sell extra-large women's clothes in your country?

  2. Having selected a site, what is the population to assess the demand for clothes from the shop?