# The Population Mean – Explanation & Examples

The definition of the population mean is:

“The population mean is the mean or average found in a population.”

In this topic, we will discuss the population mean from the following aspects:

• What is the population mean?
• How to find the population mean?
• The population mean formula.
• The role of the population mean.
• Practice questions.

## What is the population mean?

The population mean is the mean value of a numerical characteristic of the population. The population is the whole group of items we want to study. These items can be individuals, things, animals, plants, etc.

Examples of populations include all individuals living in the U.S., all chairs produced by a certain factory, all tigers living in the rain forests of Indonesia, and all orange trees in Egypt.

The numerical characteristics of these populations could be the weights of the individuals, the leg lengths of the chairs, the tail lengths of the tigers, and the heights of the orange trees.

However, collecting information from the whole population is often not possible because of the resources it requires.

For example, suppose we want to study the heights of American males. We could survey every American male and record his height. This is population data.

Alternatively, we can select 200 American males and measure their heights. This is sample data.

The mean calculated from population data is denoted by the Greek letter μ, pronounced “mu.”

## How to find the population mean?

We have two cases:

1. We have population data, so we calculate the population mean directly from it.
2. We have sample data and use the sample mean to construct an interval that most likely contains the population mean.

### – Examples of population data

#### – Example 1

The following are the murder rates (per 100,000 population) for the 50 states of the U.S. in 1976. What is the mean murder rate?

We have information about all states of the U.S. so this is population data.

#### Note

Whether this dataset is a sample or a population depends on the question being asked.

It is a sample if we want to study the murder rate of the U.S. states in the 1970s (for example, 1970-1980), because it covers only 1 year out of those 10.

It is a population if we want to study the murder rate of the U.S. states in 1976.

| state | murder rate (per 100,000) |
|---|---|
| Alabama | 15.1 |
| Alaska | 11.3 |
| Arizona | 7.8 |
| Arkansas | 10.1 |
| California | 10.3 |
| Colorado | 6.8 |
| Connecticut | 3.1 |
| Delaware | 6.2 |
| Florida | 10.7 |
| Georgia | 13.9 |
| Hawaii | 6.2 |
| Idaho | 5.3 |
| Illinois | 10.3 |
| Indiana | 7.1 |
| Iowa | 2.3 |
| Kansas | 4.5 |
| Kentucky | 10.6 |
| Louisiana | 13.2 |
| Maine | 2.7 |
| Maryland | 8.5 |
| Massachusetts | 3.3 |
| Michigan | 11.1 |
| Minnesota | 2.3 |
| Mississippi | 12.5 |
| Missouri | 9.3 |
| Montana | 5.0 |
| Nebraska | 2.9 |
| Nevada | 11.5 |
| New Hampshire | 3.3 |
| New Jersey | 5.2 |
| New Mexico | 9.7 |
| New York | 10.9 |
| North Carolina | 11.1 |
| North Dakota | 1.4 |
| Ohio | 7.4 |
| Oklahoma | 6.4 |
| Oregon | 4.2 |
| Pennsylvania | 6.1 |
| Rhode Island | 2.4 |
| South Carolina | 11.6 |
| South Dakota | 1.7 |
| Tennessee | 11.0 |
| Texas | 12.2 |
| Utah | 4.5 |
| Vermont | 5.5 |
| Virginia | 9.5 |
| Washington | 4.3 |
| West Virginia | 6.7 |
| Wisconsin | 3.0 |
| Wyoming | 6.9 |

1. Add up all of the numbers:

15.1+ 11.3+ 7.8+ 10.1+ 10.3+ 6.8+ 3.1+ 6.2+ 10.7+ 13.9+ 6.2+ 5.3+ 10.3+ 7.1+ 2.3+ 4.5+ 10.6+ 13.2+ 2.7+ 8.5+ 3.3+ 11.1+ 2.3+ 12.5+ 9.3+ 5.0+ 2.9+ 11.5+ 3.3+ 5.2+ 9.7+ 10.9+ 11.1+ 1.4+ 7.4+ 6.4+ 4.2+ 6.1+ 2.4+ 11.6+ 1.7+ 11.0+ 12.2+ 4.5+ 5.5+ 9.5+ 4.3+ 6.7+ 3.0+ 6.9 = 368.9.

2. Count the number of items in your population. In this population, there are 50 items or 50 states.

3. Divide the number you found in Step 1 by the number you found in Step 2.

The population mean = 368.9/50 = 7.378.

Note that the population mean has the same unit as the original data. So 7.378 is the mean murder rate per 100,000 population.
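The three steps above can be sketched in Python. This is only an illustration of the arithmetic; the variable names are chosen here for readability and are not part of the original analysis.

```python
# Murder rates (per 100,000 population) for the 50 U.S. states in 1976,
# copied from the table above in alphabetical state order.
murder_rates = [
    15.1, 11.3, 7.8, 10.1, 10.3, 6.8, 3.1, 6.2, 10.7, 13.9,
    6.2, 5.3, 10.3, 7.1, 2.3, 4.5, 10.6, 13.2, 2.7, 8.5,
    3.3, 11.1, 2.3, 12.5, 9.3, 5.0, 2.9, 11.5, 3.3, 5.2,
    9.7, 10.9, 11.1, 1.4, 7.4, 6.4, 4.2, 6.1, 2.4, 11.6,
    1.7, 11.0, 12.2, 4.5, 5.5, 9.5, 4.3, 6.7, 3.0, 6.9,
]

total = sum(murder_rates)                   # Step 1: add up all of the numbers
population_size = len(murder_rates)         # Step 2: count the items
population_mean = total / population_size   # Step 3: divide the sum by the count

print(f"sum = {total:.1f}, N = {population_size}, mean = {population_mean:.3f}")
```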

#### – Example 2

The following are the weights (in grams) for 71 chickens on a certain farm. What is the mean?

We have the weights of all chickens on the farm, so this is population data.

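| chicken number | weight (g) |
|---|---|
| 1 | 179 |
| 2 | 160 |
| 3 | 136 |
| 4 | 227 |
| 5 | 217 |
| 6 | 168 |
| 7 | 108 |
| 8 | 124 |
| 9 | 143 |
| 10 | 140 |
| 11 | 309 |
| 12 | 229 |
| 13 | 181 |
| 14 | 141 |
| 15 | 260 |
| 16 | 203 |
| 17 | 148 |
| 18 | 169 |
| 19 | 213 |
| 20 | 257 |
| 21 | 244 |
| 22 | 271 |
| 23 | 243 |
| 24 | 230 |
| 25 | 248 |
| 26 | 327 |
| 27 | 329 |
| 28 | 250 |
| 29 | 193 |
| 30 | 271 |
| 31 | 316 |
| 32 | 267 |
| 33 | 199 |
| 34 | 171 |
| 35 | 158 |
| 36 | 248 |
| 37 | 423 |
| 38 | 340 |
| 39 | 392 |
| 40 | 339 |
| 41 | 341 |
| 42 | 226 |
| 43 | 320 |
| 44 | 295 |
| 45 | 334 |
| 46 | 322 |
| 47 | 297 |
| 48 | 318 |
| 49 | 325 |
| 50 | 257 |
| 51 | 303 |
| 52 | 315 |
| 53 | 380 |
| 54 | 153 |
| 55 | 263 |
| 56 | 242 |
| 57 | 206 |
| 58 | 344 |
| 59 | 258 |
| 60 | 368 |
| 61 | 390 |
| 62 | 379 |
| 63 | 260 |
| 64 | 404 |
| 65 | 318 |
| 66 | 352 |
| 67 | 359 |
| 68 | 216 |
| 69 | 222 |
| 70 | 283 |
| 71 | 332 |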

1. Add up all of the numbers:

179+ 160+ 136+ 227+ 217+ 168+ 108+ 124+ 143+ 140+ 309+ 229+ 181+ 141+ 260+ 203+ 148+ 169+ 213+ 257+ 244+ 271+ 243+ 230+ 248+ 327+ 329+ 250+ 193+ 271+ 316+ 267+ 199+ 171+ 158+ 248+ 423+ 340+ 392+ 339+ 341+ 226+ 320+ 295+ 334+ 322+ 297+ 318+ 325+ 257+ 303+ 315+ 380+ 153+ 263+ 242+ 206+ 344+ 258+ 368+ 390+ 379+ 260+ 404+ 318+ 352+ 359+ 216+ 222+ 283+ 332 = 18553.

2. Count the number of items in your population. In this population, there are 71 items or chickens.
3. Divide the number you found in Step 1 by the number you found in Step 2.

The population mean = 18553/71 = 261.3 grams.

#### – Example 3

The following is the trunk circumference (in mm) for 35 orange trees on a certain farm. What is the mean?

We have the trunk circumferences of all trees on the farm, so this is population data.

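| tree number | circumference (mm) |
|---|---|
| 1 | 30 |
| 2 | 58 |
| 3 | 87 |
| 4 | 115 |
| 5 | 120 |
| 6 | 142 |
| 7 | 145 |
| 8 | 33 |
| 9 | 69 |
| 10 | 111 |
| 11 | 156 |
| 12 | 172 |
| 13 | 203 |
| 14 | 203 |
| 15 | 30 |
| 16 | 51 |
| 17 | 75 |
| 18 | 108 |
| 19 | 115 |
| 20 | 139 |
| 21 | 140 |
| 22 | 32 |
| 23 | 62 |
| 24 | 112 |
| 25 | 167 |
| 26 | 179 |
| 27 | 209 |
| 28 | 214 |
| 29 | 30 |
| 30 | 49 |
| 31 | 81 |
| 32 | 125 |
| 33 | 142 |
| 34 | 174 |
| 35 | 177 |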

1. Add up all of the numbers:

30+ 58+ 87+ 115+ 120+ 142+ 145+ 33+ 69+ 111+ 156+ 172+ 203+ 203+ 30+ 51+ 75+ 108+ 115+ 139+ 140+ 32+ 62+ 112+ 167+ 179+ 209+ 214+ 30+ 49+ 81+ 125+ 142+ 174+ 177 = 4055.

2. Count the number of items in your population. In this population, there are 35 items or trees.

3. Divide the number you found in Step 1 by the number you found in Step 2.

The population mean = 4055/35 = 115.8571 mm.
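As a cross-check, the same result can be obtained with Python's standard `statistics` module instead of summing and dividing by hand. This is a minimal sketch; `statistics.fmean` requires Python 3.8 or later, and the variable names are illustrative.

```python
import statistics

# Trunk circumferences (mm) of the 35 orange trees, copied from the table above.
circumferences = [
    30, 58, 87, 115, 120, 142, 145,
    33, 69, 111, 156, 172, 203, 203,
    30, 51, 75, 108, 115, 139, 140,
    32, 62, 112, 167, 179, 209, 214,
    30, 49, 81, 125, 142, 174, 177,
]

# fmean converts the values to floats and returns their arithmetic mean.
population_mean = statistics.fmean(circumferences)

print(f"N = {len(circumferences)}, mean = {population_mean:.4f} mm")
```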

### – Examples of sample data

For samples with a size greater than 30, an interval that is likely to contain the population mean is calculated by:

$$\bar{x} \pm 1.96 \times \frac{s}{\sqrt{n}}$$

Where:

$\bar{x}$ is the calculated sample mean.

s is the standard deviation of the sample. It is a measure of the data spread.

n is the sample size.

This interval (called the 95% confidence interval) gives a range of plausible values for the unknown mean of the population from which the sample was taken.
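The formula above can be wrapped in a small Python helper. This is a sketch rather than a library routine: the function name `confidence_interval_95` and the input values in the usage line are hypothetical and chosen only for illustration.

```python
import math

Z_95 = 1.96  # multiplier used in the formula above for a 95% interval


def confidence_interval_95(sample_mean, sample_sd, n):
    """Return (lower, upper) for sample_mean +/- 1.96 * sample_sd / sqrt(n)."""
    margin = Z_95 * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs (not taken from the examples in this article).
lower, upper = confidence_interval_95(sample_mean=100.0, sample_sd=15.0, n=36)
print(f"{lower:.2f} to {upper:.2f}")
```

With these inputs the margin is 1.96 × 15 / 6 = 4.9, so the interval runs from 95.1 to 104.9.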

#### – Example 4

The following is the age (in years) of 50 randomly selected individuals from a certain population. If you know that this sample’s standard deviation is 18.65, construct a 95% confidence interval for the true population mean.

89 61 74 85 46 60 41 18 37 30 44 37 51 53 74 38 56 48 52 62 33 56 38 30 43 32 74 27 49 53 40 27 42 60 88 22 59 43 69 75 28 47 35 62 65 31 22 31 26 83.

1. Add up all of the numbers:

89+ 61+ 74+ 85+ 46+ 60+ 41+ 18+ 37+ 30+ 44+ 37+ 51+ 53+ 74+ 38+ 56+ 48+ 52+ 62+ 33+ 56+ 38+ 30+ 43+ 32+ 74+ 27+ 49+ 53+ 40+ 27+ 42+ 60+ 88+ 22+ 59+ 43+ 69+ 75+ 28+ 47+ 35+ 62+ 65+ 31+ 22+ 31+ 26+ 83 = 2446.

2. Count the number of items in your sample. In this sample, there are 50 items or persons.

3. Divide the number you found in Step 1 by the number you found in Step 2.

The sample mean = 2446/50 = 48.92 years.

4. The 95% confidence interval is:

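$$\bar{x} \pm 1.96 \times \frac{s}{\sqrt{n}}$$

that is, from $\bar{x} - 1.96 \times s/\sqrt{n}$ to $\bar{x} + 1.96 \times s/\sqrt{n}$.

$48.92 - 1.96 \times 18.65/\sqrt{50}$ to $48.92 + 1.96 \times 18.65/\sqrt{50}$, or 43.75 to 54.09.

This means that plausible values for the true population mean age range from 43.75 years to 54.09 years.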
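Steps 1 to 4 of this example can be sketched in Python as follows. The standard deviation is taken as given in the problem statement (18.65) rather than recomputed from the data; the variable names are illustrative.

```python
import math

# Ages (years) of the 50 sampled individuals, copied from the example.
ages = [
    89, 61, 74, 85, 46, 60, 41, 18, 37, 30,
    44, 37, 51, 53, 74, 38, 56, 48, 52, 62,
    33, 56, 38, 30, 43, 32, 74, 27, 49, 53,
    40, 27, 42, 60, 88, 22, 59, 43, 69, 75,
    28, 47, 35, 62, 65, 31, 22, 31, 26, 83,
]
sample_sd = 18.65  # given in the problem statement

n = len(ages)                                # Step 2: sample size
sample_mean = sum(ages) / n                  # Steps 1 and 3: sum, then divide
margin = 1.96 * sample_sd / math.sqrt(n)     # Step 4: half-width of the interval
lower, upper = sample_mean - margin, sample_mean + margin

print(f"mean = {sample_mean:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```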

Owing to the presence of the √n term in the formula for an interval calculation, the sample size affects the interval width. Larger sample sizes lead to smaller interval widths (or a more precise estimate of the population mean).

Suppose that the sample size is 100 and you obtain the same sample mean and standard deviation; the 95% confidence interval will be:

$48.92 - 1.96 \times 18.65/\sqrt{100}$ to $48.92 + 1.96 \times 18.65/\sqrt{100}$, or 45.26 to 52.58.

Suppose that the sample size is 500 and you obtain the same sample mean and standard deviation; the 95% confidence interval will be:

$48.92 - 1.96 \times 18.65/\sqrt{500}$ to $48.92 + 1.96 \times 18.65/\sqrt{500}$, or 47.29 to 50.55.

As the sample size increases, you have more information about the true population mean, so the estimate becomes more precise.

We can see that in the following figure.

Increasing the sample size has led to a narrower confidence interval.
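The effect of the sample size on the interval width can also be checked numerically. The sketch below keeps the sample mean and standard deviation from Example 4 fixed and only varies n; the dictionary `widths` is an illustrative name.

```python
import math

sample_mean, sample_sd = 48.92, 18.65  # values from Example 4

widths = {}
for n in (50, 100, 500):
    margin = 1.96 * sample_sd / math.sqrt(n)
    widths[n] = 2 * margin  # full width of the 95% confidence interval
    print(f"n = {n:3d}: {sample_mean - margin:.2f} to {sample_mean + margin:.2f} "
          f"(width {widths[n]:.2f})")
```

Because the width is proportional to 1/√n, the sample size has to be quadrupled to halve the interval width.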

#### Plotting the confidence intervals for different random samples

We have population data for more than 20,000 individuals, and we know that the true population mean age of these individuals is 47.18 years.

Using a computer program, we will take 50 random samples from this population, each of size 35, and calculate the 95% confidence interval for the mean from each sample.

| sample | mean | lower | upper |
|---|---|---|---|
| 1 | 44.71 | 39.98 | 49.45 |
| 2 | 48.43 | 41.72 | 55.14 |
| 3 | 46.66 | 40.25 | 53.06 |
| 4 | 48.43 | 42.17 | 54.68 |
| 5 | 48.77 | 42.26 | 55.28 |
| 6 | 47.14 | 42.04 | 52.25 |
| 7 | 46.74 | 40.67 | 52.81 |
| 8 | 51.49 | 44.65 | 58.32 |
| 9 | 49.20 | 42.29 | 56.11 |
| 10 | 50.14 | 43.71 | 56.58 |
| 11 | 44.23 | 37.17 | 51.29 |
| 12 | 49.37 | 42.91 | 55.83 |
| 13 | 40.97 | 35.80 | 46.15 |
| 14 | 50.06 | 44.05 | 56.07 |
| 15 | 46.09 | 40.22 | 51.95 |
| 16 | 40.60 | 35.55 | 45.65 |
| 17 | 47.80 | 41.69 | 53.91 |
| 18 | 49.89 | 43.17 | 56.60 |
| 19 | 47.40 | 42.58 | 52.22 |
| 20 | 51.17 | 44.40 | 57.94 |
| 21 | 45.20 | 39.59 | 50.81 |
| 22 | 48.29 | 42.30 | 54.27 |
| 23 | 48.26 | 42.87 | 53.65 |
| 24 | 42.91 | 37.93 | 47.89 |
| 25 | 42.49 | 36.14 | 48.83 |
| 26 | 47.86 | 43.42 | 52.30 |
| 27 | 43.66 | 38.07 | 49.24 |
| 28 | 43.14 | 37.61 | 48.68 |
| 29 | 45.54 | 39.69 | 51.40 |
| 30 | 48.51 | 42.45 | 54.58 |
| 31 | 45.57 | 38.80 | 52.34 |
| 32 | 46.11 | 40.18 | 52.05 |
| 33 | 45.03 | 39.37 | 50.69 |
| 34 | 47.71 | 41.44 | 53.99 |
| 35 | 49.14 | 42.61 | 55.67 |
| 36 | 46.37 | 40.30 | 52.44 |
| 37 | 47.91 | 41.95 | 53.87 |
| 38 | 46.91 | 41.16 | 52.67 |
| 39 | 49.77 | 44.45 | 55.09 |
| 40 | 40.86 | 35.95 | 45.77 |
| 41 | 49.00 | 42.68 | 55.32 |
| 42 | 45.43 | 40.00 | 50.85 |
| 43 | 50.83 | 44.84 | 56.81 |
| 44 | 45.77 | 39.29 | 52.25 |
| 45 | 44.89 | 39.27 | 50.50 |
| 46 | 43.43 | 37.98 | 48.88 |
| 47 | 43.43 | 37.78 | 49.08 |
| 48 | 51.26 | 45.15 | 57.36 |
| 49 | 48.94 | 43.31 | 54.57 |
| 50 | 46.43 | 39.95 | 52.91 |

Add another column to the table to indicate if the interval covers the true population mean.

The interval will not cover the population mean if the lower limit is higher than the population mean or the upper limit is lower than the population mean.

| sample | mean | lower | upper | covers population mean |
|---|---|---|---|---|
| 1 | 44.71 | 39.98 | 49.45 | yes |
| 2 | 48.43 | 41.72 | 55.14 | yes |
| 3 | 46.66 | 40.25 | 53.06 | yes |
| 4 | 48.43 | 42.17 | 54.68 | yes |
| 5 | 48.77 | 42.26 | 55.28 | yes |
| 6 | 47.14 | 42.04 | 52.25 | yes |
| 7 | 46.74 | 40.67 | 52.81 | yes |
| 8 | 51.49 | 44.65 | 58.32 | yes |
| 9 | 49.20 | 42.29 | 56.11 | yes |
| 10 | 50.14 | 43.71 | 56.58 | yes |
| 11 | 44.23 | 37.17 | 51.29 | yes |
| 12 | 49.37 | 42.91 | 55.83 | yes |
| 13 | 40.97 | 35.80 | 46.15 | no |
| 14 | 50.06 | 44.05 | 56.07 | yes |
| 15 | 46.09 | 40.22 | 51.95 | yes |
| 16 | 40.60 | 35.55 | 45.65 | no |
| 17 | 47.80 | 41.69 | 53.91 | yes |
| 18 | 49.89 | 43.17 | 56.60 | yes |
| 19 | 47.40 | 42.58 | 52.22 | yes |
| 20 | 51.17 | 44.40 | 57.94 | yes |
| 21 | 45.20 | 39.59 | 50.81 | yes |
| 22 | 48.29 | 42.30 | 54.27 | yes |
| 23 | 48.26 | 42.87 | 53.65 | yes |
| 24 | 42.91 | 37.93 | 47.89 | yes |
| 25 | 42.49 | 36.14 | 48.83 | yes |
| 26 | 47.86 | 43.42 | 52.30 | yes |
| 27 | 43.66 | 38.07 | 49.24 | yes |
| 28 | 43.14 | 37.61 | 48.68 | yes |
| 29 | 45.54 | 39.69 | 51.40 | yes |
| 30 | 48.51 | 42.45 | 54.58 | yes |
| 31 | 45.57 | 38.80 | 52.34 | yes |
| 32 | 46.11 | 40.18 | 52.05 | yes |
| 33 | 45.03 | 39.37 | 50.69 | yes |
| 34 | 47.71 | 41.44 | 53.99 | yes |
| 35 | 49.14 | 42.61 | 55.67 | yes |
| 36 | 46.37 | 40.30 | 52.44 | yes |
| 37 | 47.91 | 41.95 | 53.87 | yes |
| 38 | 46.91 | 41.16 | 52.67 | yes |
| 39 | 49.77 | 44.45 | 55.09 | yes |
| 40 | 40.86 | 35.95 | 45.77 | no |
| 41 | 49.00 | 42.68 | 55.32 | yes |
| 42 | 45.43 | 40.00 | 50.85 | yes |
| 43 | 50.83 | 44.84 | 56.81 | yes |
| 44 | 45.77 | 39.29 | 52.25 | yes |
| 45 | 44.89 | 39.27 | 50.50 | yes |
| 46 | 43.43 | 37.98 | 48.88 | yes |
| 47 | 43.43 | 37.78 | 49.08 | yes |
| 48 | 51.26 | 45.15 | 57.36 | yes |
| 49 | 48.94 | 43.31 | 54.57 | yes |
| 50 | 46.43 | 39.95 | 52.91 | yes |
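The rule for the extra column can be written as a one-line check. The sketch below applies it to a handful of rows copied from the table; the function name `covers` is illustrative.

```python
TRUE_MEAN = 47.18  # true population mean age stated above


def covers(lower, upper, true_mean=TRUE_MEAN):
    """An interval covers the mean unless lower > mean or upper < mean."""
    return lower <= true_mean <= upper


# (sample number, lower limit, upper limit) for a few rows of the table.
rows = [
    (1, 39.98, 49.45),
    (13, 35.80, 46.15),
    (16, 35.55, 45.65),
    (24, 37.93, 47.89),
    (40, 35.95, 45.77),
]

flags = {sample: covers(lo, hi) for sample, lo, hi in rows}
for sample, flag in flags.items():
    print(sample, "yes" if flag else "no")
```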

Then, we will plot these confidence intervals to see how they cover the true population mean.

The horizontal black line represents the true population mean.

The blue or red dots represent the mean calculated from each random sample.

The error bars represent the confidence interval calculated from each sample.

Only 3 samples (13, 16, and 40) produce a confidence interval that does not cover the population mean, which is 3/50 = 0.06 or 6% of the samples. Hence, it is unlikely that a given confidence interval fails to cover the population mean.

In research, we take only 1 sample and construct a 95% confidence interval from it. We do not know whether our sample is one that yields an interval containing the population value or one of the bad samples that yield an interval not containing it.

However, the procedure for constructing a 95% confidence interval captures the population value for about 95% of samples, so we can be reasonably confident that our interval contains the population value.
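The repeated-sampling experiment can be imitated with a short simulation. The article's population data are not available here, so this sketch assumes a synthetic population of 20,000 ages drawn from a normal distribution with mean 47 and standard deviation 18; these values, the seed, and the function name `coverage_rate` are assumptions made for illustration. It also draws 2,000 samples instead of 50 so that the observed share is more stable.

```python
import math
import random
import statistics


def coverage_rate(population, sample_size, num_samples, rng):
    """Share of 95% intervals (mean +/- 1.96 * s / sqrt(n)) that cover the true mean."""
    true_mean = statistics.fmean(population)
    hits = 0
    for _ in range(num_samples):
        sample = rng.sample(population, sample_size)  # sampling without replacement
        mean = statistics.fmean(sample)
        margin = 1.96 * statistics.stdev(sample) / math.sqrt(sample_size)
        if mean - margin <= true_mean <= mean + margin:
            hits += 1
    return hits / num_samples


rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.gauss(47, 18) for _ in range(20_000)]  # synthetic, assumed population
rate = coverage_rate(population, sample_size=35, num_samples=2_000, rng=rng)
print(f"share of intervals covering the true mean: {rate:.3f}")
```

The observed share should land near 95%, in line with the 47 out of 50 intervals that covered the true mean in the table.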

## Population mean formula

The population mean formula is:

$$\mu = \frac{1}{N}\sum X$$

Where μ is the population mean.

N is the population size.

∑X means the sum of every element in our population.

We used this formula in the above examples, where we summed the population data and divided it by the population size (or multiplied by 1/N).
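As a worked instance, the formula can be applied to Example 2, where the population consists of N = 71 chickens whose weights sum to 18553 grams. The index notation $X_i$ for the i-th element is added here for clarity and is not used elsewhere in this article.

$$
\mu = \frac{1}{N}\sum_{i=1}^{N} X_i = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{18553}{71} \approx 261.3 \text{ grams}
$$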

## The role of the population mean

The population mean gives us a measure of the center or central tendency of the population data.

This is useful when comparing different populations to each other.

### – Example 1

The following table is the mean age for certain male and female populations.

| sex | mean age |
|---|---|
| Male | 54.78 |
| Female | 54.69 |

We see that the mean age is nearly the same for males and females.

This suggests that the ages of males and females are centered at about the same value. Plotting the ages for both populations as histograms shows that their distributions are also similar.

The black vertical line represents the mean for each group.

We can also plot the data as a dot plot to see individual values.

We see the same range for male and female ages.

### – Example 2

The following table is the mean height for certain male and female populations.

| sex | mean height |
|---|---|
| Male | 169.27 |
| Female | 156.99 |

We see that the mean height for males is higher than the mean height for females.

Even without looking at the data, we can say that, on average, the individuals in this male population are taller than those in this female population.

We can see that from plotting the heights for the 2 populations as histograms.

The black vertical line represents the mean for each group.

We can also plot the data as a dot plot to see individual values.

We see that males’ heights tend to be greater than females’ heights.

But what about the sample means?

Sample means also give information about the center of the population from which each sample was collected.

However, because of sampling error, we have to calculate the 95% confidence interval for each sample mean. If the confidence intervals for the two samples do not overlap, we can deduce that the two populations (from which the two samples were taken) are significantly different from each other.

### – Example 3

The following table shows the birth weights of neonates (in grams) for a random sample of 35 smoking mothers and a random sample of 35 never-smoking mothers.

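| smoking (g) | never-smoking (g) |
|---|---|
| 3374 | 3090 |
| 1928 | 3651 |
| 2977 | 3572 |
| 2410 | 2733 |
| 3884 | 2450 |
| 1928 | 2495 |
| 3940 | 1928 |
| 2466 | 3232 |
| 3203 | 1474 |
| 2187 | 3203 |
| 3317 | 1729 |
| 2381 | 3860 |
| 2821 | 3912 |
| 3321 | 3983 |
| 3629 | 3770 |
| 2782 | 2835 |
| 3260 | 4153 |
| 2769 | 1970 |
| 2466 | 1588 |
| 2353 | 3062 |
| 3042 | 2353 |
| 3637 | 2977 |
| 2466 | 4054 |
| 3005 | 3274 |
| 2414 | 3459 |
| 3651 | 1588 |
| 2948 | 2495 |
| 2977 | 2877 |
| 3430 | 2722 |
| 2126 | 2240 |
| 2367 | 3234 |
| 2125 | 3175 |
| 3856 | 4111 |
| 2906 | 3544 |
| 3132 | 2301 |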

The mean birth weight for neonates of smoking mothers is 2899 grams, while the mean for neonates of never-smoking mothers is 2946 grams.

The standard deviation of birth weight is 576.72 grams for smoking mothers and 783.22 grams for never-smoking mothers.

While the sample mean for never-smoking mothers is larger than that for smoking mothers, one may argue that this difference is due to sampling error rather than a true difference between the populations of smoking and never-smoking mothers.

To address that, we construct a 95% confidence interval for each mean:

The 95% confidence interval for smoking mothers is:

$2899 - 1.96 \times 576.72/\sqrt{35}$ to $2899 + 1.96 \times 576.72/\sqrt{35}$, or 2707.9 to 3090.1.

The 95% confidence interval for never-smoking mothers is:

$2946 - 1.96 \times 783.22/\sqrt{35}$ to $2946 + 1.96 \times 783.22/\sqrt{35}$, or 2686.5 to 3205.5.

The two confidence intervals overlap, so we cannot conclude that the mean neonate birth weight differs between smoking and never-smoking mothers.
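The two intervals and the overlap check can be reproduced with a short Python sketch. It uses the rounded means and standard deviations quoted in this example; the helper name `ci95` is illustrative.

```python
import math


def ci95(mean, sd, n):
    """Return (lower, upper) for mean +/- 1.96 * sd / sqrt(n)."""
    margin = 1.96 * sd / math.sqrt(n)
    return mean - margin, mean + margin


smoking = ci95(2899, 576.72, 35)
never_smoking = ci95(2946, 783.22, 35)

# Two intervals overlap when each one starts before the other one ends.
overlap = smoking[0] <= never_smoking[1] and never_smoking[0] <= smoking[1]

print(f"smoking:       {smoking[0]:.1f} to {smoking[1]:.1f}")
print(f"never-smoking: {never_smoking[0]:.1f} to {never_smoking[1]:.1f}")
print("intervals overlap:", overlap)
```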

## Practice questions

1. The following is the income per capita for the 50 states of the U.S. in 1974. What is the mean income per capita?

| state | income |
|---|---|
| Alabama | 3624 |
| Alaska | 6315 |
| Arizona | 4530 |
| Arkansas | 3378 |
| California | 5114 |
| Colorado | 4884 |
| Connecticut | 5348 |
| Delaware | 4809 |
| Florida | 4815 |
| Georgia | 4091 |
| Hawaii | 4963 |
| Idaho | 4119 |
| Illinois | 5107 |
| Indiana | 4458 |
| Iowa | 4628 |
| Kansas | 4669 |
| Kentucky | 3712 |
| Louisiana | 3545 |
| Maine | 3694 |
| Maryland | 5299 |
| Massachusetts | 4755 |
| Michigan | 4751 |
| Minnesota | 4675 |
| Mississippi | 3098 |
| Missouri | 4254 |
| Montana | 4347 |
| Nebraska | 4508 |
| Nevada | 5149 |
| New Hampshire | 4281 |
| New Jersey | 5237 |
| New Mexico | 3601 |
| New York | 4903 |
| North Carolina | 3875 |
| North Dakota | 5087 |
| Ohio | 4561 |
| Oklahoma | 3983 |
| Oregon | 4660 |
| Pennsylvania | 4449 |
| Rhode Island | 4558 |
| South Carolina | 3635 |
| South Dakota | 4167 |
| Tennessee | 3821 |
| Texas | 4188 |
| Utah | 4022 |
| Vermont | 3907 |
| Virginia | 4701 |
| Washington | 4864 |
| West Virginia | 3617 |
| Wisconsin | 4468 |
| Wyoming | 4566 |

2. The following is the miles/U.S. gallon (mpg) for 32 car models in a certain car showroom (in the 1970s). What is the mean of this population?

| model | mpg |
|---|---|
| Mazda RX4 | 21.0 |
| Mazda RX4 Wag | 21.0 |
| Datsun 710 | 22.8 |
| Hornet 4 Drive | 21.4 |
| Hornet Sportabout | 18.7 |
| Valiant | 18.1 |
| Duster 360 | 14.3 |
| Merc 240D | 24.4 |
| Merc 230 | 22.8 |
| Merc 280 | 19.2 |
| Merc 280C | 17.8 |
| Merc 450SE | 16.4 |
| Merc 450SL | 17.3 |
| Merc 450SLC | 15.2 |
| Cadillac Fleetwood | 10.4 |
| Lincoln Continental | 10.4 |
| Chrysler Imperial | 14.7 |
| Fiat 128 | 32.4 |
| Honda Civic | 30.4 |
| Toyota Corolla | 33.9 |
| Toyota Corona | 21.5 |
| Dodge Challenger | 15.5 |
| AMC Javelin | 15.2 |
| Camaro Z28 | 13.3 |
| Pontiac Firebird | 19.2 |
| Fiat X1-9 | 27.3 |
| Porsche 914-2 | 26.0 |
| Lotus Europa | 30.4 |
| Ford Pantera L | 15.8 |
| Ferrari Dino | 19.7 |
| Maserati Bora | 15.0 |
| Volvo 142E | 21.4 |

3. The following two histograms are for the weights of certain male and female populations. Can you deduce which population has a higher mean weight?

4. The following table shows the mean wind speed for different classes of storm populations.

| class | mean wind speed |
|---|---|
| hurricane | 85.97 |
| tropical depression | 27.27 |
| tropical storm | 45.81 |

And the following dot plot is for the data values (wind speed) of these classes.

There are many data values, but because many of them are equal, the points are superimposed.

Which points correspond to which mean?

5. You have two samples of diamonds. One sample contains 38 ideal-cut diamonds with a mean weight of 0.703 grams and a standard deviation of 0.4 grams. The second sample contains 45 fair-cut diamonds with a mean weight of 1.25 grams and a standard deviation of 0.5 grams.

Assume that the two samples were randomly selected from a certain batch of diamonds in a certain factory. Is there a difference in the mean weight between the ideal-cut and fair-cut diamonds in this batch?

### Answer key

1. This is population data, so we can calculate the population mean directly:

• Sum of all numbers = 221790.
• Count the number of items in your population. In this population, there are 50 items or states.
• Divide the first number by the second number.

The population mean = 221790/50 = 4435.8.

2. This is population data so we can calculate the population mean directly:

• Sum of all numbers = 642.9.
• In this population, there are 32 items or models.
• Divide the first number by the second number.

The population mean = 642.9/32 = 20.09 mpg.

3. We see that males’ weights tend to be higher than females’ weights (the male histogram is shifted more to the right), so the mean weight for males will be larger than the mean weight for females.

4. Red points correspond to the hurricane class, blue points to the tropical storm class, and green points to the tropical depression class.

This is because the center of each group of dots corresponds to the calculated mean value in the table.

5. We construct a 95% confidence interval for each mean:

The 95% confidence interval for the ideal-cut diamonds is:

$0.703 - 1.96 \times 0.4/\sqrt{38}$ to $0.703 + 1.96 \times 0.4/\sqrt{38}$, or 0.58 to 0.83.

The 95% confidence interval for fair-cut diamonds is:

$1.25 - 1.96 \times 0.5/\sqrt{45}$ to $1.25 + 1.96 \times 0.5/\sqrt{45}$, or 1.10 to 1.40.

The two confidence intervals do not overlap, so we conclude that fair-cut diamonds have a significantly heavier weight than ideal-cut diamonds in this batch.
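The answer to question 5 can be checked with a few lines of Python. This sketch uses only the summary statistics given in the question; the helper name `ci95` is illustrative.

```python
import math


def ci95(mean, sd, n):
    """Return (lower, upper) for mean +/- 1.96 * sd / sqrt(n)."""
    margin = 1.96 * sd / math.sqrt(n)
    return mean - margin, mean + margin


ideal = ci95(0.703, 0.4, 38)  # 38 ideal-cut diamonds
fair = ci95(1.25, 0.5, 45)    # 45 fair-cut diamonds

# The intervals overlap only if each one starts before the other one ends.
overlap = ideal[0] <= fair[1] and fair[0] <= ideal[1]

print(f"ideal-cut: {ideal[0]:.2f} to {ideal[1]:.2f}")
print(f"fair-cut:  {fair[0]:.2f} to {fair[1]:.2f}")
print("intervals overlap:", overlap)
```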