7.3 Confidence Intervals for Means

In Chapter 4 we saw how to compute the mean, median, standard deviation, and other descriptive statistics for a given data set, usually a sample from an underlying population. In this section we want to focus on estimating the mean of a population, given that we can compute the mean of a particular sample. In other words, if a sample of size, say, 100 is selected at random from some population, it is easy to compute the mean of that sample. It is equally easy to then use that sample mean as an estimate for the unknown population mean. But just because it's easy to do does not necessarily mean it's the right thing to do ...

For example, suppose we randomly selected 100 people, measured their height, and computed the average height for our sample to be, say, 164.432 cm. If we now wanted to know the average height of everyone in our population (say everyone in the US), it seems reasonable to say that the average height of everyone is 164.432 cm. However, if we think about it, it is of course highly unlikely that the average for the entire population comes out exactly the same as the average for our sample of just 100 people. It is much more likely that our sample mean of 164.432 cm is only approximately equal to the (unknown) population mean. It is the purpose of this chapter to clarify, using probabilities, what exactly we mean by "approximately equal". In other words:

Can we use a sample mean to estimate an (unknown) population mean, and - most importantly - how accurate is our estimated answer?

Example: Consider some data for approximately 400 cars. We assume that this data has been collected at random. We would like to make predictions about all automobiles, based on that random sample. In particular, the data set lists miles per gallon, engine size, and weight of 400 cars, but we would like to know the average miles per gallon, engine size, and weight of all cars, based on this sample.

It is of course simple to compute the mean of the various variables of the sample, using Excel. For our sample data we find that:

mean gas mileage of the sample is 23.5 mpg with a standard deviation of 7.82 mpg, using 398 data values

But we need to know how well this sample mean predicts the actual and unknown population mean for the entire distribution. Our best guess is clearly that the average mpg for all cars is 23.5 mpg - it's after all pretty much the only number we have - but how good is that estimate?

In fact, we know more than just the sample mean. According to the Central Limit Theorem, the means of all possible samples of this size are approximately normally distributed. We can therefore estimate the distribution of all sample means (of which ours is just one) to be normal with a mean of 23.5 mpg and a standard deviation of 7.82 / sqrt(398).
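To make the Central Limit Theorem statement more tangible, here is a minimal simulation sketch. The population below is hypothetical (an exponential distribution chosen only because it is clearly not normal) and is not the car data from the text. The sketch draws many samples of size 398 and checks that the sample means cluster around the population mean with a spread close to the population standard deviation divided by sqrt(N).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical, deliberately non-normal population (NOT the car data from the text)
population = [random.expovariate(1 / 20) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

N = 398             # sample size used in the text
NUM_SAMPLES = 1000  # how many samples we draw

# Draw many samples (with replacement) and record the mean of each one
sample_means = [
    statistics.fmean(random.choices(population, k=N))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = pop_sd / N ** 0.5  # what the Central Limit Theorem predicts

print(f"population mean:          {pop_mean:.3f}")
print(f"mean of the sample means: {mean_of_means:.3f}")
print(f"sd of the sample means:   {sd_of_means:.3f}")
print(f"population sd / sqrt(N):  {predicted_sd:.3f}")
```

In practice we have only one sample, so we substitute the sample mean and sample standard deviation for the unknown population values, which is exactly what the text does with 23.5 and 7.82.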

Using that information, let's make a quick detour into "mathematics land" - we will in a minute list a recipe for what we need to do, but for now, bear with me:

We are looking for an interval (a, b) around the mean that contains 95% of the distribution, leaving 2.5% in each tail. That interval (a, b) is known as a 95% confidence interval for the unknown mean. If the distribution had mean 0 and standard deviation 1 we could use some trial-and-error in Excel to compute the desired number a - note that if the mean is 0, a should be negative. In other words, we use Excel to compute NORMDIST(a, 0, 1, TRUE) for some guessed values of a until the result is close to 0.025. Thus, if the mean was 0 and the standard deviation was 1, the number a = -1.96 would be just about right, and using symmetry we can conclude that b = +1.96. However, we don't know the mean and standard deviation of our population, so what can we do? Central Limit Theorem to the rescue!
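The trial-and-error search for a can also be sketched outside Excel. The Python snippet below is only an illustration, not part of the original recipe: it assumes that NORMDIST(a, 0, 1, TRUE) corresponds to the cumulative distribution function of the standard normal, and that we want the area to the left of a to be 0.025 (half of the 5% that falls outside a 95% interval).

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Trial and error, as in the text: guess a and look at the area to its left.
# We are aiming for an area of 0.025.
for guess in (-1.5, -1.8, -1.9, -1.96, -2.0):
    area = standard_normal.cdf(guess)
    print(f"a = {guess:5.2f}   area to the left = {area:.4f}")

# The inverse function gives the endpoints directly, without guessing.
a = standard_normal.inv_cdf(0.025)
b = standard_normal.inv_cdf(0.975)
print(f"a = {a:.2f}, b = {b:.2f}")
```

The inverse lookup agrees with the guess a = -1.96, and by symmetry b is the same number with a positive sign.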

According to the Central Limit Theorem, the distribution of all sample means has a mean and standard deviation that we can estimate by m and s / sqrt(N), where m is the sample mean, s is the sample standard deviation, and N is the sample size. Thus, instead of mean 0 and standard deviation 1 we use the mean m and the standard deviation s / sqrt(N). Putting everything together, we have computed a 95% confidence interval as follows:

from m - 1.96 * s / sqrt(N) to m + 1.96 * s / sqrt(N)
Note: The term s / sqrt(N) is also known as the Standard Error

The above explanation is perhaps somewhat confusing, and there are some parts where I've glossed over some important details. But the resulting formulas are simple, and those formulas will be what we want to focus on. In addition to the number 1.96 that we have derived for a 95% confidence interval, other numbers can be derived in a similar way for the 90% and 99% confidence intervals:

Confidence Interval for Mean (large sample size N > 30)

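Suppose you have a sample with N data points, which has a sample mean m and standard deviation s. Then:

a 90% confidence interval for the population mean is from m - 1.645 * s / sqrt(N) to m + 1.645 * s / sqrt(N)
a 95% confidence interval for the population mean is from m - 1.96 * s / sqrt(N) to m + 1.96 * s / sqrt(N)
a 99% confidence interval for the population mean is from m - 2.576 * s / sqrt(N) to m + 2.576 * s / sqrt(N)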
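The multiplier 1.96 and its counterparts for the 90% and 99% intervals are quantiles of the standard normal distribution. As a rough sketch (the function name z_multiplier is made up for this illustration), they can be computed rather than guessed:

```python
from statistics import NormalDist

def z_multiplier(confidence_level):
    """Return z so that the central `confidence_level` share of a
    standard normal distribution lies between -z and +z."""
    tail = (1 - confidence_level) / 2  # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

multipliers = {level: z_multiplier(level) for level in (0.90, 0.95, 0.99)}

for level, z in multipliers.items():
    print(f"{level:.0%} confidence interval: "
          f"from m - {z:.3f} * s / sqrt(N) to m + {z:.3f} * s / sqrt(N)")
```

Any other confidence level, say 80%, can be handled the same way by passing 0.80 to the function.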

Using these formulas we can now estimate an unknown population mean with 90%, 95%, or 99% certainty. Other percentages are also possible, but these are the most frequently used ones.

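Returning to our earlier example, where m = 23.5, s = 7.82, and N = 398, the standard error is 7.82 / sqrt(398) = 0.392 (rounded), and we have:

90% confidence interval: from 23.5 - 1.645 * 0.392 = 22.86 to 23.5 + 1.645 * 0.392 = 24.14
95% confidence interval: from 23.5 - 1.96 * 0.392 = 22.73 to 23.5 + 1.96 * 0.392 = 24.27
99% confidence interval: from 23.5 - 2.576 * 0.392 = 22.49 to 23.5 + 2.576 * 0.392 = 24.51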
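The arithmetic for this example can be checked with a short Python sketch. The sample statistics come from the text; the multipliers 1.645, 1.96, and 2.576 are the usual rounded standard normal values and are entered here by hand, and the function name is only illustrative.

```python
from math import sqrt

# Usual rounded multipliers for large samples (entered by hand)
Z_MULTIPLIERS = {"90%": 1.645, "95%": 1.96, "99%": 2.576}

def confidence_interval(m, s, n, z):
    """Return (low, high) for sample mean m, sample standard
    deviation s, sample size n and multiplier z."""
    margin = z * s / sqrt(n)  # z times the standard error
    return m - margin, m + margin

# Sample statistics for gas mileage, as given in the text
m, s, n = 23.5, 7.82, 398

intervals = {
    label: confidence_interval(m, s, n, z)
    for label, z in Z_MULTIPLIERS.items()
}

for label, (low, high) in intervals.items():
    print(f"{label} confidence interval: from {low:.2f} to {high:.2f} mpg")
```

The margin grows with the multiplier, so each interval in the output contains the one before it.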

Note that a 99% confidence interval is larger - i.e. includes more numbers - than a 90% confidence interval. That makes sense, since if we want to be more certain, we must allow for more values. Ultimately, a 100% confidence interval would simply consist of all possible numbers, that is, the interval from -infinity to +infinity. That would certainly be correct, but is not very useful for practical applications.

While the above calculations can easily be done with a calculator (or Excel), our favorite computer program Excel provides - yes, you might have guessed it - a quick shortcut to obtain confidence intervals. We will proceed as follows:

What this means is that the sample mean of, say, "Mile per Gallon" is 23.5145. That sample mean may or may not be the same as the average MPG of all automobiles. But we have also computed a 90% confidence interval, which means, in this case, the following:

Under certain assumptions on the distribution of the population, we predict - based on our sample of 398 cars - that the average miles per gallon of all cars is somewhere between 23.5145 - 0.6459 = 22.87 and 23.5145 + 0.6459 = 24.16, and we are 90% certain that this answer is correct.

Please note that this 90% confidence interval is slightly different from the confidence interval we computed previously "by hand". That is no coincidence, because the derivation of the formulas for confidence intervals uses the Central Limit Theorem and that theorem, in effect, states that the distribution of the sample means is approximately normal. However, that approximation works best the larger N (the sample size) is. Excel uses a slightly different method to compute confidence intervals.

Example: According to Excel, the average engine size in our sample of size N = 398 is 192.67 cubic inches, with a standard deviation of 104.55 cubic inches. Use these statistics to manually compute a 90% confidence interval. Then compare it with the figure Excel produces for the same interval.

Thus, since the sample size is large (certainly larger than 30) the intervals computed manually and with Excel are virtually identical. For the picky reader, note that Excel's interval is slightly larger, so it's slightly more conservative than the manual computation, but the difference in this case is negligible.
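The manual half of the engine-size example can be sketched step by step in Python. The mean, standard deviation, and sample size are taken from the example; the multiplier 1.645 is the usual rounded value for a 90% interval and is entered by hand. Excel's own figure is not reproduced here.

```python
from math import sqrt

m = 192.67    # sample mean engine size in cubic inches (from the text)
s = 104.55    # sample standard deviation in cubic inches (from the text)
n = 398       # sample size (from the text)
z_90 = 1.645  # usual rounded multiplier for a 90% interval (entered by hand)

standard_error = s / sqrt(n)      # s / sqrt(N)
margin = z_90 * standard_error    # half the width of the interval
low, high = m - margin, m + margin

print(f"standard error: {standard_error:.4f}")
print(f"margin:         {margin:.4f}")
print(f"90% confidence interval: from {low:.2f} to {high:.2f} cubic inches")
```

The margin printed here is the number to compare against the confidence figure Excel reports for the same column.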

Similarly, according to Excel the average weight of all cars is between 2969.5161 - 69.5328 = 2899.98 pounds and 2969.5161 + 69.5328 = 3039.05 pounds, and we are 90% certain that we are correct.


To recap: Instead of providing a point estimate for an unknown population mean (which would almost certainly be incorrect) we provide an interval instead, called a confidence interval. Three particular confidence intervals are most common: a 90%, a 95%, or a 99% confidence interval. That means that:

Example: Suppose we compute, for the same sample data, both a 90% and a 99% confidence interval. Which one is larger?

To answer this question, let's compute both a 90% and a 99% confidence interval for the "Horse Power" in the above data set about cars, using Excel. The procedure of computing the numbers is similar to the above; here are the answers:

That means, in general, that a 99% confidence interval is larger than a 90% confidence interval. That actually makes sense: if we want to be more sure that we have captured the true (unknown) population mean correctly, we need to make our interval larger. Hence, a 99% confidence interval must include more numbers than a 90% confidence interval; it is therefore wider than a 90% interval.
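The trade-off between certainty and width can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the population is taken to be normal with mean 23.5 and standard deviation 7.82, and the sample size of 100 and the number of trials are arbitrary. For each simulated sample we build a 90% and a 99% interval and record how wide each one is and whether it captures the (here known) population mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 23.5, 7.82  # hypothetical population (assumption)
N = 100                          # size of each sample (arbitrary)
TRIALS = 2000                    # number of simulated samples (arbitrary)

levels = (0.90, 0.99)
z = {level: NormalDist().inv_cdf(1 - (1 - level) / 2) for level in levels}
hits = {level: 0 for level in levels}
total_width = {level: 0.0 for level in levels}

for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m, s = fmean(sample), stdev(sample)
    for level in levels:
        margin = z[level] * s / sqrt(N)
        if m - margin <= TRUE_MEAN <= m + margin:
            hits[level] += 1  # this interval captured the true mean
        total_width[level] += 2 * margin

coverage = {level: hits[level] / TRIALS for level in levels}
avg_width = {level: total_width[level] / TRIALS for level in levels}

for level in levels:
    print(f"{level:.0%} interval: average width {avg_width[level]:.2f}, "
          f"captured the true mean in {coverage[level]:.1%} of samples")
```

Because both intervals are built from the same samples and the 99% interval always contains the 90% one, the wider interval can never capture the population mean less often than the narrower one.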