# When do we use "N" and when "N-1" when we calculate the variance of a sample with size N?

I could not understand the following:

We have the whole population of people. We sample N=100 people. We calculate the mean mu. When we calculate the variance of the sample, do we divide the sum of squares by N or by N-1? If we use N-1, why do we use it?

Is the variance of the sample the actual or the estimated variance, and is there a difference between them?

We want to estimate the 95% confidence interval for the true population mean using the information from the sample. In the denominator we use sqrt(N). But when we calculate the standard deviation (the numerator), do we divide the sum of squares by N or by N-1?

Thanks in advance for your help! I have been wondering about that for quite some time.

asked 24 Jul '12, 16:35 Radoslav Ste...

There is a famous tale in free verse concerning Hiawatha shooting arrows. The rather overstated lesson is that one may have to choose between a biased process that generally hits to one side of the bull's-eye and one that is perfectly centered and unbiased but so widely scattered that arrows seldom hit the target board at all. See *Hiawatha Designs an Experiment* by the English statistician Sir Maurice George Kendall (1907-1983).

answered 24 Jul '12, 19:13 Kenneth I. L...

nice analogy! great story too (19 Aug '12, 20:28) George Brind...

1. First of all, with N=100, using N or N-1 makes only a small difference. Why would you use N? Under a normal model it gives you the maximum likelihood estimator (Sebastian explains what that means). Why would you use N-1? It gives you an unbiased estimator: unbiased means that the expectation of the estimator equals the parameter (sigma^2) you are estimating. Unfortunately you can't have both at the same time, but fortunately in most practical cases this is just a theoretical difference.
2. It is the actual variance of the sample (strictly, only when you divide by N; with N-1 it is not exactly the variance of the sample), and it is probably a good estimator (either maximum likelihood or unbiased; see above) of the variance of the population.
3. See 1. Both are possible. Both are valid estimators and neither is perfect. For large N you shouldn't worry, but it is good to understand the pros of each method.

answered 24 Jul '12, 17:00 mrBB-4
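
For anyone who wants to see the N versus N-1 difference numerically, here is a minimal simulation sketch. It is only an illustration, not part of the answers above. The function names (`variance_n`, `variance_n_minus_1`, `average_estimates`, `mean_confidence_interval_95`) are made up for this example. It also assumes normally distributed data with a known true variance, and it uses the normal quantile 1.96 for the 95% interval, which is an approximation that is reasonable for a sample as large as N=100 (for small samples a t quantile would be the usual choice).

```python
import random
import statistics


def sum_of_squares(sample):
    """Sum of squared deviations from the sample mean."""
    m = statistics.fmean(sample)
    return sum((x - m) ** 2 for x in sample)


def variance_n(sample):
    """Divide by N: the variance of the sample itself."""
    return sum_of_squares(sample) / len(sample)


def variance_n_minus_1(sample):
    """Divide by N-1: the unbiased estimator of the population variance."""
    return sum_of_squares(sample) / (len(sample) - 1)


def average_estimates(n, true_sigma, trials, seed=0):
    """Average both estimators over many simulated samples of size n.

    Assumption for this sketch: data are drawn from a normal
    distribution with mean 0 and standard deviation true_sigma.
    """
    rng = random.Random(seed)
    total_n = 0.0
    total_n1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, true_sigma) for _ in range(n)]
        total_n += variance_n(sample)
        total_n1 += variance_n_minus_1(sample)
    return total_n / trials, total_n1 / trials


def mean_confidence_interval_95(sample):
    """Approximate 95% interval for the population mean.

    Uses the N-1 standard deviation in the numerator and sqrt(N) in the
    denominator, with the normal quantile 1.96 (an approximation).
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    s = variance_n_minus_1(sample) ** 0.5
    half_width = 1.96 * s / n ** 0.5
    return mean - half_width, mean + half_width


# Small samples make the bias of the divide-by-N version easy to see.
avg_n, avg_n1 = average_estimates(n=5, true_sigma=2.0, trials=20000)
print("true variance:      ", 4.0)
print("average with N:     ", avg_n)
print("average with N-1:   ", avg_n1)
```

On any single sample the two versions differ by exactly the factor (N-1)/N, so for N=100 the divide-by-N value is 1% smaller than the divide-by-(N-1) value. In the simulation, the N-1 average should land close to the true variance, while the N average should land close to (N-1)/N times it.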