Confidence interval and the Student's t-test 

By Joy Ying Zhang, joy+@cs.cmu.edu


 

Main Ideas

The story starts here: Let's take a look at the Standard Normal Distribution
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.

[Figure: the standard normal curve, with the regions (−∞, −0.5) and (−0.5, 1) shaded]

z                    -3.0     -2.0     -1.0     0       1       2       3
proportion (−∞, z)   0.0013   0.023    0.159    0.5     0.841   0.977   0.9987

Or we can view this in another way:

 range proportion
(-1,+1) 0.6826
(-2,+2) 0.9544
(-3,+3) 0.9974

We can interpret the table above as follows: 68.26% of the time z falls in the range (-1,+1), 95.44% of the time in (-2,+2), and 99.74% of the time in (-3,+3). Remember these numbers; we will come back to them.
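These proportions can also be checked numerically. Below is a minimal sketch using the Statistics::Distributions Perl module (the same module used for critical values near the end of this page); uprob returns the upper-tail probability of the standard normal, so the proportion inside (-z,+z) is 1 minus the two tails:

    use Statistics::Distributions;

    # Proportion of the standard normal falling inside (-z, +z):
    # 1 minus the two equal tails, each of size uprob(z).
    foreach my $z (1, 2, 3) {
        my $p = 1 - 2 * Statistics::Distributions::uprob($z);
        printf "(-%d,+%d): %.4f\n", $z, $z, $p;
    }

Running this should print values close to the 0.6826, 0.9544, and 0.9974 shown in the table above (the module computes probabilities to about five significant digits).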

===========================

Let's switch to confidence intervals for a moment.

Suppose we want to find the mean of a population, but we can only take samples from it. Assume the population has a true mean μ and a true standard deviation σ, whose values we of course do not know. Suppose each sample is of size n. For each sample we can calculate the sample mean X̄. The mean of the sampling distribution of X̄ (that is, the mean of all the sample means X̄1, X̄2, X̄3, ...) is μX̄, and its standard deviation is σX̄, also called the standard error of X̄.

By the way, there appear to be two different definitions of the standard error. 1) The standard error of a sample of size n is the sample's standard deviation divided by √n. It therefore estimates the standard deviation of the sample mean based on the population mean (Press et al. 1992, p. 465). Note that while this definition makes no reference to a normal distribution, many uses of this quantity implicitly assume such a distribution. 2) The standard error of an estimate may also be defined as the square root of the estimated error variance of the quantity.

We have:

\[
\mu_{\bar{X}} = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}
\]

In conclusion, suppose a sample of size n is taken from a normal population with mean μ and standard deviation σ. The sampling distribution of X̄ is then also a normal distribution, with mean μX̄ = μ and standard deviation σX̄ = σ/√n. (The sampling distribution is normal whenever the original population is normal.)

Now, the standardized version of X̄ is:

\[
Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \;\sim\; N(0, 1)
\]

that is, Z has a standard normal distribution.

This means that, whatever μ is, we have (see the table above):

\[
P\!\left(-2 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 2\right) = 0.9544
\]

Or, in other words,

\[
P\!\left(\bar{X} - 2\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{X} + 2\,\frac{\sigma}{\sqrt{n}}\right) = 0.9544
\]

That is, the interval X̄ ± 2σ/√n contains the population mean μ with 95.44% confidence: it is a 95.44% confidence interval for μ.

In conclusion: we can construct a confidence interval for the "real" mean μ as

\[
\bar{X} \;\pm\; z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\]

provided that the population standard deviation σ is known and that the sampling distribution of X̄ is normal.

Here are some critical Z values; they can also be computed numerically (see the sketch after the table).

α       Confidence   Zα/2
0.1     90%          1.64
0.05    95%          1.96
0.01    99%          2.58
0.001   99.9%        3.29
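As noted above, these critical values and the resulting confidence interval can be computed directly. Below is a minimal sketch in Perl using the Statistics::Distributions module introduced near the end of this page; the sample size, σ, and sample mean are made up for illustration, and udistr returns the z value with a given upper-tail probability:

    use Statistics::Distributions;

    # Hypothetical example: sample of n = 25 from a population with known
    # sigma = 10, observed sample mean 50; build a 95% CI (alpha = 0.05).
    my ($n, $sigma, $xbar, $alpha) = (25, 10, 50, 0.05);

    my $z  = Statistics::Distributions::udistr($alpha / 2);  # z_{alpha/2}, about 1.96
    my $se = $sigma / sqrt($n);                               # standard error sigma/sqrt(n)
    printf "95%% CI for mu: (%.3f, %.3f)\n", $xbar - $z * $se, $xbar + $z * $se;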

=========================

In the section above, we assumed that we know the standard deviation (σ) of the population; we transformed X̄ into Z, which has a standard normal distribution, and used z-values to estimate confidence intervals for the population mean μ:

\[
Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \;\sim\; N(0, 1)
\]

Yet in the cases where σ is unknown, we can only estimate it with the sample standard deviation S, and the resulting statistic T does not have a standard normal distribution. T follows what is called Student's t-distribution:

\[
T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \;\sim\; t_{n-1}
\]

=========================

"Student" (real name: W. S. Gossett [1876-1937]) developed statistical methods to solve problems stemming from his employment in a brewery. 

The t-distribution has one parameter, called the degrees of freedom (df); for the one-sample statistic above, DF = n-1.

The t-distribution is similar to the normal distribution: it is bell-shaped and symmetric about zero.

The main difference between the t-distribution and the standard normal distribution is in the tails: the t-distribution has heavier tails, and as DF increases it approaches the standard normal distribution.


T-test for one variable: calculating a confidence interval for the mean μ when σ is unknown

\[
\bar{X} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}
\]

(A worked sketch of this interval appears after the table of critical values below.)

                 α     0.1     0.05    0.02    0.01    0.002   0.001
        Confidence     90%     95%     98%     99%     99.8%   99.9%
DF=1                   6.314   12.71   31.82   63.66   318.3   636.6
DF=10                  1.812   2.228   2.764   3.169   4.144   4.587
DF=20                  1.725   2.086   2.528   2.845   3.552   3.850
DF=∞ (same as Z)       1.645   1.960   2.326   2.576   3.091   3.291
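As referenced above, here is a minimal sketch (with made-up sample values) that builds the interval X̄ ± t(α/2, n-1) S/√n, using tdistr from the Statistics::Distributions Perl module for the critical value:

    use Statistics::Distributions;

    # Hypothetical sample; 95% CI for mu when sigma is unknown.
    my @sample = (4.2, 5.1, 6.3, 5.8, 4.9, 5.5);
    my $n    = @sample;
    my $mean = 0; $mean += $_ for @sample; $mean /= $n;
    my $var  = 0; $var  += ($_ - $mean) ** 2 for @sample; $var /= ($n - 1);
    my $se   = sqrt($var / $n);                      # estimated standard error S/sqrt(n)

    my $alpha = 0.05;
    my $t = Statistics::Distributions::tdistr($n - 1, $alpha / 2);  # t_{alpha/2, n-1}
    printf "95%% CI for mu: (%.3f, %.3f)\n", $mean - $t * $se, $mean + $t * $se;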

Which t-test to use

One-sample t test: compare one group to a hypothetical value.
    Assumptions: subjects are randomly drawn from a population and the distribution of the mean being tested is normal.
    Comments: usually used to compare the mean of a sample to a known number (often 0).
    DF: n-1

Unpaired t test, two-sample assuming equal variance (homoscedastic t-test): compare two unpaired groups.
    Assumptions: the two samples are independent, i.e. the observations in one sample are not in any way related to the observations in the other. This test is also used when one randomly assigns subjects to two groups, gives the first group treatment A and the second group treatment B, and compares the two groups.
    DF: n1+n2-2

Unpaired t test, two-sample assuming unequal variance (heteroscedastic t-test): compare two unpaired groups whose variances are extremely different, e.g. when the two samples are of very different sizes.
    DF: given by the Welch-Satterthwaite formula (see below).

Paired t test: compare two paired groups.
    Assumptions: the observed data are from the same subject or from a matched subject and are drawn from a population with a normal distribution; it does not assume that the variances of the two populations are equal.
    Comments: used to compare means on the same or related subjects over time or in differing circumstances; subjects are often tested in a before-after situation.
    DF: n-1

 


Data set

      Subject ID              Data
    Males   Females       Males   Females
    70      87            165.9   212.1
    71      89            210.3   203.5
    72      90            166.8   210.3
    76      94            182.3   228.4
    77      97            182.1   206.2
    78      99            218.0   203.2
    80      101           170.1   224.9
            102                   202.6

t-Test: Two-Sample Assuming Equal Variances

To compute the two-sample t-test, two major quantities are needed before computing the test statistic. First, you need to estimate the pooled standard deviation of the two samples. The pooled standard deviation is a weighted average of the standard deviations of the two samples; it lies between the two standard deviations, with greater weight given to the standard deviation from the larger sample. The equation for the pooled variance is:

\[
S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}
\]

In all work with the equal-variance two-sample t-test, the degrees of freedom are:

\[
\mathrm{df} = n_1 + n_2 - 2
\]

The formula for the two-sample t statistic is:

\[
t = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\]

For example, for this data set

t-Test: Two-Sample Assuming Equal Variances  
   
Variable 1 (Males)  Variable 2 (Females)
Mean 185.0714286 211.4
Variance 443.802381 101.0114286
Observations 7 8
Pooled Variance 259.2226374  
Hypothesized Mean Difference 0  
df 13  
t Stat -3.159651739  
P(T<=t) one-tail 0.0037652  
t Critical one-tail 1.770931704  
P(T<=t) two-tail 0.0075304  
t Critical two-tail 2.16036824  
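The Excel output above can be reproduced by hand. Below is a minimal Perl sketch of the pooled (equal-variance) computation on the same data, using tprob from Statistics::Distributions for the one-tail p-value:

    use Statistics::Distributions;

    # Equal-variance (pooled) two-sample t-test on the males/females data above.
    my @males   = (165.9, 210.3, 166.8, 182.3, 182.1, 218.0, 170.1);
    my @females = (212.1, 203.5, 210.3, 228.4, 206.2, 203.2, 224.9, 202.6);

    sub mean { my $s = 0; $s += $_ for @_; return $s / @_; }
    sub var  { my $m = mean(@_); my $s = 0; $s += ($_ - $m) ** 2 for @_; return $s / (@_ - 1); }

    my ($n1, $n2) = (scalar @males, scalar @females);
    my ($m1, $m2) = (mean(@males), mean(@females));

    # Pooled variance: weighted average of the two sample variances.
    my $sp2 = (($n1 - 1) * var(@males) + ($n2 - 1) * var(@females)) / ($n1 + $n2 - 2);
    my $df  = $n1 + $n2 - 2;
    my $t   = ($m1 - $m2) / sqrt($sp2 * (1 / $n1 + 1 / $n2));

    my $p1  = Statistics::Distributions::tprob($df, abs($t));  # one-tail p
    printf "t = %.4f, df = %d, one-tail p = %.5f, two-tail p = %.5f\n",
           $t, $df, $p1, 2 * $p1;
    # Expected to match the table: t about -3.16, df = 13, two-tail p about 0.0075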

t-Test: Two-Sample Assuming Unequal Variances

Assumption:

1. The samples (of sizes n1 and n2) from two normal populations are independent
2. One or both sample sizes are less than 30
3. The appropriate sampling distribution of the test statistic is the t distribution
4. The unknown variances of the two populations are not equal

Note that in this case the degrees of freedom are given by the Welch-Satterthwaite formula

\[
\mathrm{df} \;=\; \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^{2}}
                        {\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}
\]

rounded to an integer (df = 8 for the example below). The test statistic itself uses the unpooled standard error:

\[
t \;=\; \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}
\]

For example, for this data set

t-Test: Two-Sample Assuming Unequal Variances
   
Variable 1 (Males)  Variable 2 (Females)
Mean 185.0714286 211.4
Variance 443.802381 101.0114286
Observations 7 8
Hypothesized Mean Difference 0  
df 8  
t Stat -3.01956254  
P(T<=t) one-tail 0.008285256  
t Critical one-tail 1.85954832  
P(T<=t) two-tail 0.016570512  
t Critical two-tail 2.306005626  
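As a sanity check on the df = 8 reported above, plugging the two sample variances and sizes into the Welch-Satterthwaite formula gives:

\[
\mathrm{df}
  = \frac{\left(\frac{443.80}{7} + \frac{101.01}{8}\right)^{2}}
         {\frac{(443.80/7)^{2}}{6} + \frac{(101.01/8)^{2}}{7}}
  \approx \frac{5780.0}{692.7}
  \approx 8.34,
\]

which is rounded to the reported df = 8.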

Paired Student's t-test

For each pair of data, think of creating a new sequence of data: differences.

Data ID Value X Value Y (after treatment) Difference
1 X1 Y1 X1-Y1
2 X2 Y2 X2-Y2
i Xi Yi Xi-Yi
... ... ... ...
n Xn Yn Xn-Yn

Hypothesis: the mean difference equals μ0; usually, if we just want to test whether two systems are different, μ0 = 0.

Apply the one-sample t-test to the difference sequence.

Or, stated directly: given two paired sets X_i and Y_i of n measured values, the paired t-test determines whether they differ from each other in a significant way. Let

\[
D_i = X_i - Y_i, \qquad
\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad
S_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(D_i - \bar{D}\right)^2}
\]

then

\[
t = \frac{\bar{D} - \mu_0}{S_D/\sqrt{n}}
\]

with degrees of freedom = n-1.

For example, for the following data set:

ID X Y Y-X
1 154.3 230.4 76.1
2 191 202.8 11.8
3 163.4 202.8 39.4
4 168.6 216.8 48.2
5 187 192.9 5.9
6 200.4 194.4 -6
7 162.5 211.7 49.2

t-Test: Paired Two Sample for Means    
     
Variable 1 (X)  Variable 2 (Y)  Difference (Y-X)
Mean 175.314 207.400 32.086
Variance 300.788 176.237 848.508
Observations 7.000 7.000  
Hypothesized Mean Difference 0.000    
df 6    
t Stat -2.914    
P(T<=t) one-tail 0.013    
t Critical one-tail 1.943    
P(T<=t) two-tail 0.027    
t Critical two-tail 2.447    
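The same numbers can be reproduced with a short Perl sketch that applies the one-sample t-test to the per-pair differences:

    use Statistics::Distributions;

    # Paired t-test on the X/Y data above, via the per-pair differences (Y - X).
    my @x = (154.3, 191.0, 163.4, 168.6, 187.0, 200.4, 162.5);
    my @y = (230.4, 202.8, 202.8, 216.8, 192.9, 194.4, 211.7);
    my @d = map { $y[$_] - $x[$_] } 0 .. $#x;

    my $n    = @d;
    my $mean = 0; $mean += $_ for @d; $mean /= $n;
    my $var  = 0; $var  += ($_ - $mean) ** 2 for @d; $var /= ($n - 1);

    my $t  = $mean / sqrt($var / $n);   # one-sample t on the differences (mu0 = 0)
    my $df = $n - 1;
    my $p1 = Statistics::Distributions::tprob($df, abs($t));
    printf "t = %.3f, df = %d, two-tail p = %.3f\n", $t, $df, 2 * $p1;
    # Expected to match the table above (up to sign): |t| about 2.914, df = 6, p about 0.027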


Critical Values

You can find the t test critical values online

Or, you can use a Perl module to calculate the values:

Statistics::Distributions - Perl module for calculating critical values and upper probabilities of common statistical distributions (available from CPAN)

e.g.

use Statistics::Distributions;

$tprob = Statistics::Distributions::tprob(3, 6.251);
print "upper probability of the t distribution (3 degrees of "
    . "freedom, t = 6.251): Q = 1-G = $tprob\n";

 

Be Careful Here:

The returned value p is the proportion of the area under the curve between t and ∞, i.e. the one-tail upper probability. If one wants the two-sided confidence level C, then

C = 1 - 2p
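For instance, a minimal sketch converting the one-tail probability returned by tprob into a two-sided confidence level, using the t statistic from the paired example above:

    use Statistics::Distributions;

    # Observed |t| = 2.914 with 6 degrees of freedom (paired example above).
    my $p = Statistics::Distributions::tprob(6, 2.914);  # one-tail p, about 0.013
    my $c = 1 - 2 * $p;                                   # confidence, about 0.973
    printf "one-tail p = %.4f, confidence C = 1 - 2p = %.4f\n", $p, $c;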


Confidence and Precision

The confidence level of a confidence interval is an assessment of how confident we are that the true population mean is within the interval.
The precision of the interval is given by its width (the difference between the upper and lower endpoints). Wide intervals do not give very precise information about the location of the true population mean; narrow intervals do.

If the sample size n remains the same, increasing the confidence level widens the interval (less precision), while decreasing the confidence level narrows it (more precision).


Generally confidence levels are chosen to be between about 90% and 99%. These confidence levels usually provide reasonable precision and confidence.

