Testing Variances

The one sample chi-square test is used to test the hypothesis that the population from which a sample is drawn has a specified variance or standard deviation. The F test compares the variances of two samples. Other tests, such as Levene's test and Bartlett's test, are used to compare the variances of multiple samples. They are covered in their own section on testing equality of variances.

The One Sample Chi-Square Test

The one sample chi-square test, also known as the chi-square test for variance, is used to test the hypothesis that a sample comes from a population with a specified variance or standard deviation.

Definition

The null hypothesis ($H_0$) is that the population underlying the sample has a variance ($\sigma^2$) equal to the proposed variance ($\sigma_0^2$). The test statistic is the chi-square statistic, calculated as:

$$\chi^2 = \frac{(n-1)\,s^2}{\sigma_0^2}$$

where $s^2$ is the sample variance and $n$ is the sample size. Under the null hypothesis, the chi-square statistic follows a chi-square distribution with $n-1$ degrees of freedom.
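The formula can be checked by hand without the library. The following is a minimal sketch that uses only System.Linq; the variable names are illustrative and not part of the library's API, and the scores are the ones passed to Vector.Create in the example code on this page.

```csharp
using System;
using System.Linq;

// Scores copied from the example code on this page.
double[] scores = { 62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
                    68, 82, 72, 71, 85, 66, 61, 79, 81, 73 };
double proposedVariance = 5.0;   // sigma_0^2 under the null hypothesis

int n = scores.Length;
double mean = scores.Average();

// Unbiased sample variance: sum of squared deviations divided by n - 1.
double sampleVariance = scores.Sum(x => (x - mean) * (x - mean)) / (n - 1);

// Chi-square statistic: (n - 1) * s^2 / sigma_0^2, with n - 1 degrees of freedom.
double chiSquare = (n - 1) * sampleVariance / proposedVariance;

Console.WriteLine("Sample variance:      {0:F4}", sampleVariance);
Console.WriteLine("Chi-square statistic: {0:F4}", chiSquare);
Console.WriteLine("Degrees of freedom:   {0}", n - 1);
```

The printed statistic can be compared with the Statistic property of a OneSampleChiSquareTest object built from the same data.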

Assumptions

The one sample chi-square test assumes that the sample is randomly selected from the population and that the population itself follows a normal distribution. If either of these assumptions is violated, the reliability of the chi-square test may be compromised.

This test should not be confused with the chi-square goodness-of-fit test, which is used to verify that a sample comes from a specified distribution.

Applications

The one sample chi-square test is used in various fields to test hypotheses about population variances. It is commonly used in quality control, medical research, and social sciences.

The OneSampleChiSquareTest class

The one sample chi-square test is implemented by the OneSampleChiSquareTest class. It has five constructors. The first constructor takes no arguments. The data and conditions for the test must be specified by setting properties of the OneSampleChiSquareTest object.

The remaining four constructors can be divided into two pairs. The second and third constructors take 2 or 3 arguments. The first argument is a Vector<T> that specifies the sample. The second argument is the proposed variance. The third, optional argument is a HypothesisType value that specifies whether the test is one or two-tailed. The default value is TwoTailed.

The last two constructors take 3 or 4 arguments. The first argument is an integer that specifies the size of the sample. The second argument specifies the variance of the sample. The third parameter specifies the variance to test against. The optional fourth argument is a HypothesisType value that specifies whether the test is one or two-tailed. The default value is TwoTailed.

Example

The test scores of a class on a national test are as follows:

62, 77, 61, 94, 75, 82, 86, 83, 64, 84, 68, 82, 72, 71, 85, 66, 61, 79, 81, 73.

We want to investigate whether the variance of the scores is consistent with a proposed value of 5. The following code performs the test using the default two-tailed alternative:

C#
var results = Vector.Create<double>(62, 77, 61, 94, 75, 82,
    86, 83, 64, 84, 68, 82, 72, 71, 85, 66, 61, 79, 81, 73);
var chiSquareTest = new OneSampleChiSquareTest(results, 5.0);
Console.WriteLine("Test statistic: {0:F4}", chiSquareTest.Statistic);
Console.WriteLine("P-value:        {0:F4}", chiSquareTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    chiSquareTest.Reject() ? "yes" : "no");

The value of the chi-square statistic is approximately 344.04 on 19 degrees of freedom, which lies far in the upper tail of the distribution, so the p-value is effectively zero. As a result, the hypothesis that the variance of the scores in this class equals 5 is rejected at the 0.05 level.
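If the question is specifically whether the variance is greater than 5, a one-tailed test can be requested through the optional HypothesisType argument of the constructor. The sketch below reuses the results vector from the example and, like the other snippets on this page, omits the using directives. It assumes the enumeration has a member named OneTailedUpper; only TwoTailed is named in the text above, so verify the exact member name against the API reference.

```csharp
// Assumption: HypothesisType.OneTailedUpper is the member that selects the
// upper-tailed alternative (variance greater than the proposed value).
var upperTailTest = new OneSampleChiSquareTest(results, 5.0,
    HypothesisType.OneTailedUpper);
Console.WriteLine("P-value (one-tailed): {0:F4}", upperTailTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    upperTailTest.Reject() ? "yes" : "no");
```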

Using pre-calculated values for the variance and sample size, the above example would look like this:

C#
double variance = results.Variance();
int sampleSize = 20;
chiSquareTest = new OneSampleChiSquareTest(sampleSize, variance, 5.0);

Once a OneSampleChiSquareTest object has been created, you can access other properties and methods common to all hypothesis test classes. For instance, to obtain a 95% confidence interval around the variance, the code would be:

C#
var varianceInterval = chiSquareTest.GetConfidenceInterval();
Console.WriteLine("95% Confidence interval for the variance: {0:F1} - {1:F1}",
    varianceInterval.LowerBound, varianceInterval.UpperBound);
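For reference, the textbook confidence interval for a variance under the normality assumption is built from chi-square quantiles. Whether GetConfidenceInterval uses exactly this form is an assumption here, not something the documentation above states:

```latex
\frac{(n-1)\,s^2}{\chi^2_{1-\alpha/2,\;n-1}}
\;\le\; \sigma^2 \;\le\;
\frac{(n-1)\,s^2}{\chi^2_{\alpha/2,\;n-1}}
```

Here $\chi^2_{p,\,n-1}$ denotes the $p$-quantile of the chi-square distribution with $n-1$ degrees of freedom, and $\alpha = 0.05$ for a 95% interval. Because the chi-square distribution is skewed, the interval is not symmetric around $s^2$.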

The F-Test

The F-test, also known as the F-ratio test, is a two sample test that is used to test the hypothesis that the variances of two populations are equal.

Definition

The null hypothesis ($H_0$) is that the two populations have the same variance ($\sigma_1^2 = \sigma_2^2$). The test statistic is the F-ratio, calculated as:

$$F = \frac{s_1^2}{s_2^2}$$

where $s_1^2$ and $s_2^2$ are the sample variances. Under the null hypothesis, the F-ratio follows an F-distribution with $n_1-1$ and $n_2-1$ degrees of freedom, where $n_1$ and $n_2$ are the sample sizes.
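The F-ratio is equally simple to compute by hand. The following is a minimal sketch using only System.Linq; SampleVariance and the variable names are illustrative helpers, not library members, and the two arrays hold the scores passed to Vector.Create in the example code on this page.

```csharp
using System;
using System.Linq;

// Scores of the two groups, copied from the example code on this page.
double[] group1 = { 62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
                    68, 82, 72, 71, 85, 66, 61, 79, 81, 73 };
double[] group2 = { 61, 80, 98, 90, 94, 65, 79, 75, 74, 86,
                    76, 85, 78, 72, 76, 79, 65, 92, 76, 80 };

// F-ratio: variance of the first sample divided by variance of the second.
double fRatio = SampleVariance(group1) / SampleVariance(group2);

Console.WriteLine("F-ratio: {0:F4}", fRatio);
Console.WriteLine("Degrees of freedom: {0} and {1}",
    group1.Length - 1, group2.Length - 1);

// Unbiased sample variance: sum of squared deviations divided by n - 1.
static double SampleVariance(double[] x)
{
    double mean = x.Average();
    return x.Sum(v => (v - mean) * (v - mean)) / (x.Length - 1);
}
```

The printed ratio can be compared with the Statistic property of an FTest object built from the same two samples.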

Assumptions

The F-test assumes that the samples are randomly selected from the populations, and that the populations themselves follow a normal distribution. If either of these assumptions is violated, the reliability of the F-test may be compromised.

The F-test was named in honor of Sir Ronald Fisher, who created the foundations for much of modern statistical analysis.

Applications

The F-test is used in various fields to compare the variances of two independent samples. It is commonly used in quality control, medical research, and social sciences.

The FTest class

The F-test is implemented by the FTest class. It has five constructors.

The first constructor takes no arguments. All test parameters must be provided by setting the properties of the FTest object.

The remaining four constructors can be divided into two pairs. The first pair takes 2 or 3 arguments. The first two arguments are Vector<T> objects that represent the samples the test is to be applied to. The first constructor of this pair takes only these two arguments and creates a two-tailed test for equality of variances. The second constructor of this pair takes a third argument: a HypothesisType value that specifies whether the test is one- or two-tailed. One-tailed F tests are very common.

The second pair of constructors take 4 or 5 arguments. The first four arguments are, in order, the degrees of freedom and variance of the numerator, and the degrees of freedom and variance of the denominator. The fifth parameter, if present, is once again a HypothesisType value that specifies whether the test is one or two-tailed.

Example

Once again, we use the same data as before. However, this time we compare the results of one group of students to the results of a second group of students, with these test scores:

61, 80, 98, 90, 94, 65, 79, 75, 74, 86, 76, 85, 78, 72, 76, 79, 65, 92, 76, 80

We want to test if the variances of the two populations are equal. The code below performs this test:

C#
var results2 = Vector.Create<double>(61, 80, 98, 90, 94, 65,
    79, 75, 74, 86, 76, 85, 78, 72, 76, 79, 65, 92, 76, 80);
var fTest = new FTest(results, results2);
Console.WriteLine("Test statistic: {0:F4}", fTest.Statistic);
Console.WriteLine("P-value:        {0:F4}", fTest.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    fTest.Reject() ? "yes" : "no");

The value of the F-statistic is 0.9573, giving a p-value of 0.5374. As a result, the hypothesis that the variance of the scores of students from the first group is no different from that of the second group is not rejected at the 0.05 level.

Using pre-calculated values for the degrees of freedom and the variances, the above example would look like this:

C#
double numeratorDegreesOfFreedom = results.Length - 1;
double numeratorVariance = results.Variance();
double denominatorDegreesOfFreedom = results2.Length - 1;
double denominatorVariance = results2.Variance();
fTest = new FTest(numeratorDegreesOfFreedom, numeratorVariance,
    denominatorDegreesOfFreedom, denominatorVariance);
