Testing Homogeneity of Variances

One of the assumptions underlying Analysis of Variance is that the variances across groups are identical. This property is called homogeneity of variances or homoscedasticity. It is often desirable to verify this assumption using an appropriate hypothesis test. The two most common tests are Bartlett's test and Levene's test.

Bartlett's Test

Bartlett's test is a relatively fast test for homogeneity of variances. The test is based on the assumption that the samples are normally distributed. It is sensitive to violations of this assumption. In practical terms, this means that Bartlett's test cannot adequately distinguish between violation of homogeneity of variances and violation of the normality assumption.

The null hypothesis is always that the variances of all groups are equal. The alternative hypothesis is that at least one of the variances is different. Bartlett's test is always one-tailed, and uses a chi-square statistic.
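For reference, the statistic can be written out explicitly. The following is the standard textbook form of Bartlett's statistic, given here as a sketch of what the test computes rather than as a description of the library's internals. The symbols are assumptions introduced for this formula: k is the number of groups, n_i and s_i^2 are the size and sample variance of group i, N is the total number of observations, and s_p^2 is the pooled variance.

```latex
\[
T = \frac{(N-k)\,\ln s_p^2 \;-\; \sum_{i=1}^{k} (n_i - 1)\,\ln s_i^2}
         {1 + \dfrac{1}{3(k-1)}\left(\sum_{i=1}^{k} \dfrac{1}{n_i - 1} - \dfrac{1}{N-k}\right)},
\qquad
s_p^2 = \frac{1}{N-k}\sum_{i=1}^{k} (n_i - 1)\, s_i^2 .
\]
```

Under the null hypothesis, T approximately follows a chi-square distribution with k - 1 degrees of freedom. Large values of T indicate unequal variances, which is why only the upper tail is used.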

Bartlett's test is implemented by the BartlettTest class. It has three constructors. The first constructor takes no arguments. In this case, the data and conditions for the test must be specified by setting properties of the BartlettTest object. The second constructor takes an array of Vector<T> objects that contain the samples the test is to be applied to. The third constructor takes two arguments. The first is a vector containing the data for all the samples. The second argument is an IGrouping object (such as a CategoricalVector<T>) that specifies how the values in the first argument are to be grouped.

Example

We start with a collection of measurements of gear diameters from 10 batches. We want to verify that the variances of the diameters for the batches are equal. The data comes in two variables: one numerical vector with the measured diameters and one categorical that specifies the corresponding batch. If the batch vector is categorical, it can be used directly to group the diameters. Alternatively, we can split the diameter vector according to the batch:

C#
var batch = Vector.Create(100, i => 1 + i / 10).AsCategorical();
var diameter = Vector.Create(
    1.006, 0.996, 0.998, 1.000, 0.992, 0.993, 1.002, 0.999, 0.994, 1.000,
    0.998, 1.006, 1.000, 1.002, 0.997, 0.998, 0.996, 1.000, 1.006, 0.988,
    0.991, 0.987, 0.997, 0.999, 0.995, 0.994, 1.000, 0.999, 0.996, 0.996,
    1.005, 1.002, 0.994, 1.000, 0.995, 0.994, 0.998, 0.996, 1.002, 0.996,
    0.998, 0.998, 0.982, 0.990, 1.002, 0.984, 0.996, 0.993, 0.980, 0.996,
    1.009, 1.013, 1.009, 0.997, 0.988, 1.002, 0.995, 0.998, 0.981, 0.996,
    0.990, 1.004, 0.996, 1.001, 0.998, 1.000, 1.018, 1.010, 0.996, 1.002,
    0.998, 1.000, 1.006, 1.000, 1.002, 0.996, 0.998, 0.996, 1.002, 1.006,
    1.002, 0.998, 0.996, 0.995, 0.996, 1.004, 1.004, 0.998, 0.999, 0.991,
    0.991, 0.995, 0.984, 0.994, 0.997, 0.997, 0.991, 0.998, 1.004, 0.997);
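// Option 1: pass the data together with the categorical grouping vector.
BartlettTest bartlett = new BartlettTest(diameter, batch);
// Option 2: split the data into one vector per batch first.
var variables = diameter.SplitBy(batch).ToArray();
BartlettTest bartlett2 = new BartlettTest(variables);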

We can then run the test:

C#
Console.WriteLine("Test statistic: {0:F4}", bartlett.Statistic);
Console.WriteLine("P-value:        {0:F4}", bartlett.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    bartlett.Reject() ? "yes" : "no");

The value of the chi-square statistic is 20.7859 giving a p-value of 0.0136. As a result, the hypothesis that the variances are equal is rejected at the 0.05 level.

Once a BartlettTest object has been created, you can access other properties and methods common to all hypothesis test classes. For instance, to obtain the critical values for significance levels of 0.05 and 0.01, the code would be:

C#
Console.WriteLine("Critical value: {0:F4} at 95%",
    bartlett.GetUpperCriticalValue(0.05));
Console.WriteLine("Critical value: {0:F4} at 99%",
    bartlett.GetUpperCriticalValue(0.01));

The critical values (16.9190 at 0.05 and 21.6660 at 0.01) show that the null hypothesis is rejected at the 0.05 level but not at the 0.01 level, since the test statistic of 20.7859 lies between them.
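To see what the test computes, the statistic can also be evaluated directly from the standard textbook formula. The following is a minimal sketch, not the library's implementation: the class BartlettSketch and its Statistic method are hypothetical names, and the input is assumed to be plain double arrays, one per group, rather than Vector<T> objects.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of the library.
public static class BartlettSketch
{
    // Unbiased sample variance (divides by n - 1).
    static double Variance(double[] x)
    {
        double mean = x.Average();
        return x.Sum(v => (v - mean) * (v - mean)) / (x.Length - 1);
    }

    // Bartlett's chi-square statistic for an array of groups.
    public static double Statistic(double[][] groups)
    {
        int k = groups.Length;               // number of groups
        int n = groups.Sum(g => g.Length);   // total number of observations
        double[] variances = groups.Select(g => Variance(g)).ToArray();

        // Pooled variance: group variances weighted by degrees of freedom.
        double pooled = groups
            .Select((g, i) => (g.Length - 1) * variances[i])
            .Sum() / (n - k);

        double numerator = (n - k) * Math.Log(pooled)
            - groups.Select((g, i) => (g.Length - 1) * Math.Log(variances[i])).Sum();

        // Small-sample correction factor in the denominator.
        double correction = 1.0
            + (groups.Sum(g => 1.0 / (g.Length - 1)) - 1.0 / (n - k)) / (3.0 * (k - 1));

        return numerator / correction;
    }
}
```

The result is compared with a chi-square distribution with k - 1 degrees of freedom. For the 10 batches in the example, this means 9 degrees of freedom, which is where the critical values 16.9190 and 21.6660 come from.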

Levene's Test

Levene's test is a slower but more robust test for homogeneity of variances. Levene's test is much less influenced by departures from normality than Bartlett's test. For this reason, it is often the test of choice.

As with Bartlett's test, the null hypothesis is always that the variances of all groups are equal. The alternative hypothesis is that at least one of the variances is different. Levene's test is always one-tailed, and uses an F statistic.
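The F statistic can be written out as follows. This is the standard textbook form, given as a sketch rather than as a description of the library's internals. The symbols are assumptions introduced for this formula: Y_ij is observation j in group i, \tilde{Y}_i is a measure of location of group i, k is the number of groups, n_i is the size of group i, and N is the total number of observations.

```latex
\[
Z_{ij} = \left| Y_{ij} - \tilde{Y}_i \right|,
\qquad
W = \frac{N-k}{k-1}\cdot
    \frac{\sum_{i=1}^{k} n_i \left(\bar{Z}_{i\cdot} - \bar{Z}_{\cdot\cdot}\right)^2}
         {\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(Z_{ij} - \bar{Z}_{i\cdot}\right)^2}
\]
```

Here \bar{Z}_{i\cdot} is the mean of the absolute deviations in group i and \bar{Z}_{\cdot\cdot} is their overall mean. Under the null hypothesis, W approximately follows an F distribution with k - 1 and N - k degrees of freedom. In effect, the test is a one-way analysis of variance applied to the absolute deviations.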

Levene's test comes in three flavors, depending on the measure of location used in the calculation of the statistic. The options are enumerated by the LeveneTestLocationMeasure enumeration:

Value          Description
Median         The median of the data is used as the location measure. This gives better results when the data is skewed.
Mean           The mean of the data is used as the location measure. This works best for symmetric, moderate-tailed data such as normal data.
TrimmedMean    The 10% trimmed mean is used as the location measure. This gives better results when the data is heavy-tailed.

If no value is specified, the median is used.
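The effect of the location measure is easiest to see when the statistic is computed by hand. The following is a minimal sketch, not the library's implementation: LeveneSketch and its members are hypothetical names, the groups are assumed to be plain double arrays, and the trimmed mean is assumed to drop 10% of the observations from each tail, which may differ from the library's definition.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of the library.
public static class LeveneSketch
{
    public static double Median(double[] x)
    {
        double[] s = x.OrderBy(v => v).ToArray();
        int m = s.Length / 2;
        return s.Length % 2 == 1 ? s[m] : 0.5 * (s[m - 1] + s[m]);
    }

    public static double Mean(double[] x) => x.Average();

    // Assumption: "10% trimmed" drops 10% of the values from each tail.
    public static double TrimmedMean(double[] x)
    {
        int cut = (int)Math.Floor(0.1 * x.Length);
        return x.OrderBy(v => v).Skip(cut).Take(x.Length - 2 * cut).Average();
    }

    // Levene's F statistic for the given groups and location measure.
    public static double Statistic(double[][] groups, Func<double[], double> location)
    {
        int k = groups.Length;               // number of groups
        int n = groups.Sum(g => g.Length);   // total number of observations

        // Absolute deviations of each value from its group's location.
        double[][] z = groups.Select(g =>
        {
            double center = location(g);
            return g.Select(v => Math.Abs(v - center)).ToArray();
        }).ToArray();

        double grand = z.SelectMany(g => g).Average();
        double between = 0.0, within = 0.0;
        foreach (double[] g in z)
        {
            double m = g.Average();
            between += g.Length * (m - grand) * (m - grand);
            within += g.Sum(v => (v - m) * (v - m));
        }

        return (n - k) * between / ((k - 1) * within);
    }
}
```

Passing LeveneSketch.Median, LeveneSketch.Mean or LeveneSketch.TrimmedMean as the second argument switches between the three flavors without changing the rest of the calculation.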

Levene's test is implemented by the LeveneTest class. It has five constructors. The first constructor takes no arguments. The data and conditions for the test must be specified by setting properties of the LeveneTest object. The second constructor takes an array of Vector<T> objects that contain the samples the test is to be applied to. The third constructor takes one additional argument: a LeveneTestLocationMeasure value that specifies which measure of location to use in the calculation of the test statistic. This value can also be accessed and set through the LocationMeasure property.

The fourth constructor takes two arguments. The first is a vector containing the data for all the samples. The second argument is an IGrouping object (such as a CategoricalVector<T>) that specifies how the values in the first argument are to be grouped. The fifth constructor is like the fourth but takes an additional argument to specify the measure of location.

Example

We start from the same data as before: a collection of measurements of gear diameters from 10 batches. We want to verify that the variances of the diameters for the batches are equal. See the example with Bartlett's test for an illustration of how to prepare the data.

Here, we show how to create the LeveneTest object, and run the test:

C#
var levene = new LeveneTest(diameter, batch);
Console.WriteLine("Test statistic: {0:F4}", levene.Statistic);
Console.WriteLine("P-value:        {0:F4}", levene.PValue);
Console.WriteLine("Reject null hypothesis? {0}",
    levene.Reject() ? "yes" : "no");

The value of the F statistic is 1.7059 giving a p-value of 0.0991. As a result, the hypothesis that the variances are equal is not rejected at the 0.05 level.

The outcome of Levene's test is clearly different from that of Bartlett's test for the same data. The reason is most likely that the data are not distributed normally. Bartlett's test cannot distinguish non-homogeneity from departure from normality.

Once a LeveneTest object has been created, you can access other properties and methods common to all hypothesis test classes. For instance, to obtain the critical values for significance levels of 0.05 and 0.1, the code would be:

C#
Console.WriteLine("Critical value: {0:F4} at 95%",
    levene.GetUpperCriticalValue(0.05));
Console.WriteLine("Critical value: {0:F4} at 90%",
    levene.GetUpperCriticalValue(0.1));

The critical values (1.9856 at 0.05 and 1.7021 at 0.1) show that the null hypothesis is not rejected at the 0.05 level. It is rejected at the 0.1 level, but only narrowly: the test statistic of 1.7059 just exceeds 1.7021.