One-Way ANOVA with Repeated Measures

An ANOVA design where multiple observations are made for each subject is called a repeated measures design. A design with one factor and one subject variable is a one-way ANOVA model with repeated measures, sometimes shortened to RANOVA. The one-way analysis of variance with repeated measures is implemented by the OneWayRAnovaModel class.

Constructing One-Way Repeated Measures ANOVA Models

The OneWayRAnovaModel class has two constructors. The first constructor takes three arguments: a Vector<T> that specifies the dependent variable, and two CategoricalVector<T> objects, one that specifies the independent variable or factor, and one that specifies the subjects. All three variables must have the same number of observations.

As an example, we construct an ANOVA model to measure the effect of four treatments on five people:

C#
var person = Vector.Create(20, i => i / 4).AsCategorical();
var treatment = Vector.Create(20, i => i % 4).AsCategorical();
var score = Vector.Create(new double[] {
    30, 28, 16, 34,
    14, 18, 10, 22,
    24, 20, 18, 30,
    38, 34, 20, 44,
    26, 28, 14, 30 });
var anova1 = new OneWayRAnovaModel(score, treatment, person);
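
The index expressions i / 4 and i % 4 determine how the 20 scores are assigned to people and treatments: each row of four scores belongs to one person, and each column to one treatment. The following is a minimal sketch of that mapping in plain C#, without the library. The LayoutSketch class and its PersonOf and TreatmentOf helpers are hypothetical names introduced only for this illustration.

```csharp
using System;

public static class LayoutSketch
{
    // Integer division groups consecutive observations by person:
    // observations 0..3 belong to person 0, 4..7 to person 1, and so on.
    public static int PersonOf(int observation, int treatments) => observation / treatments;

    // The remainder cycles through the treatments within each person.
    public static int TreatmentOf(int observation, int treatments) => observation % treatments;

    public static void Main()
    {
        const int people = 5, treatments = 4;
        for (int i = 0; i < people * treatments; i++)
            Console.WriteLine("observation {0,2}: person {1}, treatment {2}",
                i, PersonOf(i, treatments), TreatmentOf(i, treatments));
    }
}
```

With this layout, the score 22 (observation 7) is recorded for person 1 under treatment 3.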

The second constructor takes four arguments. The first argument is an IDataFrame (a DataFrame<R, C> or Matrix<T>) that contains the variables you wish to use in the analysis. The second argument is the name of the dependent variable in the data frame. The third argument is the name of the independent variable or factor. The fourth argument is the name of the variable in the data frame that represents the subjects. Using the variables we created in the previous example, we get:

C#
var dataFrame = DataFrame.FromColumns(new Dictionary<string, object>()
{ {"score", score }, {"treatment", treatment }, { "person", person } });
var anova2 = new OneWayRAnovaModel(dataFrame, "score", "treatment", "person");

Performing the Analysis

The results of the analysis can be obtained through the model's AnovaTable property. The ANOVA table for a one-way design with repeated measures has four rows, commonly labeled 'Between subjects,' 'Between groups,' 'Within groups,' and 'Total.'

The most important information is in the Between groups row, which shows the contribution of the factor under consideration to the total variation. It is the first row in the ANOVA table, and therefore has index 0. The Between subjects row shows the variation due to the subjects. It has index 1. Both rows can be retrieved by index through the GetModelRow method.

The Within groups row shows the variation of the data around the group means. It corresponds to the error or residual variation that remains in the data after the model has been taken into account. The row is available through the ANOVA table's ErrorRow property.

The Total row contains the summary data for the entire data set. It can be retrieved through the TotalRow property of the ANOVA table.

The AnovaModelRow objects obtained in this way show the results of the test for significance of the variation due to the factor compared to the variation not explained by the factor. The FStatistic property gives the value of the F statistic for this ratio, while the PValue property gives the significance of the F statistic.
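
For reference, the rows of the table correspond to the usual textbook decomposition of the variation in a one-way repeated measures design. The formulas below are the standard definitions, not taken from the library's documentation, and the symbols are introduced here for illustration: y_ij is the observation for subject i under group j, with n subjects and k groups, and a dot in a subscript denotes a mean over that index.

```latex
\begin{aligned}
SS_{\text{total}}    &= \sum_{i=1}^{n}\sum_{j=1}^{k} \left(y_{ij} - \bar{y}_{..}\right)^2
                     \qquad (\text{df} = nk - 1) \\
SS_{\text{groups}}   &= n \sum_{j=1}^{k} \left(\bar{y}_{.j} - \bar{y}_{..}\right)^2
                     \qquad (\text{df} = k - 1) \\
SS_{\text{subjects}} &= k \sum_{i=1}^{n} \left(\bar{y}_{i.} - \bar{y}_{..}\right)^2
                     \qquad (\text{df} = n - 1) \\
SS_{\text{error}}    &= SS_{\text{total}} - SS_{\text{groups}} - SS_{\text{subjects}}
                     \qquad (\text{df} = (n - 1)(k - 1)) \\
F_{\text{groups}}    &= \frac{SS_{\text{groups}} / (k - 1)}{SS_{\text{error}} / \left((n - 1)(k - 1)\right)},
\qquad
F_{\text{subjects}}   = \frac{SS_{\text{subjects}} / (n - 1)}{SS_{\text{error}} / \left((n - 1)(k - 1)\right)}
\end{aligned}
```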

The example below illustrates these properties:

C#
anova1.Fit();
var anovaTable = anova1.AnovaTable;
Console.WriteLine("F statistic: {0}", anovaTable.CompleteModelRow.FStatistic);
Console.WriteLine("P-value     : {0}", anovaTable.CompleteModelRow.PValue);
Console.WriteLine("Sum of sq. total: {0}",
    anovaTable.TotalRow.SumOfSquares);
Console.WriteLine("Sum of sq. error: {0}",
    anovaTable.ErrorRow.SumOfSquares);
Console.WriteLine("Sum of sq. treatment: {0}",
    anovaTable.GetModelRow(0).SumOfSquares);
Console.WriteLine("Sum of sq. subjects: {0}",
    anovaTable.GetModelRow(1).SumOfSquares);
Console.WriteLine(anovaTable.ToString());

For our example above, we find that the between groups F statistic is 24.7589 and the between subjects F statistic is 18.1064. Both correspond to tiny p-values of less than 0.0001. We conclude that the treatments have a significantly different effect, and that there is also significant variation between the subjects.
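
The sums of squares and F statistics for this data set can be cross-checked by hand. The sketch below uses the standard textbook formulas in plain C# and does not call the library. The RanovaByHand class and its Compute method are hypothetical names introduced for this illustration, and nothing here is meant to describe how OneWayRAnovaModel computes its results internally.

```csharp
using System;

public static class RanovaByHand
{
    // Scores from the example: one row per person, one column per treatment.
    public static readonly double[,] Scores =
    {
        { 30, 28, 16, 34 },
        { 14, 18, 10, 22 },
        { 24, 20, 18, 30 },
        { 38, 34, 20, 44 },
        { 26, 28, 14, 30 }
    };

    // scores[i, j] is the score of subject i under treatment j.
    public static (double SsTotal, double SsTreatment, double SsSubjects,
                   double SsError, double FTreatment, double FSubjects)
        Compute(double[,] scores)
    {
        int n = scores.GetLength(0);   // number of subjects
        int k = scores.GetLength(1);   // number of treatments

        double grandTotal = 0, sumOfSquaredScores = 0;
        var subjectTotals = new double[n];
        var treatmentTotals = new double[k];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < k; j++)
            {
                double y = scores[i, j];
                grandTotal += y;
                sumOfSquaredScores += y * y;
                subjectTotals[i] += y;
                treatmentTotals[j] += y;
            }

        // Correction term: (grand total)^2 / (number of observations).
        double correction = grandTotal * grandTotal / (n * k);
        double ssTotal = sumOfSquaredScores - correction;

        double ssTreatment = -correction;
        foreach (double t in treatmentTotals) ssTreatment += t * t / n;

        double ssSubjects = -correction;
        foreach (double s in subjectTotals) ssSubjects += s * s / k;

        // Whatever is explained by neither treatments nor subjects is error.
        double ssError = ssTotal - ssTreatment - ssSubjects;
        double msError = ssError / ((n - 1) * (k - 1));

        double fTreatment = ssTreatment / (k - 1) / msError;
        double fSubjects = ssSubjects / (n - 1) / msError;
        return (ssTotal, ssTreatment, ssSubjects, ssError, fTreatment, fSubjects);
    }

    public static void Main()
    {
        var r = Compute(Scores);
        Console.WriteLine("Sum of sq. total:     {0:F4}", r.SsTotal);      // about 1491.8
        Console.WriteLine("Sum of sq. treatment: {0:F4}", r.SsTreatment);  // about 698.2
        Console.WriteLine("Sum of sq. subjects:  {0:F4}", r.SsSubjects);   // about 680.8
        Console.WriteLine("Sum of sq. error:     {0:F4}", r.SsError);      // about 112.8
        Console.WriteLine("F (treatments):       {0:F4}", r.FTreatment);   // about 24.7589
        Console.WriteLine("F (subjects):         {0:F4}", r.FSubjects);    // about 18.1064
    }
}
```

The treatment and subject rows have 3 and 4 degrees of freedom, respectively, and the error row has 12, so each F statistic is a mean square divided by the error mean square of 9.4.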

The means for combinations of subjects and treatments can be accessed through the model's Cells property, a Matrix<T> of Cell objects. The totals for the treatments and subjects can be accessed through the TreatmentTotals and SubjectTotals properties, respectively. In the example below, we print the group means for each treatment:

C#
var totals = anova1.TreatmentTotals;
var factor = anova1.GetFactor(0);
for (int i = 0; i < factor.Length; i++)
    Console.WriteLine("Mean for treatment '{0}': {1:F4}",
        factor[i], totals[i].Mean);

The TotalCell property returns the cell with totals for the complete data. The grand mean can be obtained from this cell:

C#
Console.WriteLine("Grand mean: {0:F4}", anova1.TotalCell.Mean);
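
As a sanity check on the values these properties should report, the treatment means and the grand mean can also be computed directly from the raw scores. This is a sketch in plain C# that does not use the library. The MeansByHand class and its TreatmentMeans and GrandMean helpers are hypothetical names introduced for this illustration.

```csharp
using System;

public static class MeansByHand
{
    // Scores from the example: one row per person, one column per treatment.
    public static readonly double[,] Scores =
    {
        { 30, 28, 16, 34 },
        { 14, 18, 10, 22 },
        { 24, 20, 18, 30 },
        { 38, 34, 20, 44 },
        { 26, 28, 14, 30 }
    };

    // Mean of each column, that is, the mean score for each treatment.
    public static double[] TreatmentMeans(double[,] scores)
    {
        int n = scores.GetLength(0), k = scores.GetLength(1);
        var means = new double[k];
        for (int j = 0; j < k; j++)
        {
            double sum = 0;
            for (int i = 0; i < n; i++) sum += scores[i, j];
            means[j] = sum / n;
        }
        return means;
    }

    // Mean of all observations.
    public static double GrandMean(double[,] scores)
    {
        double sum = 0;
        foreach (double y in scores) sum += y;
        return sum / scores.Length;
    }

    public static void Main()
    {
        double[] means = TreatmentMeans(Scores);
        for (int j = 0; j < means.Length; j++)
            Console.WriteLine("Mean for treatment '{0}': {1:F4}", j, means[j]);
        // Treatment means: 26.4, 25.6, 15.6 and 32.
        Console.WriteLine("Grand mean: {0:F4}", GrandMean(Scores));  // 24.9
    }
}
```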