One-Way Anova in IronPython QuickStart Sample

Illustrates how to use the OneWayAnovaModel class to perform a one-way analysis of variance in IronPython.

This sample is also available in: C#, Visual Basic, F#.

Overview

This QuickStart sample demonstrates how to perform a one-way analysis of variance (ANOVA) using the OneWayAnovaModel class in Numerics.NET.

The sample analyzes a marketing dataset examining the effect of package color on product sales across 12 stores. The data includes sales figures for packages in three colors (red, blue, and green). Using one-way ANOVA, the sample shows how to:

  • Set up the sales data in an ADO.NET DataTable
  • Construct and compute a OneWayAnovaModel by naming the dependent variable (Sales) and the factor (Color)
  • Verify the balance of the experimental design
  • Generate and display a classic ANOVA table
  • Access group means and other statistics for each color group (treatment level)
  • Calculate overall summary statistics like the grand mean

The code demonstrates how to construct the model, check that the design is balanced, and extract both per-group and overall statistics from the analysis. This example is particularly relevant for researchers and analysts conducting experiments with a single categorical factor.
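To make the contents of the ANOVA table concrete, the following sketch reproduces the underlying arithmetic in plain Python for the same sales data. It does not use OneWayAnovaModel or any other part of the library; the names sales_by_color and one_way_anova are illustrative assumptions, and the layout of the library's own table may differ.

```python
# Plain-Python sketch of the arithmetic behind a one-way ANOVA table.
# It does not use the library; all names here are illustrative only.

# Sales per package color, taken from the sample's data table.
sales_by_color = {
    "Blue": [6.0, 14.0, 19.0, 17.0],
    "Red": [18.0, 11.0, 20.0, 23.0],
    "Green": [7.0, 11.0, 18.0, 10.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def one_way_anova(groups):
    """Return the main entries of a one-way ANOVA table as a dictionary."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)

    # Between-groups sum of squares: squared distance of each group mean
    # from the grand mean, weighted by the group size.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-groups sum of squares: squared distance of each observation
    # from the mean of its own group.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f": ms_between / ms_within,
    }


table = one_way_anova(sales_by_color)
print("Between groups: SS = {0:.1f}, df = {1}".format(
    table["ss_between"], table["df_between"]))
print("Within groups:  SS = {0:.1f}, df = {1}".format(
    table["ss_within"], table["df_within"]))
print("F statistic:    {0:.4f}".format(table["f"]))
```

For this data the sums of squares are 86 between groups (2 degrees of freedom) and 241 within groups (9 degrees of freedom), which gives an F statistic of about 1.606.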

The code

import numerics

from System import Array

from Extreme.Statistics import *

# Illustrates the use of the OneWayAnovaModel class for performing 
# a one-way analysis of variance.

# This QuickStart Sample investigates the effect of the color of packages
# on the sales of the product. The data comes from 12 stores.
# Packages can be either red, green or blue.

# Set up the data in an ADO.NET data table.
import clr
clr.AddReference('System.Data')
from System.Data import DataTable

dataTable = DataTable()
dataTable.Columns.Add("Store", int)
dataTable.Columns.Add("Color", str)
dataTable.Columns.Add("Shape", str)
dataTable.Columns.Add("Sales", float)

dataTable.Rows.Add(Array[object]([1, "Blue", "Square", 6]))
dataTable.Rows.Add(Array[object]([2, "Blue", "Square", 14]))
dataTable.Rows.Add(Array[object]([3, "Blue", "Rectangle", 19]))
dataTable.Rows.Add(Array[object]([4, "Blue", "Rectangle", 17]))

dataTable.Rows.Add(Array[object]([5, "Red", "Square", 18]))
dataTable.Rows.Add(Array[object]([6, "Red", "Square", 11]))
dataTable.Rows.Add(Array[object]([7, "Red", "Rectangle", 20]))
dataTable.Rows.Add(Array[object]([8, "Red", "Rectangle", 23]))

dataTable.Rows.Add(Array[object]([9, "Green", "Square", 7]))
dataTable.Rows.Add(Array[object]([10, "Green", "Square", 11]))
dataTable.Rows.Add(Array[object]([11, "Green", "Rectangle", 18]))
dataTable.Rows.Add(Array[object]([12, "Green", "Rectangle", 10]))

# Construct the OneWayAnovaModel object.
anova = OneWayAnovaModel(dataTable, "Sales", "Color")
# Verify that the design is balanced:
if not anova.IsBalanced:
	print "The design is not balanced."
# Perform the calculation.
anova.Compute()
			
# The AnovaTable property gives us a classic anova table.
# We can write the table directly to the console:
print anova.AnovaTable.ToString()
print 
			
# A Cell object represents the data in a cell of the model,
# i.e. the data related to one level of the factor.
# We can use it to access the group means of our color groups.

# First we get the CategoricalScale object so we can easily iterate
# through the levels:
colorFactor = anova.GetFactor(0)
for level in colorFactor.GetLevels():
	print "Mean for group '{0}': {1:.4f}".format(level, anova.Cells[level].Mean)
			
# We could have accessed the cells directly as well:
print "Variance for blue packages:", anova.Cells["Blue"].Variance
print 
		
# We can get the summary data for the entire model
# by using the special index 'Cell.All':
totalSummary = anova.Cells[Cell.All]
print "Summary data:"
print "# observations:", totalSummary.Count
print "Grand mean:    ", totalSummary.Mean
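
As a cross-check on the group statistics printed by the sample, the following plain-Python sketch computes the same quantities by hand. It is independent of the library, and it assumes that the Variance property reports the sample variance (divisor n - 1); the names used here are illustrative only.

```python
# Plain-Python cross-check of the per-group and overall statistics.
# Assumption: the library's Variance is the sample variance (divisor n - 1).

# Sales per package color, taken from the sample's data table.
sales_by_color = {
    "Blue": [6.0, 14.0, 19.0, 17.0],
    "Red": [18.0, 11.0, 20.0, 23.0],
    "Green": [7.0, 11.0, 18.0, 10.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


# A design is balanced when every level has the same number of observations.
group_sizes = [len(values) for values in sales_by_color.values()]
is_balanced = len(set(group_sizes)) == 1

group_means = dict(
    (color, mean(values)) for color, values in sales_by_color.items()
)
blue_variance = sample_variance(sales_by_color["Blue"])

all_sales = [v for values in sales_by_color.values() for v in values]
total_count = len(all_sales)
grand_mean = mean(all_sales)

print("Balanced design: {0}".format(is_balanced))
for color in ["Blue", "Red", "Green"]:
    print("Mean for group '{0}': {1:.4f}".format(color, group_means[color]))
print("Variance for blue packages: {0:.4f}".format(blue_variance))
print("# observations: {0}".format(total_count))
print("Grand mean:     {0:.4f}".format(grand_mean))
```

With the sample data, the group means are 14 (Blue), 18 (Red), and 11.5 (Green), the sample variance of the blue group is 98/3 (about 32.667), and the grand mean over all 12 observations is 14.5.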