Mean Tests in IronPython QuickStart Sample

Illustrates how to use various tests for the mean of one or more samples using classes in the Numerics.NET.Statistics.Tests namespace in IronPython.

This sample is also available in: C#, Visual Basic, F#.

Overview

This QuickStart sample demonstrates how to perform statistical hypothesis testing on sample means using the Numerics.NET.Statistics.Tests namespace.

The sample works with a scenario involving test scores from two groups of students compared against a national average. It shows three different approaches to testing means:

One-sample z-test: Used to compare a sample mean to a population mean when the population standard deviation is known. The sample demonstrates how to:
- Create a test using sample data and population parameters
- Calculate and interpret test statistics and p-values
- Work with different significance levels
- Generate confidence intervals

One-sample t-test: Used to compare a sample mean to a population mean when the population standard deviation is unknown. The sample shows how to:
- Construct and execute the test
- Interpret the test results
- Generate confidence intervals

Two-sample t-test: Used to compare the means of two independent samples. The sample shows how to:
- Set up the test with two sample groups
- Interpret the results to determine whether the groups differ significantly
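
To make the z-test concrete, the statistic and p-value can be worked out by hand. The following is a minimal sketch in plain Python using only the standard library, not the Numerics.NET API. It uses the group 1 scores and national parameters from the sample code; the variable names are illustrative.

```python
import math

# Group 1 scores and population parameters, copied from the sample code.
scores = [62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
          68, 82, 72, 71, 85, 66, 61, 79, 81, 73]
national_mean = 79.3
national_sd = 7.3

n = len(scores)
sample_mean = sum(scores) / float(n)

# Standard error of the mean when the population standard deviation is known.
standard_error = national_sd / math.sqrt(n)

# z statistic: distance of the sample mean from the hypothesized mean,
# measured in standard errors.
z = (sample_mean - national_mean) / standard_error

# Two-sided p-value from the standard normal distribution:
# P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2.0))

print("z = {0:.4f}, p = {1:.4f}".format(z, p_value))
for alpha in (0.05, 0.01):
    decision = "yes" if p_value < alpha else "no"
    print("alpha = {0:.2f}: reject null hypothesis? {1}".format(alpha, decision))
```

In this hand computation the p-value falls between 0.01 and 0.05, so the decision depends on the significance level chosen. This is why the sample evaluates the test at both levels.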

Throughout the sample, you’ll learn how to:

- Work with numerical data using Vector objects
- Calculate basic statistics like means and standard deviations
- Set up and run different types of hypothesis tests
- Interpret test results using p-values and confidence intervals
- Make statistical decisions using different significance levels
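
The link between confidence intervals and significance levels can be illustrated with a short calculation. This is a sketch in standard Python 3.8+ (it relies on `statistics.NormalDist`), not the library's implementation. The helper `z_interval` is a hypothetical name, and the sample mean of 75.3 is the mean of the group 1 scores from the sample code.

```python
import math
from statistics import NormalDist  # available in Python 3.8 and later

sample_mean = 75.3     # mean of the 20 group 1 scores
national_mean = 79.3   # hypothesized population mean
national_sd = 7.3      # known population standard deviation
n = 20

standard_error = national_sd / math.sqrt(n)


def z_interval(confidence_level):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1.0 - confidence_level  # the significance level
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = critical * standard_error
    return sample_mean - margin, sample_mean + margin


for level in (0.95, 0.99):
    lower, upper = z_interval(level)
    # The two-sided test at significance 1 - level rejects exactly when
    # the hypothesized mean lies outside this interval.
    reject = not (lower <= national_mean <= upper)
    print("{0:.0%} interval: {1:.1f} - {2:.1f}, reject? {3}".format(
        level, lower, upper, "yes" if reject else "no"))
```

A confidence level of 0.95 corresponds to a significance level of 0.05, and 0.99 corresponds to 0.01.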

The code

import numerics

from Extreme.Mathematics import *
from Extreme.Statistics import *
from Extreme.Statistics.Tests import *

# Demonstrates how to use hypothesis tests for the mean 
# of one or two distributions.

# This QuickStart Sample uses the scores obtained by two groups
# of students on a national test.
# 
# We want to know if the scores for these two groups of students
# are significantly different from the national average, and
# from each other.

# The mean and standard deviation of the complete population:
nationalMean = 79.3
nationalStandardDeviation = 7.3

print "Tests for group 1"

# First we create a NumericalVariable that holds the test scores.
group1Data = Vector([
    62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
    68, 82, 72, 71, 85, 66, 61, 79, 81, 73])
group1Results = NumericalVariable("Class 1", group1Data)
			
# We can get the mean and standard deviation of the group right away:
print "Mean for the group: {0:.1f}".format(group1Results.Mean)
print "Standard deviation: {0:.1f}".format(group1Results.StandardDeviation)
			
#
# One Sample z-test
#

print "\nUsing z-test:"
# We know the population standard deviation, so we can use the z-test,
# implemented by the OneSampleZTest class. We pass the sample variable
# and the population parameters to the constructor.
zTest = OneSampleZTest(group1Results, nationalMean, nationalStandardDeviation)
# We can obtain the value of the test statistic through the Statistic property,
# and the corresponding p-value through the PValue property:
print "Test statistic: {0:.4f}".format(zTest.Statistic)
print "P-value:        {0:.4f}".format(zTest.PValue)

# The significance level is the default value of 0.05:
print "Significance level:     {0:.2f}".format(zTest.SignificanceLevel)
# We can now print the test results:
print "Reject null hypothesis?", "yes" if zTest.Reject() else "no"
# We can get a confidence interval for the current significance level:
meanInterval = zTest.GetConfidenceInterval()
print "95% Confidence interval for the mean: {0:.1f} - {1:.1f}".format(meanInterval.LowerBound, meanInterval.UpperBound)

# We can get the same results for the 0.01 significance level by explicitly
# passing the significance level as a parameter to these methods:
print "Significance level:     {0:.2f}".format(0.01)
print "Reject null hypothesis?", "yes" if zTest.Reject(0.01) else "no"
# The GetConfidenceInterval method needs the confidence level, which equals
# 1 - the significance level:
meanInterval = zTest.GetConfidenceInterval(0.99)
print "99% Confidence interval for the mean: {0:.1f} - {1:.1f}".format(meanInterval.LowerBound, meanInterval.UpperBound)


# 
# One sample t-test
#

print "\nUsing t-test:"
# Suppose we only know the mean of the national scores,
# not the standard deviation. In this case, a t-test is
# the appropriate test to use.
tTest = OneSampleTTest(group1Results, nationalMean)
# We can obtain the value of the test statistic through the Statistic property,
# and the corresponding p-value through the PValue property:
print "Test statistic: {0:.4f}".format(tTest.Statistic)
print "P-value:        {0:.4f}".format(tTest.PValue)

# The significance level is the default value of 0.05:
print "Significance level:     {0:.2f}".format(tTest.SignificanceLevel)
# We can now print the test results:
print "Reject null hypothesis?", "yes" if tTest.Reject() else "no"
# We can get a confidence interval for the current significance level:
meanInterval = tTest.GetConfidenceInterval()
print "95% Confidence interval for the mean: {0:.1f} - {1:.1f}".format(meanInterval.LowerBound, meanInterval.UpperBound)

			
# 
# Two sample t-test
#

print "\nUsing two-sample t-test:"
# We want to compare the scores of the first group to the scores 
# of a second group from the same school. Once again, we start
# by creating a NumericalVariable containing the scores:
group2Data = Vector([
    61, 80, 98, 90, 94, 65, 79, 75, 74, 86,
    76, 85, 78, 72, 76, 79, 65, 92, 76, 80])
group2Results = NumericalVariable("Class 2", group2Data)

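# To compare the means of the two groups, we need the two-sample
# t-test, implemented by the TwoSampleTTest class. The SamplePairing
# and VarianceAssumption arguments specify how the samples are related
# and what is assumed about their variances: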
tTest2 = TwoSampleTTest(group1Results, group2Results, SamplePairing.Paired, VarianceAssumption.None)
# We can obtain the value of the test statistic through the Statistic property,
# and the corresponding p-value through the PValue property:
print "Test statistic: {0:.4f}".format(tTest2.Statistic)
print "P-value:        {0:.4f}".format(tTest2.PValue)

# The significance level is the default value of 0.05:
print "Significance level:     {0:.2f}".format(tTest2.SignificanceLevel)
# We can now print the test results:
print "Reject null hypothesis?", "yes" if tTest2.Reject() else "no"
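
For comparison, the t statistics can also be computed by hand from the textbook formulas. The following is a sketch in plain Python using only the standard library, and it is not necessarily how the library computes its results. The helper names (`mean`, `sample_sd`, `one_sample_t`) are illustrative. The paired statistic mirrors the SamplePairing.Paired argument passed in the sample code. P-values are omitted because they require the Student t distribution, which the standard library does not provide.

```python
import math

# Scores copied from the sample code.
group1 = [62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
          68, 82, 72, 71, 85, 66, 61, 79, 81, 73]
group2 = [61, 80, 98, 90, 94, 65, 79, 75, 74, 86,
          76, 85, 78, 72, 76, 79, 65, 92, 76, 80]
national_mean = 79.3


def mean(values):
    return sum(values) / float(len(values))


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def one_sample_t(values, hypothesized_mean):
    """t statistic with len(values) - 1 degrees of freedom."""
    standard_error = sample_sd(values) / math.sqrt(len(values))
    return (mean(values) - hypothesized_mean) / standard_error


# One-sample t-test: the standard deviation is estimated from the sample.
t_one = one_sample_t(group1, national_mean)

# Paired two-sample t-test: a one-sample t-test on the pairwise
# differences against a hypothesized mean difference of zero.
differences = [a - b for a, b in zip(group1, group2)]
t_paired = one_sample_t(differences, 0.0)

print("Sample standard deviation: {0:.4f}".format(sample_sd(group1)))
print("One-sample t statistic:    {0:.4f}".format(t_one))
print("Paired t statistic:        {0:.4f}".format(t_paired))
```

The one-sample t statistic is smaller in magnitude than the z statistic for the same data because the sample standard deviation (about 9.5) is larger than the national value of 7.3.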