Polynomial Regression in IronPython QuickStart Sample

Illustrates how to fit data to polynomials using the PolynomialRegressionModel class in IronPython.

This sample is also available in: C#, Visual Basic, F#.

Overview

This QuickStart sample demonstrates how to perform polynomial regression analysis using the PolynomialRegressionModel class in Numerics.NET.

The sample uses real-world calibration data from the National Institute of Standards and Technology’s Statistical Reference Datasets library. Specifically, it works with the ‘Pontius’ dataset containing load cell calibration measurements, where deflection is modeled as a function of applied load.

The sample shows how to:

Create and fit a polynomial regression model
Access and interpret the regression parameters and their statistical properties
Calculate confidence intervals for the parameters
Obtain model quality metrics like R-squared and F-statistics
Generate an ANOVA table for the regression analysis

The code includes detailed comments explaining each step and shows how to read the results statistically. The computed values may differ slightly from NIST’s certified results because of round-off error, but the sample still illustrates the practical application of polynomial regression to scientific data analysis.
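
To make concrete what a polynomial regression computes, here is a minimal sketch in plain Python that does not use the library. It fits a polynomial by forming the normal equations and solving them with Gaussian elimination. The name fit_polynomial is hypothetical and not part of Numerics.NET, and the algorithm the library actually uses is not stated in this sample. With predictor values as large as the loads in the Pontius data, normal equations can lose accuracy, so treat this as an illustration for small, well-scaled inputs only.

```python
def fit_polynomial(xs, ys, degree):
    """Return least-squares polynomial coefficients, lowest order first.

    Illustrative helper (hypothetical name, not a library function).
    """
    n = degree + 1
    # Normal equations (X^T X) b = X^T y, where X[i][j] = xs[i] ** j.
    a = [[float(sum(x ** (r + c) for x in xs)) for c in range(n)]
         for r in range(n)]
    b = [float(sum(y * x ** r for x, y in zip(xs, ys))) for r in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(a[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (b[row] - s) / a[row][row]
    return coeffs


# Usage: data generated exactly from y = 1 + 2x + 3x^2,
# so the result is approximately [1.0, 2.0, 3.0].
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, 2)
```

The coefficients come back in the same order as the parameters in the sample: constant term first, then the linear and quadratic terms.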

The code

import numerics

from Extreme.Mathematics import *
from Extreme.Statistics import *

# Illustrates the use of the PolynomialRegressionModel class
# to perform polynomial regression.

# Polynomial regression can be performed using 
# the PolynomialRegressionModel class.
#
# This QuickStart sample uses data from the National Institute
# of Standards and Technology's Statistical Reference Datasets
# library at http://www.itl.nist.gov/div898/strd/.

# Note that, due to round-off error, the results here will not be exactly
# the same as the NIST results, which were calculated using 500 digits
# of precision!

# We use the 'Pontius' dataset, which contains measurement data
# from the calibration of load cells. The independent variable is the load.
# The dependent variable is the deflection.
deflectionData = Vector([ \
    .11019, .21956, .32949, .43899, .54803, .65694, \
    .76562, .87487, .98292, 1.09146, 1.20001, 1.30822, \
    1.41599, 1.52399, 1.63194, 1.73947, 1.84646, 1.95392, \
    2.06128, 2.16844, .11052, .22018, .32939, .43886, \
    .54798, .65739, .76596, .87474, .98300, 1.09150, \
    1.20004, 1.30818, 1.41613, 1.52408, 1.63159, 1.73965, \
    1.84696, 1.95445, 2.06177, 2.16829 ])
loadData = Vector([ \
    150000, 300000, 450000, 600000, 750000, 900000, \
    1050000, 1200000, 1350000, 1500000, 1650000, 1800000, \
    1950000, 2100000, 2250000, 2400000, 2550000, 2700000, \
    2850000, 3000000, 150000, 300000, 450000, 600000, \
    750000, 900000, 1050000, 1200000, 1350000, 1500000, \
    1650000, 1800000, 1950000, 2100000, 2250000, 2400000, \
    2550000, 2700000, 2850000, 3000000 ])

deflection = NumericalVariable("deflection", deflectionData)
load = NumericalVariable("load", loadData)

# Now create the regression model. We supply the dependent and independent
# variable, and the degree of the polynomial:
model = PolynomialRegressionModel(deflection, load, 2)

# The Compute method performs the actual regression analysis.
model.Compute()

# The Parameters collection contains information about the regression 
# parameters.
print "Variable                  Value   Std.Error   t-stat  p-Value"
for parameter in model.Parameters:
    # Parameter objects have the following properties:
    print "{0:19}{1:12.4e}{2:12.2e}{3:9.2f}{4:9.4f}".format(
        # Name, usually the name of the variable:
        parameter.Name,
        # Estimated value of the parameter:
        parameter.Value,
        # Standard error:
        parameter.StandardError,
        # The value of the t statistic for the hypothesis that the parameter
        # is zero:
        parameter.Statistic,
        # Probability corresponding to the t statistic:
        parameter.PValue)
print 

# In addition to these properties, Parameter objects have a GetConfidenceInterval
# method that returns a confidence interval at a specified confidence level.
# Notice that individual parameters can be accessed using their numeric index.
# Parameter 0 is the intercept, if it was included.
confidenceInterval = model.Parameters[0].GetConfidenceInterval(0.95)
print "95% confidence interval for constant term: {0:.4e} - {1:.4e}".format(
    confidenceInterval.LowerBound, confidenceInterval.UpperBound)
print 
			
# There is also a wealth of information about the analysis available
# through various properties of the regression model object:
print "Residual standard error: {0:.3e}".format(model.StandardError)
print "R-Squared:               {0:.4f}".format(model.RSquared)
print "Adjusted R-Squared:      {0:.4f}".format(model.AdjustedRSquared)
print "F-statistic:             {0:.4f}".format(model.FStatistic)
print "Corresponding p-value:   {0:.5e}".format(model.PValue)
print 

# Much of this data can be summarized in the form of an ANOVA table:
print model.AnovaTable.ToString()
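
The summary statistics printed by the sample follow from the residuals by standard formulas. The sketch below, in plain Python with no library dependency, computes them from observed and fitted values. The function fit_summary and its dictionary keys are hypothetical names; their correspondence to model.RSquared, model.AdjustedRSquared, model.FStatistic and model.StandardError is assumed from the property names rather than confirmed here. The formulas assume a least-squares fit that includes an intercept.

```python
import math


def fit_summary(ys, fitted, num_predictors):
    """Goodness-of-fit statistics for a regression with an intercept.

    Illustrative helper (hypothetical name, not a library function).
    num_predictors excludes the intercept: 2 for a quadratic fit.
    """
    n = len(ys)
    mean_y = sum(ys) / float(n)
    # Total and residual sums of squares; the model sum of squares
    # is their difference.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_model = ss_total - ss_resid
    df_model = num_predictors
    df_resid = n - num_predictors - 1
    r2 = 1.0 - ss_resid / ss_total
    return {
        "r_squared": r2,
        "adjusted_r_squared": 1.0 - (1.0 - r2) * (n - 1) / float(df_resid),
        "f_statistic": (ss_model / df_model) / (ss_resid / df_resid),
        "residual_standard_error": math.sqrt(ss_resid / df_resid),
    }


# Usage with made-up observed and fitted values (not the Pontius data).
summary = fit_summary([1.0, 2.0, 3.0, 4.0, 5.0],
                      [1.1, 1.9, 3.0, 4.1, 4.9], 1)
```

The two degrees-of-freedom values and the sums of squares used here are the same quantities that an ANOVA table lays out row by row.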