Simple Regression in IronPython QuickStart Sample
Illustrates how to perform a simple linear regression using the SimpleRegressionModel class in IronPython.
This sample is also available in: C#, Visual Basic, F#.
Overview
This QuickStart sample demonstrates how to perform simple linear regression analysis using the SimpleRegressionModel class in Numerics.NET.
The sample shows two different regression scenarios using data from the National Institute of Standards and Technology’s Statistical Reference Datasets (NIST StRD). The first example demonstrates a no-intercept model using the ‘NoInt1’ dataset, while the second uses the ‘Norris’ dataset with an intercept term.
For each model, the sample shows how to:
- Create and populate the dependent and independent variable data
- Construct and configure the SimpleRegressionModel
- Fit the model to the data
- Access key regression statistics including:
  - Parameter estimates and standard errors
  - Residual standard error
  - R-squared and adjusted R-squared values
  - F-statistic
  - ANOVA table results
The code illustrates different ways to supply the data, including Vector objects and NumericalVariable objects. It also demonstrates how to control model properties such as the intercept term.
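To make the statistics listed above more concrete, the no-intercept fit on the 'NoInt1' data can be reproduced with the closed-form least-squares formulas in plain Python. This is a minimal sketch, not part of the sample and not the library's implementation: the helper name `fit_through_origin` is hypothetical, and the R-squared here uses the uncentered total sum of squares, which is a common convention for models without an intercept. Whether SimpleRegressionModel reports R-squared the same way is an assumption.

```python
import math

def fit_through_origin(x, y):
    """Least-squares fit of y = b*x (no intercept). Hypothetical helper."""
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    syy = sum(yi * yi for yi in y)           # uncentered total sum of squares
    slope = sxy / sxx
    sse = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 1                               # one estimated parameter
    mse = sse / df
    return {
        "slope": slope,
        "slope_std_error": math.sqrt(mse / sxx),
        "residual_std_error": math.sqrt(mse),
        "r_squared": 1.0 - sse / syy,        # assumption: uncentered definition
        "f_statistic": (syy - sse) / mse,
    }

# 'NoInt1' data: x = 60..70, y = 130..140
x = [float(v) for v in range(60, 71)]
y = [float(v) for v in range(130, 141)]

stats = fit_through_origin(x, y)
for name in ["slope", "slope_std_error", "residual_std_error",
             "r_squared", "f_statistic"]:
    print("{0}: {1:.6f}".format(name, stats[name]))
```

Comparing these numbers with the output of the sample is a quick way to see which quantity each property reports.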
The code
import numerics
from Extreme.Mathematics import *
from Extreme.Statistics import *
# Illustrates the use of the SimpleRegressionModel class
# to perform simple linear regression, that is, linear regression
# with only one independent variable.
#
# This QuickStart sample uses data from the National Institute
# of Standards and Technology's Statistical Reference Datasets
# library at http://www.itl.nist.gov/div898/strd/.
# Note that, due to round-off error, the results here will not be exactly
# the same as the NIST results, which were calculated using 500 digits
# of precision!
# Model 1 uses the 'NoInt1' dataset. The model has no intercept.
# First, we construct vectors containing the data for
# the dependent and independent variables.
yData1 = Vector([ 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140 ])
xData1 = Vector([ 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70 ])
# Next, we create the regression model. We can pass the vectors directly.
model1 = SimpleRegressionModel(yData1, xData1)
model1.NoIntercept = True
model1.Compute()
for parameter in model1.Parameters:
    print parameter.ToString()
print "Residual standard error: {0:.2f}".format(model1.StandardError)
print "R-Squared: {0:.3f}".format(model1.RSquared)
print "Adjusted R-Squared: {0:.3f}".format(model1.AdjustedRSquared)
print "F-statistic: {0:.3f}".format(model1.FStatistic)
print model1.AnovaTable.ToString()
# Model 2 uses the 'Norris' dataset.
print "\n\nModel 2"
yData2 = Vector([ 0.1, 338.8, 118.1, 888.0, 9.2, 228.1, 668.5, 998.5,
    449.1, 778.9, 559.2, 0.3, 0.1, 778.1, 668.8, 339.3, 448.9,
    10.8, 557.7, 228.3, 998.0, 888.8, 119.6, 0.3, 0.6, 557.6,
    339.3, 888.0, 998.5, 778.9, 10.2, 117.6, 228.9, 668.4,
    449.2, 0.2 ])
dependent2 = NumericalVariable("y", yData2)
xData2 = Vector([ 0.2, 337.4, 118.2, 884.6, 10.1, 226.5, 666.3, 996.3,
    448.6, 777.0, 558.2, 0.4, 0.6, 775.5, 666.9, 338.0, 447.5,
    11.6, 556.0, 228.1, 995.8, 887.6, 120.2, 0.3, 0.3, 556.8,
    339.1, 887.2, 999.0, 779.0, 11.1, 118.3, 229.2, 669.1,
    448.9, 0.5 ])
independent2 = NumericalVariable("x", xData2)
# Next, we create the regression model, using the NumericalVariable objects
# we just created:
model2 = SimpleRegressionModel(dependent2, independent2)
model2.Compute()
for parameter in model2.Parameters:
    print parameter.ToString()
print "Residual standard error: {0:.8f}".format(model2.StandardError)
print "R-Squared: {0:.8f}".format(model2.RSquared)
print "Adjusted R-Squared: {0:.8f}".format(model2.AdjustedRSquared)
print "F-statistic: {0:.3f}".format(model2.FStatistic)
print model2.AnovaTable.ToString()
# The data can also be supplied as two float arrays.
# This is not illustrated here.
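Because the sample notes that its results will differ slightly from the NIST values due to round-off, it can help to cross-check the Model 2 output against an independent calculation. The following is a minimal sketch in plain Python that applies the closed-form least-squares formulas for a model with an intercept to the same 'Norris' data. It does not use the library, the helper name `fit_with_intercept` is hypothetical, and the formulas are the textbook ones rather than whatever algorithm SimpleRegressionModel uses internally.

```python
import math

# 'Norris' data, copied from the sample above.
Y = [0.1, 338.8, 118.1, 888.0, 9.2, 228.1, 668.5, 998.5,
     449.1, 778.9, 559.2, 0.3, 0.1, 778.1, 668.8, 339.3, 448.9,
     10.8, 557.7, 228.3, 998.0, 888.8, 119.6, 0.3, 0.6, 557.6,
     339.3, 888.0, 998.5, 778.9, 10.2, 117.6, 228.9, 668.4,
     449.2, 0.2]
X = [0.2, 337.4, 118.2, 884.6, 10.1, 226.5, 666.3, 996.3,
     448.6, 777.0, 558.2, 0.4, 0.6, 775.5, 666.9, 338.0, 447.5,
     11.6, 556.0, 228.1, 995.8, 887.6, 120.2, 0.3, 0.3, 556.8,
     339.1, 887.2, 999.0, 779.0, 11.1, 118.3, 229.2, 669.1,
     448.9, 0.5]

def fit_with_intercept(x, y):
    """Closed-form least-squares fit of y = a + b*x. Hypothetical helper."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Centered sums of squares and cross products
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    syy = sum((yi - y_mean) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r * r for r in residuals)
    df = n - 2                               # two estimated parameters
    mse = sse / df
    r_squared = 1.0 - sse / syy
    return {
        "intercept": intercept,
        "slope": slope,
        "residual_std_error": math.sqrt(mse),
        "r_squared": r_squared,
        "adjusted_r_squared": 1.0 - (1.0 - r_squared) * (n - 1) / df,
        "f_statistic": (syy - sse) / mse,
        "residuals": residuals,
    }

fit = fit_with_intercept(X, Y)
for name in ["intercept", "slope", "residual_std_error",
             "r_squared", "adjusted_r_squared", "f_statistic"]:
    print("{0}: {1:.8f}".format(name, fit[name]))
```

Centering the data before forming the sums keeps the intermediate values small, which tends to reduce round-off compared with the raw-sum form of the same formulas.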