Simple Time Series in IronPython QuickStart Sample
Illustrates how to perform simple operations on time series data using classes in the Numerics.NET.Statistics.TimeSeriesAnalysis namespace in IronPython.
This sample is also available in: C#, Visual Basic, F#.
Overview
This QuickStart sample demonstrates how to work with time series data using Numerics.NET. It shows basic operations for loading, analyzing and transforming financial market data.
The sample loads historical stock price data from a CSV file into a time series data frame. It demonstrates several key operations:
- Loading time series data from CSV files
- Accessing individual variables (columns) like Open, High, Low, Close prices and Volume
- Calculating basic statistics like mean prices
- Selecting data for specific time ranges
- Resampling time series to different frequencies (daily to monthly)
- Using different aggregation functions for each variable when resampling:
  - First price of period for Open
  - Last price of period for Close
  - Maximum price for High
  - Minimum price for Low
  - Sum for Volume
The code provides a practical example of handling financial market data, but the techniques shown can be applied to any time-stamped data series.
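To make the per-variable aggregation rules concrete, here is a minimal library-independent sketch in plain Python. It is not the Numerics.NET API: the function name `resample_monthly`, the `AGGREGATORS` table, the half-open date range (start included, end excluded) and the sample rows are all assumptions made for illustration, and the prices are made up rather than real market data.

```python
from collections import OrderedDict
from datetime import date

# Made-up daily rows (NOT real market data):
# (date, open, high, low, close, volume)
daily = [
    (date(2004, 1, 2), 10.0, 12.0, 9.0, 11.0, 100),
    (date(2004, 1, 5), 11.0, 13.0, 10.5, 12.5, 150),
    (date(2004, 2, 2), 12.0, 12.5, 11.0, 11.5, 200),
    (date(2005, 1, 3), 20.0, 21.0, 19.0, 20.5, 300),  # outside the 2004 range
]

COLUMNS = ["Open", "High", "Low", "Close", "Volume"]

# One aggregation rule per variable, mirroring the list above.
AGGREGATORS = {
    "Open": lambda values: values[0],    # first price of the period
    "High": max,                         # maximum price
    "Low": min,                          # minimum price
    "Close": lambda values: values[-1],  # last price of the period
    "Volume": sum,                       # total volume
}

def resample_monthly(rows, start, end):
    """Aggregate rows with start <= date < end into one summary per month."""
    groups = OrderedDict()
    # Sort by date so that "first" and "last" are well defined.
    for row in sorted(rows, key=lambda r: r[0]):
        day = row[0]
        if start <= day < end:
            groups.setdefault((day.year, day.month), []).append(row)
    result = []
    for (year, month), members in groups.items():
        summary = {"Month": date(year, month, 1)}
        # Column 0 is the date, so the value columns start at index 1.
        for index, name in enumerate(COLUMNS, 1):
            summary[name] = AGGREGATORS[name]([m[index] for m in members])
        result.append(summary)
    return result

monthly = resample_monthly(daily, date(2004, 1, 1), date(2005, 1, 1))
for m in monthly:
    print("{0:%b} {1:.2f} {2:.2f} {3:.2f} {4:.2f} {5:10d}".format(
        m["Month"], m["Open"], m["Close"], m["High"], m["Low"], m["Volume"]))
```

Keeping the rules in a table keyed by variable name means a different rule for one variable (for example a mean instead of a sum) is a one-line change, which is the same idea as assigning an aggregator to each variable before resampling.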
The code
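import numerics
import clr
# The ADO.NET classes used below live in the System.Data assembly.
clr.AddReference("System.Data")
from System import DateTime
from System.Data import DataSet
from System.Data.OleDb import OleDbConnection, OleDbDataAdapter, OleDbException
from Extreme.Data.Text import DelimitedTextFile
from Extreme.Statistics import *
from Extreme.Statistics.TimeSeriesAnalysis import *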
# Illustrates the use of the TimeSeriesCollection class to represent
# and manipulate time series data.
# Time series collections can be created in a variety of ways.
# Here we use an ADO.NET data table:
def LoadTimeSeriesData():
    filename = r"..\Data\MicrosoftStock.xls"
    connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filename + ";Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'"
    cnn = None
    ds = DataSet()
    try:
        cnn = OleDbConnection(connectionString)
        cnn.Open()
        adapter = OleDbDataAdapter("Select * from [MicrosoftStock$]", cnn)
        adapter.Fill(ds)
    except OleDbException as ex:
        print ex.InnerException
    finally:
        # Close the connection even if the query failed.
        if cnn is not None:
            cnn.Close()
    return ds.Tables[0]
seriesTable = LoadTimeSeriesData()
timeSeries = TimeSeriesCollection(seriesTable)
# The Count property of the Observations collection
# returns the number of observations:
print "# observations:", timeSeries.Observations.Count
# The StartOfPeriodVariable property returns the
# DateTimeVariable that contains the start times
# for each period.
print "First date:", timeSeries.StartOfPeriodVariable.Minimum
# The EndOfPeriodVariable property returns the
# DateTimeVariable that contains the end times
# for each period.
print "Last date:", timeSeries.EndOfPeriodVariable.Maximum
# Data in a TimeSeriesCollection is always sorted
# in ascending time order.
#
# Accessing variables
#
# Variables are accessed by name or numeric index.
# IronPython is dynamically typed, so no explicit cast to
# the specialized type (NumericalVariable, DateTimeVariable,
# etc.) is needed before using its members.
close = timeSeries["Close"]
print "Average close price: ${0:F2}".format(close.Mean)
# Variables can also be accessed by numeric index:
print "3rd variable:", timeSeries[2].Name
# The CreateSubset method returns the data from the specified range.
y2004 = DateTime(2004, 1, 1)
y2005 = DateTime(2005, 1, 1)
series2004 = timeSeries.CreateSubset(y2004, y2005)
print "Opening price on the first trading day of 2004:", series2004["Open"].GetValue(0)
#
# Transforming the Frequency
#
# The first step is to define the aggregator function
# for each variable. This function specifies how each
# observation in the new time series is calculated
# from the observations in the original series.
# The Aggregator class has a number of
# pre-defined aggregator functions:
timeSeries["Open"].Aggregator = Aggregator.First
timeSeries["Close"].Aggregator = Aggregator.Last
timeSeries["High"].Aggregator = Aggregator.Maximum
timeSeries["Low"].Aggregator = Aggregator.Minimum
timeSeries["Volume"].Aggregator = Aggregator.Sum
# We can specify a subset of the series by providing
# the start and end dates.
# The TransformFrequency method returns a new series
# containing the aggregated data:
monthlySeries = timeSeries.TransformFrequency(y2004, y2005, DateTimeUnit.Month)
# We can now print the results:
print "Monthly statistics for Microsoft Corp. (MSFT)"
print "Month Open Close High Low Volume"
for row in range(monthlySeries.Observations.Count):
    print " {0:MMM} {1:.2f} {2:.2f} {3:.2f} {4:.2f} {5:10}".format(
        monthlySeries.StartOfPeriodVariable[row],
        monthlySeries["Open"].GetValue(row),
        monthlySeries["Close"].GetValue(row),
        monthlySeries["High"].GetValue(row),
        monthlySeries["Low"].GetValue(row),
        monthlySeries["Volume"].GetValue(row))