Resampling
The frequency of a time series determines the time period covered by each observation. A new time series can be derived from an existing one so that each observation in the new series covers a longer time period. This transformation is called resampling. The observations are combined into groups that correspond to the longer time frame, and one value is calculated for each group to represent all of its observations. For example, the daily volume of a stock can be combined into a monthly volume by adding all the observations for each month.
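The grouping and aggregation steps can be illustrated independently of the library. The following is a minimal sketch in plain Python, not the library's API: the function name `resample_monthly_sum` and the sample volumes are made up for illustration.

```python
from collections import defaultdict
from datetime import date

def resample_monthly_sum(daily):
    """Sum (date, value) observations into one value per (year, month)."""
    totals = defaultdict(int)
    for day, value in daily:
        # Every observation in the same calendar month lands in one group.
        totals[(day.year, day.month)] += value
    return dict(sorted(totals.items()))

# Illustrative daily volumes (made-up numbers).
daily_volume = [
    (date(1996, 1, 30), 1200),
    (date(1996, 1, 31), 800),
    (date(1996, 2, 1), 500),
    (date(1996, 2, 2), 700),
]

monthly_volume = resample_monthly_sum(daily_volume)
```

Here the two January observations collapse into a single value for (1996, 1), and the two February observations into a single value for (1996, 2).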
Resampling was covered in the section on Grouping and Aggregation. This section illustrates some additional functionality specific to time-based datasets.
Creating time-based indexes
Resampling data frames
To transform the frequency of a DataFrame<R, C>, every variable in the data frame that you wish to transform must have its Aggregator property set. Most commonly, this is one of the static (Shared in Visual Basic) members of the Aggregators class.
Once these properties have been set, you can call the TransformFrequency(DateTimeScale) method. This method has three overloads. The first takes just one parameter: a DateTimeUnit value that specifies the frequency of the new time series collection. The example below transforms a daily series into a monthly series:
The second overload takes three arguments. The first and second parameters are DateTime values that specify the lower bound and the upper bound of the new time series. The new time series will only contain observations from the specified period. The third argument is once again a DateTimeUnit value that specifies the frequency of the new time series collection.
The third overload takes a total of seven parameters. As before, the first three specify the lower and upper bounds of the time period and the time unit of the new time series collection. The fourth and fifth parameters are BoundaryIntervalBehavior values that specify how the first and last boundary intervals are to be handled if the interval does not correspond to a complete time unit. The possible values are listed in the table below:
Value | Description |
---|---|
Exclude | The entire interval is excluded. |
Include | The interval is included. |
CompleteAndInclude | The interval is extended to a full larger interval and the extended interval is included. |
A value of Exclude means that a partial interval is to be excluded from the time series. There is no observation corresponding to the interval in the new time series. A value of Include means that a partial interval is included as it is. The aggregated value will cover only a partial time period. A value of CompleteAndInclude means that a partial interval is to be extended towards the past (if it is the first) or towards the future (if it is the last) so that it covers a complete time unit. The aggregated observations based on this entire interval are then included in the time series.
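The effect of the three behaviors can be sketched in plain Python. This is an illustration of the rules described above under simplifying assumptions (daily data, monthly target frequency, sum as the aggregate), not the library's implementation. The `Boundary` enum and the `monthly_sums` function are hypothetical names that only mirror the values in the table.

```python
import calendar
from datetime import date, timedelta
from enum import Enum

class Boundary(Enum):  # hypothetical stand-in for the three table values
    EXCLUDE = 1
    INCLUDE = 2
    COMPLETE_AND_INCLUDE = 3

def month_start(d):
    return d.replace(day=1)

def month_end(d):
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])

def monthly_sums(daily, start, end, first, last):
    """Sum daily values per month between start and end (inclusive)."""
    # The first interval is partial when the period starts mid-month.
    if start != month_start(start):
        if first is Boundary.EXCLUDE:
            start = month_end(start) + timedelta(days=1)  # skip to next month
        elif first is Boundary.COMPLETE_AND_INCLUDE:
            start = month_start(start)                    # extend into the past
    # The last interval is partial when the period ends mid-month.
    if end != month_end(end):
        if last is Boundary.EXCLUDE:
            end = month_start(end) - timedelta(days=1)    # stop at previous month
        elif last is Boundary.COMPLETE_AND_INCLUDE:
            end = month_end(end)                          # extend into the future
    totals = {}
    for day in sorted(daily):
        if start <= day <= end:
            key = (day.year, day.month)
            totals[key] = totals.get(key, 0) + daily[day]
    return totals

# One unit per day from 1 January to 31 March 1996 (a leap year: 91 days).
daily = {date(1996, 1, 1) + timedelta(days=i): 1 for i in range(91)}
lo, hi = date(1996, 1, 15), date(1996, 3, 10)

excluded = monthly_sums(daily, lo, hi, Boundary.EXCLUDE, Boundary.EXCLUDE)
included = monthly_sums(daily, lo, hi, Boundary.INCLUDE, Boundary.INCLUDE)
completed = monthly_sums(daily, lo, hi, Boundary.COMPLETE_AND_INCLUDE,
                         Boundary.COMPLETE_AND_INCLUDE)
```

With a requested period of 15 January to 10 March, `excluded` contains only February, `included` contains partial totals for January and March, and `completed` contains full-month totals for all three months.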
The last two parameters are an integer and a DateTimeUnit value that together specify an offset of the time periods within the larger time unit. For instance, a time series with a quarterly frequency may have its quarters start in the second month of each calendar quarter (February, May, August, November). In this case, the sixth parameter is set to 1, and the seventh is set to DateTimeUnit.Month.
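To make the offset concrete, the sketch below computes which quarter a date falls into when quarters are shifted by a whole number of months. It is written in plain Python as an illustration of the idea only. The names `quarter_start` and `quarterly_sums` are hypothetical, and the library may compute this differently.

```python
from datetime import date

def quarter_start(d, offset_months=0):
    """First day of the quarter containing d, with quarters shifted by offset_months."""
    index = d.year * 12 + (d.month - 1)       # months since year 0
    shifted = index - offset_months
    start_index = shifted - shifted % 3 + offset_months
    return date(start_index // 12, start_index % 12 + 1, 1)

def quarterly_sums(monthly, offset_months=0):
    """Sum {month_start_date: value} into quarters that honor the offset."""
    totals = {}
    for month, value in sorted(monthly.items()):
        key = quarter_start(month, offset_months)
        totals[key] = totals.get(key, 0) + value
    return totals

# Illustrative monthly values for January through June 1996: 1, 2, ..., 6.
monthly = {date(1996, m, 1): m for m in range(1, 7)}

calendar_quarters = quarterly_sums(monthly)                   # Jan, Apr, ...
shifted_quarters = quarterly_sums(monthly, offset_months=1)   # Feb, May, ...
```

With an offset of one month, January 1996 belongs to the quarter that started in November 1995, and February through April form the next quarter.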
The following example loads data from an Excel worksheet into a TimeSeriesCollection. The data contains price information for a stock (open, close, high, low, and volume). Appropriate aggregators are set and a new TimeSeriesCollection object is created that contains the monthly data for the year 1996:
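Independently of the library's API, the aggregation that this example describes can be sketched in plain Python. The sketch assumes the conventional choices for price data (first open, highest high, lowest low, last close, total volume). The `AGGREGATORS` mapping, the `resample_monthly` function, and the sample rows are all hypothetical and are not taken from the worksheet or the library.

```python
from datetime import date

# Assumed per-column aggregators; values arrive in date order.
AGGREGATORS = {
    "open": lambda values: values[0],    # first observation of the month
    "high": max,
    "low": min,
    "close": lambda values: values[-1],  # last observation of the month
    "volume": sum,
}

def resample_monthly(rows, start, end):
    """Aggregate daily rows between start and end (inclusive) into monthly rows."""
    groups = {}
    for row in sorted(rows, key=lambda r: r["date"]):
        if start <= row["date"] <= end:
            key = (row["date"].year, row["date"].month)
            groups.setdefault(key, []).append(row)
    return {
        month: {col: agg([r[col] for r in members])
                for col, agg in AGGREGATORS.items()}
        for month, members in groups.items()
    }

# Made-up sample rows; the first one falls outside 1996 and is dropped.
rows = [
    {"date": date(1995, 12, 29), "open": 9.0, "high": 9.5, "low": 8.8, "close": 9.2, "volume": 100},
    {"date": date(1996, 1, 2), "open": 10.0, "high": 10.6, "low": 9.9, "close": 10.5, "volume": 200},
    {"date": date(1996, 1, 3), "open": 10.5, "high": 11.2, "low": 10.1, "close": 11.0, "volume": 300},
    {"date": date(1996, 2, 1), "open": 11.0, "high": 11.1, "low": 10.2, "close": 10.4, "volume": 150},
]

monthly_1996 = resample_monthly(rows, date(1996, 1, 1), date(1996, 12, 31))
```

Each column uses its own aggregator, which corresponds to setting an aggregator on every variable before transforming the frequency.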