Aggregators

Aggregation is the process of reducing a series of values to a single value that represents some property of the series. An aggregate function or aggregator is a function that performs this operation. Examples of aggregators include the count, the largest value, and the mean. This section describes the predefined aggregators and a simple mechanism to create your own.

Aggregator groups and aggregators

A property like the number of elements, or the number of elements that are not missing, is meaningful for all series regardless of their element type. The result is always an integer. A property like the first element is also universally relevant, but here the type of the result is the same as the element type. These two examples illustrate the two main kinds of aggregators: those where the result is of a specific type, and those where the result type is the same as the element type.
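The distinction can be sketched in plain C# without the library. The two helper functions below are hypothetical and are not members of the Aggregators class; missing values are modeled as nulls purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helpers for illustration only; not part of the Aggregators class.

// Kind 1: the result type is fixed (int), whatever the element type T is.
int CountNonMissing<T>(IEnumerable<T?> series) where T : struct
    => series.Count(v => v.HasValue);

// Kind 2: the result type is the element type T itself.
T FirstNonMissing<T>(IEnumerable<T?> series) where T : struct
    => series.First(v => v.HasValue).GetValueOrDefault();

double?[] prices = { null, 2.5, 4.0 };
int?[] counts = { 7, null, 9 };

int priceCount = CountNonMissing<double>(prices);    // 2, an int for a double series
int countCount = CountNonMissing<int>(counts);       // 2, an int for an int series
double firstPrice = FirstNonMissing<double>(prices); // 2.5, a double
int firstCount = FirstNonMissing<int>(counts);       // 7, an int
```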

The most common aggregators are defined as static properties or methods of the Aggregators class, and are listed below.

Member	Description
Count	Computes the number of non-missing values.
Max	Returns the largest value.
Min	Returns the smallest value.
First	Returns the first non-missing value.
Last	Returns the last non-missing value.
Skip(Int32)	Returns the first non-missing value after skipping the specified number of non-missing values.
Mean	Returns the mean of the non-missing values.
Sum	Returns the sum of the non-missing values.
Variance	Returns the sample variance of the non-missing values.
StandardDeviation	Returns the sample standard deviation of the non-missing values.
Skewness	Returns the skewness of the non-missing values.
Kurtosis	Returns the kurtosis of the non-missing values.
Median	Returns the median of the non-missing values.
FirstQuartile	Returns the 25% quantile of the non-missing values.
ThirdQuartile	Returns the 75% quantile of the non-missing values.
Quantile(Double)	Returns the quantile at the specified probability of the non-missing values.
ExactSum	Returns the sum of the non-missing values preserving the element type.

The above properties and methods return an aggregator group. An aggregator group represents an aggregation operation in a way that is independent of the element type of the series to which it is applied. When the aggregator group is applied to a series, an aggregator function for the element type of the series is selected to perform the actual operation.

AggregatorGroup is an abstract class that represents an aggregator group. It has two descendant classes corresponding to the two kinds of aggregators mentioned above: AggregatorGroup<TResult> represents an aggregator group where the generic type argument is the return type, and TypePreservingAggregatorGroup represents an aggregator group where the return type is the same as the element type.

By default, aggregators that compute numerical descriptive statistics, like Mean and Variance, return a Double value regardless of the element type of the input. In fact, the input is first converted to Double and the aggregation is computed for the converted values.

This is useful in many situations, for example, when computing descriptive statistics for sets of integers. In other situations, we may want a different intermediate type. For example, converting decimals to Double in general results in a small round-off error. We may want to maintain the full accuracy and precision of decimal values.
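The round-off in question is easy to reproduce with plain .NET arithmetic. The sketch below does not use the library; it only contrasts summing in decimal with summing after a conversion to double.

```csharp
using System;
using System.Linq;

// Ten copies of 0.1, which decimal represents exactly and double does not.
decimal[] amounts = Enumerable.Repeat(0.1m, 10).ToArray();

decimal exactSum = 0m;
double convertedSum = 0.0;
foreach (decimal a in amounts)
{
    exactSum += a;             // accumulated in decimal
    convertedSum += (double)a; // converted to double first, then accumulated
}

Console.WriteLine(exactSum == 1.0m);    // True
Console.WriteLine(convertedSum == 1.0); // False: the double sum falls just short of 1
```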

There are two ways to avoid the conversion to Double. First, some aggregators, like Sum, have a variant prefixed with Exact. These aggregators produce a result of the same type as the input.

Second, aggregator groups and aggregators have an As<TNewResult>() method that returns an aggregator that uses a different type (supplied as the type argument to the method) for the intermediate values and the result.

For example, to compute the mean of a set of integers to quadruple precision, you can use:

C#

int[] n = { 1, 4, 9 };
var avg = Aggregators.Mean.As<Quad>().Aggregate(n);
// avg = 4.666666666666666666666666666666667

Visual Basic

Dim n As Integer() = { 1, 4, 9 }
Dim avg = Aggregators.Mean.As(Of Quad).Aggregate(n)
' avg = 4.666666666666666666666666666666667

F#

let n = [| 1; 4; 9 |]
let avg = Aggregators.Mean.As<Quad>().Aggregate(n)
// avg = 4.666666666666666666666666666666667

Aggregating over multiple series

Some aggregations, such as the correlation between two variables, involve more than one series. The Aggregators class defines a number of aggregators that operate on more than one series:

Member	Description
Correlation	Computes the correlation between two sets of values.
Covariance	Computes the covariance between two sets of values.
DotProduct	Computes the dot product between two sets of values.
WeightedMean	Computes the mean of one set of values weighted by another.
WeightedStandardDeviation	Computes the standard deviation of one set of values weighted by another.
WeightedVariance	Computes the variance of one set of values weighted by another.
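To make three of the entries in this table concrete, the sketch below writes out the textbook definitions of the dot product, the weighted mean, and the Pearson correlation in plain C#. The helper functions are hypothetical and hand-written; they do not show how the library computes these values, and conventions such as sample versus population normalization for the other members are not covered here.

```csharp
using System;
using System.Linq;

// Textbook definitions written out by hand (not the library's implementation).
double Dot(double[] a, double[] b)
    => a.Zip(b, (p, q) => p * q).Sum();

double WeightedMean(double[] a, double[] weights)
    => Dot(a, weights) / weights.Sum();

// Pearson correlation: co-deviation divided by the geometric mean of the squared deviations.
double Correlation(double[] a, double[] b)
{
    double meanA = a.Average(), meanB = b.Average();
    double sab = a.Zip(b, (p, q) => (p - meanA) * (q - meanB)).Sum();
    double saa = a.Sum(p => (p - meanA) * (p - meanA));
    double sbb = b.Sum(q => (q - meanB) * (q - meanB));
    return sab / Math.Sqrt(saa * sbb);
}

double[] first = { 1, 2, 3 };
double[] second = { 4, 5, 6 };
double[] weightValues = { 1, 1, 2 };

double dot = Dot(first, second);                 // 1*4 + 2*5 + 3*6 = 32
double wmean = WeightedMean(first, weightValues); // (1 + 2 + 6) / 4 = 2.25
double corr = Correlation(first, second);        // second = first + 3, so the correlation is 1
```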

User-defined aggregators

The simplest way to define an aggregator that is not available from the Aggregators class is to use the Aggregators.Create method. This method takes a delegate that maps a vector to the aggregated value. The following creates an aggregator that computes the geometric mean:

C#

var agg = Aggregators.Create(
    (Vector<double> x) => Stats.GeometricMean(x));

Visual Basic

Dim agg = Aggregators.Create(
    Function(x As Vector(Of Double)) Stats.GeometricMean(x))

F#

let agg = Aggregators.Create(fun (x : Vector<float>) -> Stats.GeometricMean(x))
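As a rough picture of the kind of delegate involved, the plain C# sketch below models an aggregator as a function from a series to a single value and computes the geometric mean by hand as the exponential of the mean of the logarithms. The Func<double[], double> type is a stand-in chosen for illustration; it is not the delegate type that Aggregators.Create expects, and the sketch assumes strictly positive inputs.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in: an aggregator modeled as a delegate from a series to one value.
// Assumes all inputs are strictly positive, so that the logarithm is defined.
Func<double[], double> geometricMean =
    xs => Math.Exp(xs.Average(v => Math.Log(v)));

double g = geometricMean(new[] { 1.0, 4.0, 16.0 });
// g is approximately 4, the cube root of 1 * 4 * 16 = 64
Console.WriteLine(g);
```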