Independent Component Analysis
Independent Component Analysis (ICA) is a computational technique for separating a multivariate signal into additive subcomponents that are statistically independent and non-Gaussian. Unlike Principal Component Analysis (PCA), which finds uncorrelated components that maximize variance, ICA finds components that are statistically independent, making it particularly useful for blind source separation problems such as separating mixed audio signals or extracting features from multi-sensor data.
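The description above is usually written as a linear mixing model. The formulation below is the standard textbook one rather than anything specific to this library; the symbols x, s, A, W, P, and D are notation introduced here for illustration, and the data are assumed to be centered.

```latex
\[
  \mathbf{x} = A\,\mathbf{s},
  \qquad
  \hat{\mathbf{s}} = W\,\mathbf{x} = P\,D\,\mathbf{s}
\]
```

Here x is one observation of the mixed signals, s holds the unknown independent sources, A is the mixing matrix, and W is the estimated unmixing matrix. P is a permutation matrix and D an invertible diagonal matrix: any reordering or rescaling of the sources can be absorbed into A, so the sources can be recovered only up to order, sign, and scale.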
Defining ICA models
All classes related to Independent Component Analysis reside in the Numerics.NET.Statistics.Multivariate namespace. The main type is FastIca, which represents a FastICA analysis.
The FastIca class has multiple constructors. The first constructor takes one argument: a Matrix<T> whose columns contain the mixed signals to be separated. The second constructor also takes one argument: an array of Vector<T> objects, each of which holds one mixed signal.
var matrix = Matrix.CreateRandom(100, 10);
var ica1 = new FastIca(matrix);
var vectors = matrix.Columns.ToArray();
var ica2 = new FastIca(vectors);
The third constructor takes two arguments. The first is an IDataFrame (a DataFrame<R, C> or Matrix<T>) that contains the variables that may be used in the analysis. The second is an array of strings that names the variables from that collection to include in the analysis. The example below creates a data frame from a matrix and then constructs a FastIca object from a subset of its columns:
var rowIndex = Index.Default(matrix.RowCount);
var allNames = new string[] { "x1", "x2", "x3",
    "x4", "x5", "x6", "x7", "x8", "x9", "x10" };
var columnIndex = Index.Create(allNames);
var dataFrame = matrix.ToDataFrame(rowIndex, columnIndex);
var names = new string[] { "x1", "x2", "x3", "x8", "x9", "x10" };
var ica3 = new FastIca(dataFrame, names);
Configuring the analysis
Before running the analysis, several properties can be configured to control the behavior of the FastICA algorithm. The most important property is NumberOfComponents, which specifies how many independent components to extract. If not set or set to -1, all possible components are extracted (up to the minimum of the number of samples and features).
The ContrastFunction property determines which nonlinearity function is used in the fixed-point iteration. This value is of type FastIcaContrastFunction which can take on the following values:
| Value | Description |
|---|---|
| LogCosh | Uses the hyperbolic tangent function. This is the default and works well for most types of source distributions, both sub-Gaussian and super-Gaussian. It is robust to outliers. |
| Exponential | Uses an exponential function. Optimized for super-Gaussian (heavy-tailed) distributions but more sensitive to outliers. |
| Cubic | Uses a cubic polynomial. Simple and fast, works well for sub-Gaussian (light-tailed) distributions but is less robust to outliers. |
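For reference, the three options correspond to the contrast functions G and nonlinearities g = G' commonly given in the FastICA literature. The formulas below are the textbook forms with the scaling constant set to 1; whether the library uses exactly these constants is an assumption.

```latex
\[
\begin{array}{llll}
\mathrm{LogCosh:}     & G(u) = \log\cosh u,       & g(u) = \tanh u,        & g'(u) = 1 - \tanh^2 u \\[4pt]
\mathrm{Exponential:} & G(u) = -e^{-u^2/2},       & g(u) = u\,e^{-u^2/2},  & g'(u) = (1 - u^2)\,e^{-u^2/2} \\[4pt]
\mathrm{Cubic:}       & G(u) = \frac{1}{4}\,u^4,  & g(u) = u^3,            & g'(u) = 3u^2
\end{array}
\]
```

The hyperbolic tangent is bounded, which limits the influence of any single large value, whereas the cubic nonlinearity grows without bound.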
The Method property controls whether components are extracted in parallel (all at once) or sequentially using deflation. The parallel method is generally faster and is the default. The Whiten property controls the whitening (pre-processing) strategy. The default is UnitVariance, which ensures the extracted sources have unit variance.
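The difference between the two methods is easiest to see in the update rule. What follows is the textbook FastICA iteration on whitened data z (so that E{zzᵀ} = I), given as a sketch; the library's actual implementation, initialization, and stopping rule are not described here and may differ. The symbols w, W, z, g, and g' are notation introduced for illustration, with g the nonlinearity selected by ContrastFunction and g' its derivative.

```latex
\begin{align*}
\mathbf{w}^{+} &= \mathrm{E}\{\mathbf{z}\, g(\mathbf{w}^{\top}\mathbf{z})\}
                 - \mathrm{E}\{g'(\mathbf{w}^{\top}\mathbf{z})\}\,\mathbf{w}
  && \text{(fixed-point update of one weight vector)} \\
\mathbf{w}_p &\leftarrow \mathbf{w}_p^{+} - \sum_{j<p} \bigl(\mathbf{w}_p^{+\top}\mathbf{w}_j\bigr)\,\mathbf{w}_j,
  \qquad \mathbf{w}_p \leftarrow \frac{\mathbf{w}_p}{\lVert \mathbf{w}_p \rVert}
  && \text{(deflation: one component at a time)} \\
W &\leftarrow \bigl(W W^{\top}\bigr)^{-1/2}\, W
  && \text{(parallel: symmetric decorrelation of all rows)}
\end{align*}
```

In the deflation scheme each new vector is orthogonalized against the components already found, so estimation errors in early components carry over to later ones. The parallel scheme updates every row of W with the fixed-point rule and then decorrelates them jointly, which treats all components equally.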
The Fit method performs the actual calculations. The code below sets the number of components and the contrast function for the FastIca object created earlier (ica3) and runs the analysis:
ica3.NumberOfComponents = 3;
ica3.ContrastFunction = FastIcaContrastFunction.LogCosh;
ica3.Fit();
Results of the Analysis
Once the computations are complete, a number of properties and methods give access to the results in detail. The Components property (also accessible as UnmixingMatrix) returns the unmixing matrix whose rows are the weight vectors for extracting the independent components. The MixingMatrix property returns the mixing matrix that represents how the independent sources are combined to produce the observed signals.
To extract the independent sources from the training data, use the Transform(Matrix<UTP>) method. This applies the unmixing transformation to the input data to produce the estimated sources. The code below illustrates extracting sources and examining the mixing matrix:
var sources = ica3.Transform(matrix);
Console.WriteLine("Extracted {0} independent components", sources.ColumnCount);
var mixingMatrix = ica3.MixingMatrix;
Console.WriteLine("Mixing matrix:");
for (int i = 0; i < mixingMatrix.RowCount; i++)
{
    Console.Write(" ");
    for (int j = 0; j < mixingMatrix.ColumnCount; j++)
        Console.Write("{0,8:F4}", mixingMatrix[i, j]);
    Console.WriteLine();
}
The InverseTransform(Matrix<UTP>) method reconstructs the mixed signals from the estimated sources. This is useful for verifying the quality of the separation or for reconstructing signals using only a subset of the components. When fewer components are extracted than there are input variables, the reconstruction is generally only approximate. The code below reconstructs the mixed signals from the sources extracted above and prints the reconstruction error:
var reconstructed = ica3.InverseTransform(sources);
Console.WriteLine("Reconstruction error: {0:E4}",
    Matrix.Subtract(reconstructed, matrix).FrobeniusNorm());
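An absolute Frobenius norm is hard to interpret without knowing the scale of the data, so a relative error is often more informative. The helper below is a minimal sketch that works on plain double[,] arrays and does not use the library; ReconstructionCheck and RelativeError are hypothetical names introduced for illustration, and copying the contents of a Matrix<T> into an array is left out.

```csharp
using System;

// Hypothetical helper, not part of Numerics.NET.
public static class ReconstructionCheck
{
    // Returns ||reconstructed - original||_F / ||original||_F.
    // The result is NaN or infinity if 'original' contains only zeros.
    public static double RelativeError(double[,] original, double[,] reconstructed)
    {
        int rows = original.GetLength(0);
        int cols = original.GetLength(1);
        if (reconstructed.GetLength(0) != rows || reconstructed.GetLength(1) != cols)
            throw new ArgumentException("Both arrays must have the same shape.");

        double diffSquared = 0.0;   // sum of squared differences
        double origSquared = 0.0;   // sum of squared original entries
        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {
                double d = reconstructed[i, j] - original[i, j];
                diffSquared += d * d;
                origSquared += original[i, j] * original[i, j];
            }
        }
        return Math.Sqrt(diffSquared) / Math.Sqrt(origSquared);
    }
}
```

A value close to zero means the retained components explain almost all of the observed signal; a value near one means the reconstruction captures very little of it.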
Example: Separating Mixed Signals
A typical application of ICA is the cocktail party problem: separating multiple audio sources that have been mixed together. In this example, we simulate three source signals (a sinusoid, a square wave, and a sawtooth wave), mix them using a random mixing matrix, and then use FastICA to recover the original sources. ICA can recover sources only up to their order, sign, and scale, so the separated signals may appear permuted, flipped, or rescaled relative to the originals.
// Generate three source signals: sinusoid, square wave, sawtooth
int nSamples = 1000;
var sourceMatrix = Matrix.Create<double>(nSamples, 3);
for (int i = 0; i < nSamples; i++)
{
    double t = (double)i / nSamples;
    // Sinusoid: 5 cycles over the sampled interval
    sourceMatrix[i, 0] = Math.Sin(2 * Math.PI * 5 * t);
    // Square wave: 3 cycles, values in {-1, 0, 1}
    sourceMatrix[i, 1] = Math.Sign(Math.Sin(2 * Math.PI * 3 * t));
    // Sawtooth: 7 cycles, values in [-1, 1)
    sourceMatrix[i, 2] = 2 * (t * 7 - Math.Floor(t * 7 + 0.5));
}
// Mix the signals with a random mixing matrix
var mixingMatrixTrue = Matrix.CreateRandom(3, 3);
var mixedSignals = sourceMatrix * mixingMatrixTrue.Transpose();
// Apply FastICA to separate the mixed signals
var ica = new FastIca(mixedSignals);
ica.NumberOfComponents = 3;
ica.Fit();
var separatedSources = ica.Transform(mixedSignals);
Console.WriteLine("Successfully separated {0} signals from {1} mixed observations",
    separatedSources.ColumnCount, mixedSignals.ColumnCount);
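Because ICA recovers sources only up to order, sign, and scale, comparing columns of the result directly with the original sources is not meaningful. One common check is to pair each true source with the estimated source that has the highest absolute correlation. The sketch below does this on plain double[] arrays without using the library; SourceMatching, AbsCorrelation, and BestMatches are hypothetical names introduced for illustration, and extracting the columns of a Matrix<T> into arrays is left out.

```csharp
using System;

// Hypothetical helper, not part of Numerics.NET.
public static class SourceMatching
{
    // Absolute Pearson correlation between two equally long signals.
    // The absolute value makes the measure insensitive to sign flips,
    // and correlation is already insensitive to scale.
    // The result is NaN if either signal is constant.
    public static double AbsCorrelation(double[] a, double[] b)
    {
        if (a.Length != b.Length || a.Length == 0)
            throw new ArgumentException("Signals must be non-empty and of equal length.");

        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= a.Length;
        meanB /= b.Length;

        double cov = 0.0, varA = 0.0, varB = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return Math.Abs(cov) / Math.Sqrt(varA * varB);
    }

    // For each true source, returns the index of the estimated source
    // with the highest absolute correlation. This is a greedy match:
    // it does not guarantee that every estimate is used exactly once.
    public static int[] BestMatches(double[][] trueSources, double[][] estimated)
    {
        var matches = new int[trueSources.Length];
        for (int s = 0; s < trueSources.Length; s++)
        {
            double best = -1.0;
            for (int e = 0; e < estimated.Length; e++)
            {
                double c = AbsCorrelation(trueSources[s], estimated[e]);
                if (c > best)
                {
                    best = c;
                    matches[s] = e;
                }
            }
        }
        return matches;
    }
}
```

If the separation worked, each true source has one estimate with an absolute correlation close to 1, and the matched indices form a permutation of the component indices.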