Sparse Matrices
Sparse matrices are represented by the SparseMatrix<T> class that defines some additional methods and properties common to all sparse matrices. Currently, only matrices in Compressed Sparse Column (CSC) format are available.
How sparse matrices are stored
There are many representations for sparse matrices. Each has its advantages and disadvantages. Each has a set of applications for which it is particularly well suited. The only sparse storage format currently supported is Compressed Sparse Column (CSC) format. The elements are stored in column-major order. Only the indexes and values of the nonzero elements are stored. It consists of four arrays:
An array, values, containing the values of the nonzero matrix elements.
An array, rows, containing the row indices of the elements of values.
An array, pointerB, of length ColumnCount containing the index into values and rows where the ith column begins.
An array, pointerE, of length ColumnCount containing the index into values and rows where the ith column ends.
Sparse matrices in CSC format are represented by the SparseCompressedColumnMatrix<T> class. Other representations (compressed sparse row format, coordinate format, sparse diagonal, block sparse row format) may be supported in the future.
Constructing sparse matrices
The SparseCompressedColumnMatrix<T> class has no public constructors. Instead, use an overload of the CreateSparse method to construct a new sparse matrix. This method has four overloads.
The first overload has two arguments: the number of rows and the number of columns in the matrix. This creates a sparse matrix with the specified number of rows and columns. All elements are initially set to zero. The element type must be specified as a generic type argument.
The second overload adds a third argument: a real number between 0 and 1 that specifies the fill factor. This is the proportion of nonzero elements compared to the total number of elements in the matrix. The following creates a sparse matrix, and indicates that only 1 in 500 elements are nonzero:
The third overload is similar, but takes as its third argument the expected number of nonzero elements. This is useful in situations where this number is known. Note that this number is only the initial capacity. Should the number of nonzero elements grow, then more space will be allocated. However, this is a relatively expensive operation, so it is generally preferable to overestimate this number rather than risk a reallocation. The example below illustrates three ways of creating a sparse matrix with 30,000 rows and 40,000 columns:
var m1 = Matrix.CreateSparse<double>(30000, 40000);
var m2 = Matrix.CreateSparse<double>(30000, 40000, 0.002);
var m3 = Matrix.CreateSparse<double>(30000, 40000, 77777);
The three constructors so far construct a sparse matrix with no actual elements set. The next section outlines ways of adding the nonzero elements.
The fourth overload creates a sparse matrix and fills it with the specified elements. It takes 5 arguments in all. The first two are once again the number of rows and columns. The third and fourth arguments are integer arrays that specify the row and column indexes of the nonzero elements. The fifth argument is an array that specifies the corresponding values. All three arrays must have the same number of elements. They do not need to be sorted in any particular order. They may also contain duplicate row and column indexes. If the same row and column occur multiple times, the value of the element is the sum of the corresponding values:
var m1 = Matrix.CreateSparse<double>(30000, 40000);
var m2 = Matrix.CreateSparse<double>(30000, 40000, 0.002);
var m3 = Matrix.CreateSparse<double>(30000, 40000, 77777);
Sparse symmetric matrices can be created in a similar way, using the CreateSparseSymmetric method:
var m1 = Matrix.CreateSparse<double>(30000, 40000);
var m2 = Matrix.CreateSparse<double>(30000, 40000, 0.002);
var m3 = Matrix.CreateSparse<double>(30000, 40000, 77777);
Building sparse matrices
All but one overload in the previous section creates an empty sparse matrix. This section shows how to insert nonzero elements into the matrix. The SparseMatrix<T> class has a number of methods to simplify this process. All these methods start with the prefix Insert. The behavior complies with the Sparse BLAS specification.
The simplest method is InsertEntry, which inserts a single element into the matrix. It has three arguments. The first is the value of the element. The second and third arguments are the row and column. Any existing nonzero value is overwritten with the new value.
Up one level is the InsertEntries method. This method inserts multiple elements at once. It also has three arguments. The first is an array containing the values of the elements. The second and third arguments are integer arrays that specify the row and column where each value is to be inserted.
Any existing nonzero value is overwritten with the new value. This also means that when a row column pair occurs multiple times, the element's value equals the last value that was encountered.
The InsertRow method creates entries in a single row. The first argument is the row index. The second argument is an array containing the values. The third argument is an integer array containing the column indexes. To insert values in a column, there is the analogous InsertColumn method.
m1.InsertEntry(25.0, 200, 500);
m1.InsertEntries(values, rowIndexes, columnIndexes);
m1.InsertColumn(33, values, rowIndexes);
m1.InsertRow(55, values, columnIndexes);
Finally, the InsertClique method inserts a clique. A clique is a two-dimensional array of values, together with arrays of row and column indexes that specify where in the sparse matrix the entries will be inserted. The method has three arguments. The first is a matrix that contains the values. The second and third arguments are integer arrays that contain row and column indexes.
var clique = Matrix.Create(new[,] { { 11.0, 12.0 }, { 21.0, 22.0 } });
var cliqueRows = new int[] { 5, 8 };
var cliqueColumns = new int[] { 5, 8 };
m1.InsertClique(clique, cliqueRows, cliqueColumns);
Importing sparse matrices
Several common formats exist to exchange sparse matrices between applications. The Matrix Market format was inspired by the Matrix Market, an online repository of sparse test matrices and problems. You can import Matrix Market files into sparse matrices using the MatrixMarketFile class. This class has one static method, ReadMatrix, which has three overloads. Each of these takes one argument. The first takes a string that is the path to the file name containing the data. The second takes a System.IO.TextReader, and the third takes a System.IO.Stream.
Accessing Sparse Matrix Elements
The mechanisms for accessing elements available to dense and structured matrices are also available for sparse matrices. Note, however, that accessing individual elements is relatively slow, and should be avoided, if at all possible.
The most efficient way to access a SparseCompressedColumnMatrix<T>'s elements is column by column. The GetColumn method returns a sparse Vector<T> that can access the elements of the matrix efficiently.
It is possible to iterate over the nonzero elements of a sparse matrix efficiently. The NonzeroElements property returns a collection of RowColumnValueTriplet<T> values that can be used to iterate over the nonzero elements. The RowColumnValueTriplet<T> structure has three read-only fields: Row, Column, and Value that contain the row and column indexes, and the value of each successive nonzero, respectively.
The code below uses the NonzeroElements property to list the nonzero elements of the matrix m3:
foreach (var triplet in m3.NonzeroElements)
Console.WriteLine("Component ({0},{1}) = {2}",
triplet.Row, triplet.Column, triplet.Value);
There is also a GetNonzeroElements method that returns the nonzero elements in arrays. The method returns an array containing the values of the nonzero elements. The method takes two out arguments. Both are integer arrays that, on return, contain the corresponding row and column indexes of the nonzero elements.
Operations On Sparse Matrices
All operations that are available for dense and structured matrices are also available for sparse matrices. However, these operations may not always be efficient, and may even destroy sparsity and the advantages that come with it. Worse still, inappropriate use of sparse matrices may make code using sparse matrices perform orders of magnitude slower than their equivalent dense counterparts.
The key to working with sparse matrices is realizing that operations on individual elements are very expensive. Finding the value of an element usually entails some form of lookup.
Operations on individual elements
Even though operations on individual elements of a sparse matrix are discouraged, they may at times be necessary. To avoid looking up the storage location of the element twice, a number of methods have been defined that operate directly on an element:
Method | Description |
---|---|
Adds a value to an element. | |
Subtracts a value from an element. | |
Multiplies an element by a value. | |
Divides an element by a value. |
Matrix arithmetic
The most important question regarding sparse matrix arithmetic is whether the return value is sparse or dense. Element-wise operations, such as addition, subtraction, element-wise multiplication and division, maintain sparsity. Multiplication of two sparse matrices returns a sparse matrix. All other operations return a dense matrix.
The solution of a system of equations is always dense, even if both the matrix and the right-hand side are sparse.
Other operations such as the computation of eigenvalues and singular values currently use a dense algorithm.