Sparse Matrices in IronPython QuickStart Sample

Illustrates using sparse vectors and matrices using the classes in the Numerics.NET.LinearAlgebra.Sparse namespace in IronPython.

This sample is also available in: C#, Visual Basic, F#.

Overview

This QuickStart sample demonstrates how to work with sparse vectors and matrices in Numerics.NET using the classes in the LinearAlgebra.Sparse namespace.

Sparse vectors and matrices are data structures optimized for situations where most elements are zero. Instead of storing all elements, they only store the non-zero values and their positions. This makes them memory efficient for large but sparse data.

The sample shows:

Different ways to create sparse vectors and matrices
How to specify initial capacity and fill factors
Methods to insert individual elements and groups of elements
Accessing elements and iterating over non-zero entries
Working with rows, columns and submatrices (cliques)
Mathematical operations optimized for sparse structures
Performance considerations when working with large sparse matrices

The code includes examples of both successful operations and potential pitfalls, such as operations that may revert to dense algorithms and fail on very large matrices.

The code

import numerics

from System import Array

from Extreme.Mathematics import *
# The sparse vector and matrix classes reside in the 
# Extreme.Mathematics.LinearAlgebra namespace.
from Extreme.Mathematics.LinearAlgebra import *
    
# Illustrates using sparse vectors and matrices using the classes
# in the Extreme.Mathematics.LinearAlgebra.Sparse namespace 
# of Extreme Numerics.NET.

#
# Sparse vectors
#

# The SparseVector class has three constructors. The
# first simply takes the length of the vector. All components
# are initially zero.
v1 = Vector.CreateSparse(1000000)

# The second constructor lets you specify how many components
# are expected to be nonzero. This 'fill factor' is a number
# between 0 and 1.
v2 = Vector.CreateSparse(1000000, 1e-4)

# The second constructor lets you specify how many components
# are expected to be nonzero. This 'fill factor' is a number
# between 0 and 1.
v3 = Vector.CreateSparse(1000000, 100)

# The fourth constructor lets you specify the indexes of the nonzero
# components and their values as arrays:
indexes = Array[int]([ 1, 10, 100, 1000, 10000 ])
values = Array[float]([ 1.0, 10.0, 100.0, 1000.0, 10000.0 ])
v4 = Vector.CreateSparse(1000000, indexes, values)

# Components can be accessed individually:
v1[1000] = 2.0
print "v1[1000] =", v1[1000]

# The NonzeroCount returns how many components are non zero:
print "v1 has", v1.NonzeroCount, "nonzeros"
print "v4 has", v4.NonzeroCount, "nonzeros"

# The NonzeroComponents property returns a collection of 
# IndexValuePair structures that you can use to iterate
# over the components of the vector:
print "Nonzero components of v4:"
for pair in v4.NonzeroComponents:
    print "Component {0} = {1}".format(pair.Index, pair.Value)

# All other vector methods and properties are also available, # Their implementations take advantage of sparsity.
print "Norm(v4) =", v4.Norm()
print "Sum(v4) =", v4.GetSum()
            
# Note that some operations convert a sparse vector to a
# DenseVector, causing memory to be allocated for all
# components.
            
#
# Sparse Matrices
#

# All sparse matrix classes inherit from SparseMatrix. This is an abstract class.
# There currently is only one implementation class:
# SparseCompressedColumnMatrix. 

# Sparse matrices are created by calling the CreateSparse factory
# method on the Matrix class. It has 4 overloads:

# The first overload takes the number of rows and columns as arguments:
m1 = Matrix.CreateSparse(100000, 100000)

# The second overload adds a fill factor:
m2 = Matrix.CreateSparse(100000, 100000, 1e-5)

# The third overload uses the actual number of nonzero components rather than 
# the fraction:
m3 = Matrix.CreateSparse(10000, 10000, 20000)

# The fourth overload lets you specify the locations and values of the
# nonzero components:
rows = Array[int]([ 1, 11, 111, 1111 ])
columns = Array[int]([ 2, 22, 222, 2222 ])
matrixValues = Array[float]([  3.0, 33.0, 333.0, 3333.0 ])
m4 = Matrix.CreateSparse(10000, 10000, rows, columns, matrixValues)

# You can access components as before...
print "m4[111, 222] =", m4[111, 222]
m4[99, 22] = 99.0

# A series of Insert methods lets you build a sparse matrix from scratch:
# A single value:
m1.InsertEntry(25.0, 200, 500)
# Multiple values:
m1.InsertEntries(matrixValues, rows, columns)
# Multiple values all in the same column:
m1.InsertColumn(33, values, indexes)
# Multiple values all in the same row:
m1.InsertRow(55, values, indexes)

# A clique is a 2-dimensional submatrix with indexed rows and columns.
clique = Matrix([[11, 12], [21, 22]])
cliqueIndexes = Array[int]([ 5, 8 ])
m1.InsertClique(clique, cliqueIndexes, cliqueIndexes)

# You can use the NonzeroComponents collection to iterate
# over the nonzero components of the matrix. The items
# are of type RowColumnValueTriplet:
print "Nonzero components of m1:"
for triplet in m1.NonzeroComponents:
    print "m1[{0},{1}] = {2}".format(triplet.Row, triplet.Column, triplet.Value)

# ... including rows and columns.
column = m4.GetColumn(22)
print "Nonzero components in column 22 of m4:"
for pair in column.NonzeroComponents:
    print "Component {0} = {1}".format(pair.Index, pair.Value)

# Many matrix methods have been optimized to take advantage of sparsity:
print "F-norm(m1) =", m1.FrobeniusNorm()
            
# But beware: some revert to a dense algorithm and will fail on huge matrices:
try:
    inverse = m1.GetInverse()
except MemoryError as e:
    print e.message