Contingency Tables

Contingency tables, also known as cross-tabulation tables, are used to display the frequency distribution of variables. They help in understanding the relationship between two categorical variables by showing the counts of their combinations. Applications include statistical analysis, hypothesis testing, and data visualization in fields like epidemiology, market research, and social sciences.

In Numerics.NET, contingency tables are implemented by the ContingencyTable class. It supports both 2x2 and RxC tables. It provides properties to access computed statistics and methods to perform various hypothesis tests.

Constructing Contingency Tables

A ContingencyTable object can be created using several constructors. Some let you specify the row and column variables and, optionally, a count variable. Others take the counts directly as a matrix.

The first constructor allows you to create a new contingency table for the specified row and column variables. This is useful when you have categorical data for rows and columns and want to analyze their relationship.

// Create categorical vectors for rows and columns  
var rowVariable1 = Vector.CreateCategorical(new[] { "A", "B", "A", "B" });
var columnVariable1 = Vector.CreateCategorical(new[] { "X", "X", "Y", "Y" });

// Create a contingency table  
var table1 = new ContingencyTable(rowVariable1, columnVariable1);

Visual Basic

' Create categorical vectors for rows and columns  
Dim rowVariable1 = Vector.CreateCategorical(New String() { "A", "B", "A", "B" })
Dim columnVariable1 = Vector.CreateCategorical(New String() { "X", "X", "Y", "Y" })

' Create a contingency table  
Dim table1 = New ContingencyTable(rowVariable1, columnVariable1)

F#

// Create categorical vectors for rows and columns  
let rowVariable1 = Vector.CreateCategorical([| "A"; "B"; "A"; "B" |])
let columnVariable1 = Vector.CreateCategorical([| "X"; "X"; "Y"; "Y" |])

// Create a contingency table  
let table1 = new ContingencyTable(rowVariable1, columnVariable1)

The second constructor takes the row and column variables together with a count variable. This is useful when the data has already been aggregated: each observation specifies a combination of row and column categories, and the count variable supplies the frequency for that observation.

// Create categorical vectors for rows and columns  
var rowVariable2 = Vector.CreateCategorical(new[] { "A", "B", "A", "B" });
var columnVariable2 = Vector.CreateCategorical(new[] { "X", "X", "Y", "Y" });

// Create a count vector  
var countVariable = Vector.Create(new double[] { 1, 2, 3, 4 });

// Create a contingency table  
var table2 = new ContingencyTable(rowVariable2, columnVariable2, countVariable);

Visual Basic

' Create categorical vectors for rows and columns  
Dim rowVariable2 = Vector.CreateCategorical(New String() { "A", "B", "A", "B" })
Dim columnVariable2 = Vector.CreateCategorical(New String() { "X", "X", "Y", "Y" })

' Create a count vector  
Dim countVariable = Vector.Create(New Double() { 1, 2, 3, 4 })

' Create a contingency table  
Dim table2 = New ContingencyTable(rowVariable2, columnVariable2, countVariable)

F#

// Create categorical vectors for rows and columns  
let rowVariable2 = Vector.CreateCategorical([| "A"; "B"; "A"; "B" |])
let columnVariable2 = Vector.CreateCategorical([| "X"; "X"; "Y"; "Y" |])

// Create a count vector  
let countVariable = Vector.Create([| 1.0; 2.0; 3.0; 4.0 |])

// Create a contingency table  
let table2 = new ContingencyTable(rowVariable2, columnVariable2, countVariable)

The third and fourth constructors allow you to create a new contingency table from counts stored in a matrix. You can optionally specify the row and column indexes.

// Create a matrix for counts  
Matrix<double> counts1 = Matrix.Create(new double[,] { { 1, 2 }, { 3, 4 } });

// Create row and column scales  
var rowIndex = Numerics.NET.DataAnalysis.Index.Create(new[] { "A", "B" });
var columnIndex = Numerics.NET.DataAnalysis.Index.Create(new[] { "X", "Y" });

// Create a contingency table  
ContingencyTable table3 = new ContingencyTable(counts1, rowIndex, columnIndex);

Visual Basic

' Create a matrix for counts  
Dim counts1 = Matrix.Create(New Double(,) { { 1, 2 }, { 3, 4 } })

' Create row and column indexes
Dim rowIndex = DataAnalysis.Index.Create(New String() { "A", "B" })
Dim columnIndex = DataAnalysis.Index.Create(New String() { "X", "Y" })

' Create a contingency table  
Dim table3 = New ContingencyTable(counts1, rowIndex, columnIndex)

F#

// Create a matrix for counts  
let counts1 = Matrix.Create(array2D [ [ 1.0; 2.0 ]; [ 3.0; 4.0 ] ])

// Create row and column indexes
let rowIndex = Numerics.NET.DataAnalysis.Index.Create([| "A"; "B" |])
let columnIndex = Numerics.NET.DataAnalysis.Index.Create([| "X"; "Y" |])

// Create a contingency table  
let table3 = new ContingencyTable(counts1, rowIndex, columnIndex)

Properties

The ContingencyTable class provides several properties to access computed statistics and other information about the contingency table structure and its measures of association.

The structural properties include RowCount and ColumnCount, which return the number of categories of the row and column variables, respectively. TotalCount returns the total sample size, which is the sum of all cell frequencies.

Several properties measure the association between the two variables: ChiSquare returns Pearson's chi-square statistic, Phi returns the phi coefficient, which is defined for 2x2 tables, CoefficientOfContingency returns Pearson's contingency coefficient, and CramerV returns Cramér's V, which is suitable for tables of any size.
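For reference, the standard textbook definitions of these measures are sketched below. The notation is introduced here and is not part of the library: $O_{ij}$ is the observed count in row $i$ and column $j$, $n_{i\cdot}$ and $n_{\cdot j}$ are the row and column totals, $N$ is the total count, and $R$ and $C$ are the numbers of rows and columns. Whether the library follows these conventions in every detail (for example, the sign of $\phi$) is an assumption.

```latex
\begin{aligned}
E_{ij} &= \frac{n_{i\cdot}\, n_{\cdot j}}{N}, \\
\chi^2 &= \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \\
\phi &= \sqrt{\frac{\chi^2}{N}}, \\
C_P &= \sqrt{\frac{\chi^2}{\chi^2 + N}}, \\
V &= \sqrt{\frac{\chi^2}{N \, \min(R - 1,\; C - 1)}}.
\end{aligned}
```

As a worked check with the counts used for table2 above (A/X = 1, B/X = 2, A/Y = 3, B/Y = 4): $N = 10$, the expected counts are 1.2, 1.8, 2.8, and 4.2, and $\chi^2 = 40/504 \approx 0.0794$. Under these definitions $|\phi| = V \approx 0.0891$ and $C_P \approx 0.0887$. For a 2x2 table, $\phi$ and $V$ coincide in magnitude because $\min(R - 1, C - 1) = 1$.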

The following code examples demonstrate how to access these properties. Note that some measures of association are only meaningful for specific table dimensions.

// Access properties
double chiSquare = table2.ChiSquare;
double phi = table2.Phi;
double coefficientOfContingency = table2.CoefficientOfContingency;
double cramerV = table2.CramerV;

// Display results
Console.WriteLine($"Chi-Square: {chiSquare}");
Console.WriteLine($"Phi: {phi}");
Console.WriteLine($"Coefficient of Contingency: {coefficientOfContingency}");
Console.WriteLine($"Cramer's V: {cramerV}");

Visual Basic

' Access properties
Dim chiSquare As Double = table2.ChiSquare
Dim phi As Double = table2.Phi
Dim coefficientOfContingency As Double = table2.CoefficientOfContingency
Dim cramerV As Double = table2.CramerV

' Display results
Console.WriteLine($"Chi-Square: {chiSquare}")
Console.WriteLine($"Phi: {phi}")
Console.WriteLine($"Coefficient of Contingency: {coefficientOfContingency}")
Console.WriteLine($"Cramer's V: {cramerV}")

F#

// Access properties
let chiSquare = table2.ChiSquare
let phi = table2.Phi
let coefficientOfContingency = table2.CoefficientOfContingency
let cramerV = table2.CramerV

// Display results
printfn "Chi-Square: %f" chiSquare
printfn "Phi: %f" phi
printfn "Coefficient of Contingency: %f" coefficientOfContingency
printfn "Cramer's V: %f" cramerV

Accessing Cells and Totals

The ContingencyTable class defines indexers that give access to the cell frequencies and totals in the table. There are two overloads: one takes the zero-based numeric row and column indexes, and the other takes the row and column category values. The totals are stored as an extra column and an extra row in the table. To access a row total, set the column index to one past the last valid column index, which equals ColumnCount. To access a column total, set the row index to RowCount. Setting both gives the total for the entire table.

The indexers return a structure of type ContingencyTableCell, which has the following properties:

Property	Description
RowLevel	The level (category) of the row containing the cell.
ColumnLevel	The level (category) of the column containing the cell.
Count	The number of observations in the cell.
RelativeFrequency	The frequency of the cell relative to all other cells in the contingency table.
RelativeFrequencyInRow	The frequency of the cell relative to the other cells in the same row of the contingency table.
RelativeFrequencyInColumn	The frequency of the cell relative to the other cells in the same column of the contingency table.
RelativePercentage	The percentage of the cell relative to all other cells in the contingency table.
RelativePercentageInRow	The percentage of the cell relative to the other cells in the same row of the contingency table.
RelativePercentageInColumn	The percentage of the cell relative to the other cells in the same column of the contingency table.
ExpectedCount	The expected count of the cell based on the row and column totals of the contingency table.
ChiSquare	The contribution of the cell to the Chi-square statistic of the contingency table.
Residual	The difference between the expected and the actual count in the cell of the contingency table.
StandardizedResidual	The standardized difference between the expected and the actual count in the cell of the contingency table.
AdjustedStandardizedResidual	The standardized difference between the expected and the actual count in the cell of the contingency table, adjusted for the row and column totals.
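
The per-cell statistics in this table are commonly defined as sketched below. These are the usual textbook formulas, written in notation introduced here ($O_{ij}$ for the observed count, $n_{i\cdot}$ and $n_{\cdot j}$ for the row and column totals, $N$ for the total count). The sign convention of the residual (observed minus expected) is an assumption and may differ from the library's.

```latex
\begin{aligned}
E_{ij} &= \frac{n_{i\cdot}\, n_{\cdot j}}{N}, \\
\chi^2_{ij} &= \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \\
r_{ij} &= O_{ij} - E_{ij}, \\
r^{\text{std}}_{ij} &= \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}, \\
r^{\text{adj}}_{ij} &= \frac{O_{ij} - E_{ij}}
  {\sqrt{E_{ij}\,\bigl(1 - n_{i\cdot}/N\bigr)\,\bigl(1 - n_{\cdot j}/N\bigr)}}.
\end{aligned}
```

As a worked example, take the cell in row "A" and column "Y" of table2 (counts A/X = 1, B/X = 2, A/Y = 3, B/Y = 4). The row total is 4, the column total is 7, and $N = 10$, so $E = 2.8$ and the residual is $0.2$. The relative frequencies are $3/4 = 0.75$ in the row, $3/7 \approx 0.429$ in the column, and $3/10 = 0.3$ in the table. Under the formulas above, the chi-square contribution is $0.04/2.8 \approx 0.0143$, the standardized residual is about $0.120$, and the adjusted standardized residual is about $0.282$.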

The code below demonstrates how to access the cell frequencies and totals using these indexers:

// Using numerical index:
var cell01 = table2[0, 1];
// Using category names (row "B", column "X"):
var cell10 = table2["B", "X"];
// Total for row 0:
var cell0_ = table2[0, table2.ColumnCount];
// Total for column 1:
var cell_1 = table2[table2.RowCount, 1];
// Total for the table:
var cell__ = table2[table2.RowCount, table2.ColumnCount];

// Display results
Console.WriteLine($"Cell (0, 1):");
Console.WriteLine($"  Expected: {cell01.ExpectedCount}");
Console.WriteLine($"  Actual:   {cell01.Count}");
Console.WriteLine($"  Relative frequency:");
Console.WriteLine($"    In row:   {cell01.RelativeFrequencyInRow}");
Console.WriteLine($"    In column:{cell01.RelativeFrequencyInColumn}");
Console.WriteLine($"    In table: {cell01.RelativeFrequency}");

Visual Basic

' Using numerical index:
Dim cell01 = table2(0, 1)
' Using category names (row "B", column "X"):
Dim cell10 = table2("B", "X")
' Total for row 0:
Dim cell0_ = table2(0, table2.ColumnCount)
' Total for column 1:
Dim cell_1 = table2(table2.RowCount, 1)
' Total for the table:
Dim cell__ = table2(table2.RowCount, table2.ColumnCount)

' Display results
Console.WriteLine($"Cell (0, 1):")
Console.WriteLine($"  Expected: {cell01.ExpectedCount}")
Console.WriteLine($"  Actual:   {cell01.Count}")
Console.WriteLine($"  Relative frequency:")
Console.WriteLine($"    In row:   {cell01.RelativeFrequencyInRow}")
Console.WriteLine($"    In column:{cell01.RelativeFrequencyInColumn}")
Console.WriteLine($"    In table: {cell01.RelativeFrequency}")

F#

// Using numerical index:
let cell01 = table2[0, 1]
// Using category names (row "B", column "X"):
let cell10 = table2["B", "X"]
// Total for row 0:
let cell0_ = table2[0, table2.ColumnCount]
// Total for column 1:
let cell_1 = table2[table2.RowCount, 1]
// Total for the table:
let cell__ = table2[table2.RowCount, table2.ColumnCount]

// Display results
printfn "Cell (0, 1):"
printfn "  Expected: %f" cell01.ExpectedCount
printfn "  Actual:   %f" cell01.Count
printfn "  Relative frequency:"
printfn "    In row:   %f" cell01.RelativeFrequencyInRow
printfn "    In column:%f" cell01.RelativeFrequencyInColumn
printfn "    In table: %f" cell01.RelativeFrequency

Hypothesis Tests

The ContingencyTable class provides several methods to compute various statistics and perform hypothesis tests. These tests help in determining the statistical significance of the observed relationships in the contingency table.

The Chi-Square test is used to determine if there is a significant association between the row and column variables. It compares the observed frequencies in the contingency table to the expected frequencies if the variables were independent. This test is performed by the GetChiSquareTest method.
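
In the standard formulation of this test, the statistic is referred to a chi-square distribution whose degrees of freedom depend on the table dimensions. The sketch below uses notation introduced here ($R$ rows, $C$ columns, $\chi^2_{\text{obs}}$ for the observed Pearson statistic). It describes the textbook test and is not a statement about the library's internals.

```latex
\begin{aligned}
\text{df} &= (R - 1)(C - 1), \\
p &= P\!\left(X \ge \chi^2_{\text{obs}}\right), \qquad X \sim \chi^2_{\text{df}}.
\end{aligned}
```

For the 2x2 counts used for table2 (1, 3 in the first row and 2, 4 in the second), $\text{df} = 1$ and $\chi^2_{\text{obs}} = 40/504 \approx 0.0794$, which gives a p-value of roughly 0.78. The data therefore provide no evidence against independence.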

The Yates-corrected Chi-Square test is a modification of the Chi-Square test that applies a continuity correction. The correction compensates for approximating the discrete distribution of the counts with the continuous chi-square distribution. It matters most for small sample sizes and is usually applied to 2x2 tables. This test is performed by the GetYatesCorrectedChiSquareTest method.
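
The usual textbook form of the corrected statistic subtracts one half from each absolute deviation before squaring. The notation ($O_{ij}$ for observed counts, $E_{ij}$ for expected counts under independence) is introduced here. Some implementations additionally clamp the corrected deviation at zero when $|O_{ij} - E_{ij}| < 1/2$; which variant the library uses is not stated in this text.

```latex
\chi^2_{\text{Yates}} = \sum_{i=1}^{R} \sum_{j=1}^{C}
  \frac{\bigl(\,|O_{ij} - E_{ij}| - \tfrac{1}{2}\bigr)^2}{E_{ij}}
```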

The likelihood ratio test is another method to test the independence of the row and column variables. It compares the likelihood of the observed data under the null hypothesis (independence) to the likelihood under the alternative hypothesis (dependence). This test is performed by the GetLikelihoodRatioTest method.
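
The likelihood ratio statistic, often written $G$ or $G^2$, has the standard form sketched below, with $O_{ij}$ the observed counts and $E_{ij}$ the expected counts under independence (notation introduced here). Cells with $O_{ij} = 0$ contribute zero to the sum. Under the null hypothesis, $G$ is referred to the same chi-square distribution as Pearson's statistic.

```latex
\begin{aligned}
G &= 2 \sum_{i=1}^{R} \sum_{j=1}^{C} O_{ij} \,\ln\!\frac{O_{ij}}{E_{ij}}, \\
\text{df} &= (R - 1)(C - 1).
\end{aligned}
```

For the counts used for table2 (observed 1, 3, 2, 4 with expected 1.2, 2.8, 1.8, 4.2), this gives $G \approx 0.0804$, close to the Pearson statistic of about $0.0794$, as is typical when observed and expected counts are similar.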

The Mantel-Haenszel Chi-Square test is used to assess the association between two categorical variables while controlling for one or more other variables. It is commonly used in stratified analysis. This test is performed by the GetMantelHaenszelTest method.

Fisher's exact test is used to determine whether there is a nonrandom association between two categorical variables. It is particularly useful for small sample sizes and 2x2 tables. The GetFisherExactProbability method returns the resulting probability as a number rather than as a test object.
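
Fisher's exact test is built on the hypergeometric probability of a 2x2 table with fixed row and column totals. The sketch below shows that probability for a table with cells $a, b$ in the first row and $c, d$ in the second (notation introduced here). The p-value of the test is obtained by summing such probabilities over tables at least as extreme as the observed one. Whether the library's method returns the single-table probability or a summed p-value is not stated in this text.

```latex
P(a, b, c, d) = \frac{\dbinom{a + b}{a} \dbinom{c + d}{c}}{\dbinom{N}{a + c}},
\qquad N = a + b + c + d.
```

For the counts used for table2 ($a = 1$, $b = 3$, $c = 2$, $d = 4$), this evaluates to $\binom{4}{1}\binom{6}{2} / \binom{10}{3} = 4 \cdot 15 / 120 = 0.5$.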

var chiSquareTest = table2.GetChiSquareTest();
Console.WriteLine($"Chi-Square Test: {chiSquareTest.Summarize()}");

var yatesCorrectedChiSquareTest = table2.GetYatesCorrectedChiSquareTest();
Console.WriteLine($"Yates Corrected Chi-Square Test: {yatesCorrectedChiSquareTest.Summarize()}");

var likelihoodRatioTest = table2.GetLikelihoodRatioTest();
Console.WriteLine($"Likelihood Ratio Test: {likelihoodRatioTest.Summarize()}");

var mantelHaenszelTest = table2.GetMantelHaenszelTest();
Console.WriteLine($"Mantel-Haenszel Test: {mantelHaenszelTest.Summarize()}");

double fisherExactProbability = table2.GetFisherExactProbability();
Console.WriteLine($"Fisher Exact Probability: {fisherExactProbability}");

Visual Basic

Dim chiSquareTest = table2.GetChiSquareTest()
Console.WriteLine($"Chi-Square Test: {chiSquareTest.Summarize()}")

Dim yatesCorrectedChiSquareTest = table2.GetYatesCorrectedChiSquareTest()
Console.WriteLine($"Yates Corrected Chi-Square Test: {yatesCorrectedChiSquareTest.Summarize()}")

Dim likelihoodRatioTest = table2.GetLikelihoodRatioTest()
Console.WriteLine($"Likelihood Ratio Test: {likelihoodRatioTest.Summarize()}")

Dim mantelHaenszelTest = table2.GetMantelHaenszelTest()
Console.WriteLine($"Mantel-Haenszel Test: {mantelHaenszelTest.Summarize()}")

Dim fisherExactProbability As Double = table2.GetFisherExactProbability()
Console.WriteLine($"Fisher Exact Probability: {fisherExactProbability}")

F#

let chiSquareTest = table2.GetChiSquareTest()
printfn "Chi-Square Test: %s" (chiSquareTest.Summarize())

let yatesCorrectedChiSquareTest = table2.GetYatesCorrectedChiSquareTest()
printfn "Yates Corrected Chi-Square Test: %s" (yatesCorrectedChiSquareTest.Summarize())

let likelihoodRatioTest = table2.GetLikelihoodRatioTest()
printfn "Likelihood Ratio Test: %s" (likelihoodRatioTest.Summarize())

let mantelHaenszelTest = table2.GetMantelHaenszelTest()
printfn "Mantel-Haenszel Test: %s" (mantelHaenszelTest.Summarize())

let fisherExactProbability = table2.GetFisherExactProbability()
printfn "Fisher Exact Probability: %f" fisherExactProbability
