Histograms in IronPython QuickStart Sample

Illustrates how to create histograms using the Histogram class in the Numerics.NET.DataAnalysis namespace in IronPython.

This sample is also available in: C#, Visual Basic, F#.

Overview

This QuickStart sample demonstrates how to create and work with histograms using the Histogram class in Numerics.NET.

The sample shows several different ways to create histograms:

  • Creating histograms with evenly spaced bins by specifying bounds and bin count
  • Creating histograms with explicitly specified bin boundaries
  • Creating histograms using Index objects to define bins
  • Creating histograms for categorical data
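
To make the first two strategies concrete, here is a minimal sketch in plain Python that does not use the Numerics.NET API. The helper name `even_bin_edges` is hypothetical and exists only for illustration. It shows that a lower bound, an upper bound, and a bin count determine the bin boundaries, and that a list of n + 1 explicit boundaries defines n bins.

```python
def even_bin_edges(lower, upper, bin_count):
    """Return bin_count + 1 evenly spaced boundaries from lower to upper.

    Hypothetical helper for illustration; not part of Numerics.NET.
    """
    if bin_count < 1:
        raise ValueError("bin_count must be at least 1")
    width = (upper - lower) / float(bin_count)
    edges = [lower + i * width for i in range(bin_count)]
    # Append the exact upper bound so rounding cannot shift the last edge.
    edges.append(float(upper))
    return edges


# Five evenly spaced bins between 50 and 100.
even_edges = even_bin_edges(50, 100, 5)

# Explicit boundaries: five boundaries define four bins of unequal width.
explicit_edges = [50.0, 62.0, 74.0, 88.0, 100.0]
explicit_bin_count = len(explicit_edges) - 1
```

With these inputs, `even_edges` holds six boundaries spaced 10 apart, and `explicit_bin_count` is 4.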

It demonstrates key histogram operations including:

  • Adding data to histograms using the Tabulate method
  • Incrementing individual bins
  • Accessing bin information like bounds and counts
  • Finding which bin contains a specific value
  • Iterating through bins and their values
  • Working with both numerical and categorical data
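
The operations listed above can be sketched in plain Python as follows. The `SimpleHistogram` class is a hypothetical stand-in, not the Numerics.NET `Histogram` class, and it assumes that each bin includes its lower bound and excludes its upper bound; the library's own convention may differ.

```python
from bisect import bisect_right


class SimpleHistogram(object):
    """Hypothetical stand-in for illustration; not the Numerics.NET class."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.counts = [0] * (len(self.edges) - 1)

    def find_bin(self, value):
        """Return the index of the bin that contains value.

        Assumption: bins are closed on the left and open on the right.
        """
        if value < self.edges[0] or value >= self.edges[-1]:
            raise ValueError("value lies outside the histogram range")
        return bisect_right(self.edges, value) - 1

    def increment(self, value):
        """Add one to the count of the bin that contains value."""
        self.counts[self.find_bin(value)] += 1

    def tabulate(self, values):
        """Tally every value in an iterable."""
        for value in values:
            self.increment(value)


# The same test scores that the sample uses.
scores = [62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
          68, 82, 72, 71, 85, 66, 61, 79, 81, 73]

hist = SimpleHistogram([50, 60, 70, 80, 90, 100])
hist.tabulate(scores)

# Iterate through the bins with their bounds and counts.
rows = [(hist.edges[i], hist.edges[i + 1], count)
        for i, count in enumerate(hist.counts)]

bin_of_83 = hist.find_bin(83)
```

Under the stated assumption, the counts come out as 0, 6, 6, 7, and 1, and the value 83 falls in the bin with index 3.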

The sample uses a practical example of analyzing test scores to illustrate these concepts. It shows how to create different binning strategies and how to work with the resulting histogram data structures.
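
For categorical data, the bins are labels rather than intervals. The following plain-Python sketch illustrates the idea with `collections.Counter`. The grade labels and observations are hypothetical and are not taken from the sample, and the code does not use the Numerics.NET API.

```python
from collections import Counter

# Hypothetical categorical observations, for illustration only.
grades = ["B", "A", "C", "B", "B", "D", "A", "C", "B"]

# The full set of categories, including one that never occurs.
categories = ["A", "B", "C", "D", "F"]

tally = Counter(grades)

# One (label, count) pair per category, in a fixed order.
counts = [(label, tally.get(label, 0)) for label in categories]
```

Listing the categories explicitly keeps empty categories, such as "F" here, in the result with a count of zero.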

The code

import numerics

from System import Array

from Extreme.Mathematics import *
from Extreme.Statistics import *

# Illustrates the use of the Histogram class. 

# Histograms are used to summarize the distribution of data.
# This QuickStart sample creates a histogram from data 
# in a variety of ways.

# We use the test scores of students on a hypothetical national test.
# First we create a NumericalVariable that holds the test scores.
group1Data = Vector([
    62, 77, 61, 94, 75, 82, 86, 83, 64, 84,
    68, 82, 72, 71, 85, 66, 61, 79, 81, 73])
group1Results = NumericalVariable("Class 1", group1Data)

# We can create a histogram with evenly spaced bins by specifying
# the lower bound, the upper bound, and the number of bins:
histogram1 = Histogram(50, 100, 5)
			
# We can also provide the bounds explicitly:
bounds = Array[float]([50, 62, 74, 88, 100])
histogram2 = Histogram(bounds)

# Or we can first create a NumericalScale object
scale = NumericalScale(50, 100, 5)
histogram3 = Histogram(scale)

# To tally the results, we simply call the Tabulate method.
# The data can be supplied as a NumericalVariable:
histogram1.Tabulate(group1Results)
# or simply as a vector:
histogram2.Tabulate(group1Data)

# You can add multiple data sets to the same histogram:
histogram2.Tabulate(Vector([74, 68, 89]))
# Or you can add individual data points using the Increment method.
# This will increment the count of the bin that contains 
# the specified value:
histogram2.Increment(83)
histogram2.Increment(78)

			
# The Clear method clears all the data:
histogram2.Clear()


# The Bins property returns a collection of HistogramBin objects:
bins = histogram1.Bins
# It has a Count property that returns the total number of bins:
print "# bins:", bins.Count
# and an indexer property that returns a HistogramBin object:
bin = bins[2]
# HistogramBin objects have a lower bound, an upper bound, and a value:
print "Bin 2 has lower bound", bin.LowerBound
print "Bin 2 has upper bound", bin.UpperBound
print "Bin 2 has value", bin.Value

# The histogram's FindBin method returns the HistogramBin
# that contains a specified value:
bin = histogram1.FindBin(83)
print "83 is in bin", bin.Index

# You can use the Bins property to iterate through all the bins
# in a for loop:
for bin2 in histogram1.Bins:
    print "Bin {0}: {1}".format(bin2.Index, bin2.Value)

# The histogram's GetTotals method returns a double array 
# that contains the total for each bin in the histogram:
totals = histogram1.GetTotals()