Parallel Randomness and Independent Streams

Parallel and concurrent code requires independent random number streams to ensure correctness and reproducibility. Numerics.NET provides built-in mechanisms for creating independent streams through mixing-based hierarchical derivation and jump-based partitioning.

The Thread Safety Problem

Most random number generators are not thread-safe. Sharing a single RNG instance across threads without synchronization leads to race conditions and incorrect results:

C#
// BAD: Sharing RNG across threads (race condition!)
var rng = RandomSources.Create(42);

Parallel.For(0, 1000, i =>
{
    // UNSAFE: Multiple threads calling NextDouble() on same RNG
    double value = rng.NextDouble();
});

The solution is to give each thread its own independent RNG instance. Numerics.NET provides several approaches, each with different tradeoffs.

Quick Solution: Use Shared for Convenience

The simplest solution for non-reproducible parallel code is to use Shared, which is thread-safe and automatically provides independent generators per thread:

C#
// Use RandomSources.Shared for thread-safe convenience randomness
Parallel.For(0, 1000, i =>
{
    // Safe: Each thread gets its own generator from thread-local storage
    double value = RandomSources.Shared.NextDouble();
});

This is the easiest approach and works well for exploratory work or scenarios where reproducibility is not required. However, Shared will produce different results across runs.

For reproducible parallel work, the simplest correct approach is CreateStreams(). Call it once before your parallel loop, passing the number of workers and a base seed. Each worker receives a stable, independent stream identified by its integer index:

C#
// Create 10 independent streams with a fixed base seed
var streams = RandomSources.CreateStreams(count: 10, seed: 42);

// Process data in parallel - each worker uses its own independent stream
Parallel.For(0, 10, i =>
{
    var myRng = streams[i];
    var randomValue = myRng.NextDouble();
});

Numerics.NET guarantees that:

  • The same seed always produces the same set of streams. Results are fully reproducible across runs and machines.

  • Each stream index maps to a distinct, statistically independent stream. Worker 0 always gets stream 0, worker 1 always gets stream 1, and so on. There is no race condition.

  • The returned array contains Xoshiro256StarStar instances, a fast, high-quality default generator.

  Note

Internally, CreateStreams() uses the stream-tree mechanism described in the Advanced Patterns section below. You do not need to understand stream trees to use CreateStreams() correctly.

Advanced Patterns

For reproducible parallel code, Numerics.NET provides two primary mechanisms for creating independent streams:

  • Stream Trees (mixing-based): Use RandomStreamTree<TRandom> to create RNGs from a seed and a hierarchical stream address. This produces statistically independent streams with very high probability and works with any RNG. Stream addresses can be simple (e.g., [0], [1], [2]) or hierarchical (e.g., [3, 2, 5]).

  • Stream Partitions (jump-based): Use RandomStreamPartition<TRandom> to create RNGs by jumping ahead by huge steps (e.g., 2^128) in a sequence with an even longer period. This provides mathematical guarantees of independence but is only available for RNGs that implement IJumpable.

Both approaches support batch creation (recommended) and manual creation. Here's the recommended pattern using stream trees:

C#
// Bulk create using static helper method
var streamOptions = new RandomOptions(seed: 42);
var parallelStreams = Pcg64.CreateStreamTree(in streamOptions).NextStreams(100);

// Use in parallel loop
Parallel.For(0, parallelStreams.Length, i =>
{
    var myRng = parallelStreams[i];
    var result = ComputeWithRandomness(myRng);
});

double ComputeWithRandomness(IRandomSource rng)
{
    var samples = new double[1000];
    rng.Fill(samples);
    return samples.Mean();
}

This approach offers several advantages:

  • Fully reproducible: Same seed produces the same results across runs.

  • Statistically independent: Streams are constructed using mixing-based derivation, which makes them statistically independent with very high probability.

  • Simple to use: Create a tree and call NextStreams to get all streams you need.

  • Efficient: O(1) stream creation for mixing-based generators like Pcg64.

  Note

The RandomStreamTree<TRandom> uses mixing-based stream derivation internally, similar to NumPy's spawn(). Each stream is derived from the base seed material combined with its hierarchical path, ensuring statistical independence.
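To make the idea of mixing-based derivation concrete, here is a minimal toy sketch that folds a base seed and a hierarchical path into a single derived seed. It is not the algorithm Numerics.NET or NumPy uses: the helper names Mix and DeriveSeed are hypothetical, and the mixer is the SplitMix64 finalizer, chosen only because it is short and is a bijection on 64-bit values.

```csharp
using System;

// Toy mixer: the SplitMix64 finalizer, a bijection on 64-bit values.
static ulong Mix(ulong z)
{
    z += 0x9E3779B97F4A7C15UL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9UL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBUL;
    return z ^ (z >> 31);
}

// Hypothetical helper: fold a base seed and a hierarchical path into one derived seed.
static ulong DeriveSeed(ulong baseSeed, ulong[] path)
{
    ulong h = Mix(baseSeed);
    foreach (ulong index in path)
        h = Mix(h ^ Mix(index));   // each path element is mixed into the running state
    return h;
}

ulong first   = DeriveSeed(42, new ulong[] { 1, 2 });
ulong again   = DeriveSeed(42, new ulong[] { 1, 2 });
ulong sibling = DeriveSeed(42, new ulong[] { 1, 3 });

Console.WriteLine(first == again);    // True: same seed and path give the same derived seed
Console.WriteLine(first == sibling);  // False: siblings under one parent cannot collide here
```

Because the derived value depends only on the seed and the path, a stream's identity does not change when other streams are created before or after it. Sibling paths are guaranteed distinct in this toy only because the mixer is a bijection; a production scheme works with far more seed material than 64 bits.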

Hierarchical Streams with RandomStreamTree

RandomStreamTree<TRandom> provides hierarchical stream derivation where you can create streams at different levels using NextStreams or Branch. Here's the basic pattern for creating worker-level streams:

C#
// Create a hierarchical stream tree for nested parallel work
var options = new RandomOptions(seed: 42);
var tree = new RandomStreamTree<Pcg64>(in options);

// Generate worker-level streams using NextStreams
var workers = tree.NextStreams(4);

// Each worker draws from its own stream
Parallel.For(0, workers.Length, workerIndex =>
{
    var workerRng = workers[workerIndex];

    // Worker uses its stream for computation
    var values = new double[100];
    workerRng.Fill(values);
});

RandomStreamTree<TRandom> supports unlimited hierarchy depth and provides O(1) advancement within a level. Each stream is identified by its hierarchical path (e.g., [0], [1, 2], [3, 1, 5]), ensuring deterministic stream assignment.

Multi-Level Parallel Hierarchies

For applications with multiple levels of parallelism (e.g., processes and threads, or workers and tasks), use nested branching to create a stable hierarchy:

C#
// Create a two-level hierarchy: processes and threads
var rootOptions = new RandomOptions(seed: 42);
var processTree = new RandomStreamTree<Pcg64>(in rootOptions);

// Simulate 3 processes, each with its own stream tree
Parallel.For(0, 3, processId =>
{
    // Branch to create this process's stream tree
    var processSubTree = processTree.Branch((ulong)processId);

    // Each process spawns 5 threads
    Parallel.For(0, 5, threadId =>
    {
        // Get this thread's stream from the process tree
        var threadRng = processSubTree.Branch((ulong)threadId).PrefixStream();

        // Use the stream for computation
        var samples = new double[100];
        threadRng.Fill(samples);
    });
});

This pattern ensures that adding or removing work at one level does not affect the streams used at other levels, providing robust reproducibility even as your parallel structure evolves.

NumPy Compatibility

When using the Numpy seed profile, streams with the same spawn_key sequence will produce bit-identical output to NumPy's spawn() for compatible RNGs.

C#
// Matches: np.random.SeedSequence(42).spawn(10)[3]
var tree = new RandomStreamTree<Pcg64>(
    new RandomOptions(42, streamAddress: StreamAddress.Empty, seedProfile: SeedProfile.Numpy));
tree.Advance(3);
var rng = tree.NextStream();

Jump-Based Streams with RandomStreamPartition

For RNGs that implement IJumpable (like Xoshiro256StarStar), you can use RandomStreamPartition<TRandom> to create streams by jumping forward in the sequence. The key difference from tree-based streams is that partitions guarantee mathematical independence by dividing a single long sequence into non-overlapping segments:

C#
// Jump-based partition (for RNGs with IJumpable support)
var partitionOptions = new RandomOptions(seed: 42);
var partition = new RandomStreamPartition<Xoshiro256StarStar>(
    in partitionOptions, 
    JumpStride.Jump);

// Generate streams using jumps (O(N) cost)
var jumpStreams = partition.NextStreams(10);

// Use in parallel - each stream is 2^128 steps apart
Parallel.For(0, jumpStreams.Length, i =>
{
    var myRng = jumpStreams[i];
    var values = new double[1000];
    myRng.Fill(values);
});

Only jumpable RNGs support this approach. The JumpStride parameter controls the spacing:

  • Jump - Standard jump (e.g., 2^128 for Xoshiro256)

  • LongJump - Extended jump (e.g., 2^192 for Xoshiro256)

  Important

Jump-based stream creation has O(N) time complexity for N streams, as each stream requires executing a jump operation. For applications requiring many streams (thousands or more), prefer mixing-based generators like Pcg64 or Philox4x64 which support O(1) stream creation.

Manual Stream Selection

For fine-grained control, you can create streams manually by specifying stream addresses directly in constructors. This is useful when stream addresses need to be computed dynamically, or in distributed systems where each node needs a specific stream address:

C#
// Create generators for different streams
var stream0 = new Pcg64(seed: 12345, streamAddress: 0);
var stream1 = new Pcg64(seed: 12345, streamAddress: 1);
var stream2 = new Pcg64(seed: 12345, streamAddress: 2);

Stream addresses can be single values or sequences of values. Unlike NumPy, you can freely assign arbitrary stream addresses without needing to follow a specific hierarchy or structure.

Manual stream selection works with all generators using most seed profiles. Some RNGs, like Pcg64, also support stream selection using a proprietary scheme defined for that RNG with the Standard seed profile.

Alternative Approaches

While the built-in stream APIs are recommended, the following approaches are also available:

Using Jump Methods Directly

Generators that implement IJumpable provide Jumped() and LongJumped() methods for manual stream creation:

C#
var baseRng = new Xoshiro256StarStar(12345);
var jumpedStreams = new Xoshiro256StarStar[10];

// Create independent streams by jumping
jumpedStreams[0] = baseRng;
for (int i = 1; i < jumpedStreams.Length; i++)
{
    jumpedStreams[i] = jumpedStreams[i - 1].Jumped();
}

// Use in parallel
Parallel.For(0, 10, i =>
{
    var myRng = jumpedStreams[i];
    var values = new double[1000];
    myRng.Fill(values);
});

This approach requires sequential creation (each jump depends on the previous state) and is best suited for small numbers of streams. For larger numbers, use RandomStreamPartition<TRandom> instead.

Using Different Seeds (Not Recommended)

You can also give each worker an RNG created from a different seed. While simple, this approach offers weaker independence guarantees than stream trees or partitions:

C#
// Simple approach: use different seeds per worker
var rngs = new IRandomSource[10];
for (int i = 0; i < rngs.Length; i++)
{
    rngs[i] = RandomSources.Create(12345 + i);
}

Parallel.For(0, 10, i =>
{
    var myRng = rngs[i];
    var values = new double[1000];
    myRng.Fill(values);
});

This approach is not recommended for critical simulations because related seeds may produce correlated output in some RNGs. Prefer mixing-based or jump-based approaches when possible.

Using Task Parallel Library (TPL)

When using Parallel.For or Parallel.ForEach, the recommended pattern is to pre-create streams and distribute them to workers:

C#
// Bulk create using static helper method
var streamOptions = new RandomOptions(seed: 42);
var parallelStreams = Pcg64.CreateStreamTree(in streamOptions).NextStreams(100);

// Use in parallel loop
Parallel.For(0, parallelStreams.Length, i =>
{
    var myRng = parallelStreams[i];
    var result = ComputeWithRandomness(myRng);
});

double ComputeWithRandomness(IRandomSource rng)
{
    var samples = new double[1000];
    rng.Fill(samples);
    return samples.Mean();
}

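Alternatively, for scenarios requiring thread-local initialization, use the thread-local overload of Parallel.For. The runtime decides which task runs which iteration, so this pattern is not reproducible across runs even with a fixed seed:

C#
double sum = 0.0;
object lockObj = new object();
int nextStreamId = -1;

Parallel.For(
    fromInclusive: 0,
    toExclusive: 10,
    // localInit runs once per task, and one thread may run several tasks, so a
    // thread ID is not a unique stream address. Take the address from a counter.
    localInit: () => (
        Rng: new Pcg64(seed: 12345, streamAddress: (ulong)Interlocked.Increment(ref nextStreamId)),
        Sum: 0.0),
    body: (i, state, local) =>
    {
        // Each iteration draws from its task-local RNG
        double localSum = 0.0;
        for (int j = 0; j < 1000; j++)
            localSum += local.Rng.NextDouble();

        // Carry the RNG and the running partial sum to the next iteration
        return (local.Rng, local.Sum + localSum);
    },
    localFinally: local =>
    {
        // Merge this task's partial sum into the shared total
        lock (lockObj)
        {
            sum += local.Sum;
        }
    });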

Choosing the Right Approach

Use this decision tree to select the right approach for your application:

  1. Need reproducibility? If no, use Shared. It's thread-safe and simple.

  2. Simple parallel loop with fixed number of workers? Use CreateStreams() to bulk-create independent streams. This is the recommended approach for most applications.

  3. Need nested parallelism or complex hierarchy? Use RandomStreamTree<TRandom> for hierarchical stream derivation with unlimited depth using Branch and NextStream.

  4. Using a jumpable RNG like Xoshiro? Use RandomStreamPartition<TRandom> for jump-based partitioning, or use RandomStreamTree<TRandom> which uses mixing-based derivation.

  5. Need dynamic stream address assignment? Use manual stream selection by passing stream addresses to the generator constructors.

Reproducibility Guarantees

The stream APIs provide the following reproducibility guarantees:

  • Same seed, same stream address → same sequence: Creating an RNG with the same seed and stream address always produces identical output within a library version.

  • Stable stream assignment: If you use the same stream creation pattern (same number of streams, same branching structure), each stream will get the same sequence across runs.

  • Order independence: With proper stream assignment, parallel execution order does not affect results. Each stream produces its own independent sequence regardless of when it's consumed.

  Note

For strict bit-level reproducibility across library versions, pin your library version and use explicit stream addresses. Mixing-based derivation algorithms may evolve in major versions to incorporate improvements.
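The order-independence guarantee can be checked with a small experiment. The sketch below uses System.Random seeded from the worker index purely as a stand-in for a per-worker stream, so that it runs without any library; the RunOnce helper and the seed offset are illustrative assumptions. In real code, the generators would come from the stream APIs described above rather than from ad hoc seeds.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Runs the same parallel job once and returns one result per worker index.
static double[] RunOnce(int workers)
{
    var results = new double[workers];
    Parallel.For(0, workers, i =>
    {
        // Stand-in for "worker i always gets stream i".
        var rng = new Random(1000 + i);

        double sum = 0.0;
        for (int j = 0; j < 10_000; j++)
            sum += rng.NextDouble();

        // Results are stored by worker index, not by completion order.
        results[i] = sum;
    });
    return results;
}

double[] runA = RunOnce(8);
double[] runB = RunOnce(8);

// The scheduler may run the workers in a different order each time,
// yet every worker reproduces exactly the same value.
Console.WriteLine(runA.SequenceEqual(runB));   // True
```

The two properties that make this work are that each worker owns its generator and that each result is written to a slot determined by the worker index. Replacing the per-worker generators with a single shared, locked generator would keep the code thread-safe but make the per-worker values depend on scheduling.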

Performance Considerations

Follow these guidelines for best performance in parallel code:

  • Avoid synchronization: Never share a single RNG across threads. The synchronization overhead typically outweighs the cost of creating independent RNGs.

  • Prefer bulk stream creation: Use RandomStreamTree<TRandom> with NextStreams to create all streams at once rather than creating them individually.

  • Choose the right generator: For massively parallel workloads (thousands+ of streams), prefer counter-based generators like Philox4x64 which support O(1) stream creation and positioning.

  • Minimize RNG creation: Create one RNG per thread/task, not per operation. Store them in thread-local storage or task-local state.
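As a rough illustration of the last guideline, the sketch below keeps one generator per thread in a ThreadLocal<T> and counts how many generators are actually constructed. System.Random is used only as a stand-in so the example is self-contained, and because it is unseeded the output values are not reproducible; the point is the creation count, not the numbers drawn.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int created = 0;   // how many generators were constructed
long draws = 0;    // how many values were drawn in total

// The factory runs at most once per thread that touches perThread.Value.
using var perThread = new ThreadLocal<Random>(() =>
{
    Interlocked.Increment(ref created);
    return new Random();
});

Parallel.For(0, 100_000, i =>
{
    // Reuse this thread's generator instead of constructing one per operation.
    perThread.Value.NextDouble();
    Interlocked.Increment(ref draws);
});

Console.WriteLine($"{draws} draws served by {created} generators");
```

The number of generators created is bounded by the number of threads that took part in the loop, which is typically close to the core count rather than the iteration count.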
