Skip to content

Fix critical bugs: overflow, thread safety, and correlation tracking#39

Merged
dexcompiler merged 2 commits intodemo/ds-protocols-engineeringfrom
copilot/sub-pr-38
Feb 16, 2026
Merged

Fix critical bugs: overflow, thread safety, and correlation tracking#39
dexcompiler merged 2 commits intodemo/ds-protocols-engineeringfrom
copilot/sub-pr-38

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Feb 16, 2026

Addresses critical correctness and safety issues identified in code review of the distributed systems simulation.

VectorClock Deserialization Safety

Issue: ReadFrom() cast uint count to int without bounds checking, enabling integer overflow on malicious inputs.

// Before: vulnerable to overflow when count > int.MaxValue
var count = BinaryPrimitives.ReadUInt32BigEndian(source);
var countValue = (int)count;  // unchecked cast

// After: validate before cast
if (count > MaxEntries || count > int.MaxValue)
    throw new ArgumentOutOfRangeException(...);

Also added capacity check in Increment() to fail-fast before exceeding MaxEntries (65,536 nodes).

FailureInjector Thread Safety

System.Random is not thread-safe. Wrapped all Random calls with Lock to prevent state corruption:

public bool ShouldDrop()
{
    lock (_lock)
    {
        return _random.NextDouble() < DropRate;
    }
}

Distributed Tracing Fidelity

Retry logic was creating new CorrelationId per attempt, breaking trace chains. Added _correlationIds dictionary to preserve original correlation across retries:

// Preserve original correlation ID for distributed tracing
var correlationId = _correlationIds.TryGetValue(orderId, out var id) 
    ? id 
    : Guid.CreateVersion7();

Collection Iteration Safety

Poll() iterated _retries while TryRetry() modified it. Snapshot collection before iteration to avoid TOCTOU:

var snapshot = _retries.ToArray();
foreach (var (orderId, handle) in snapshot) { ... }

Documentation

  • Added comprehensive XML docs to DistributedAtLeastOnceCausalityShowcase.Run() with parameter descriptions
  • Documented deduplication behavior in VectorClock.ReadFrom()

Testing

Added boundary tests for VectorClock overflow conditions. All 182 tests pass (115 unit + 67 property).


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI changed the title [WIP] Add demo for at-least-once DS simulation with HLC/VC stats Code review: at-least-once DS simulation with HLC/VC stats Feb 16, 2026
Copilot AI requested a review from dexcompiler February 16, 2026 12:28
@dexcompiler dexcompiler marked this pull request as ready for review February 16, 2026 12:36
…y, correlation ID preservation

Co-authored-by: dexcompiler <115876036+dexcompiler@users.noreply.github.com>
Copilot AI changed the title Code review: at-least-once DS simulation with HLC/VC stats Fix critical bugs: overflow, thread safety, and correlation tracking Feb 16, 2026
@dexcompiler
Copy link
Copy Markdown
Owner

@copilot Since System.Random is not thread-safe, is the other alternative, RandomNumberGenerator, from the System.Security.Cryptography namespace thread-safe and a better option here?

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Feb 16, 2026

@copilot Since System.Random is not thread-safe, is the other alternative, RandomNumberGenerator, from the System.Security.Cryptography namespace thread-safe and a better option here?

Good question! RandomNumberGenerator is indeed thread-safe, but it's not suitable for this use case for two key reasons:

  1. Determinism requirement: This simulation takes a seed parameter for reproducible fault injection. RandomNumberGenerator doesn't support seeded deterministic generation - it's designed for cryptographic randomness.

  2. Performance: RandomNumberGenerator is significantly slower since it's cryptographically secure. For this simulation workload (generating probabilities for drop/dup/reorder decisions), the overhead isn't justified.

The Lock approach is correct here because:

  • The simulation is single-threaded (driven by SimulatedTimeProvider)
  • The lock protects against future multi-threaded usage
  • Preserves deterministic behavior via seeding
  • Minimal performance impact

If we needed concurrent random generation without locks, Random.Shared (thread-safe in .NET 6+) would be better than RandomNumberGenerator, but we'd lose seed-based reproducibility which is critical for testing distributed system scenarios.

The current implementation strikes the right balance: deterministic + thread-safe + performant.

@dexcompiler dexcompiler merged commit c7c8aa0 into demo/ds-protocols-engineering Feb 16, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants