Fix caching_and_replay example for Mesa 3.x compatibility#327

Open
Jayantparashar10 wants to merge 5 commits into mesa:main from Jayantparashar10:fix-caching-replay-mesa3

@Jayantparashar10
Contributor

Summary

The caching_and_replay example failed under Mesa 3.x, raising multiple TypeErrors and AttributeErrors that prevented both recording and replay. This PR fixes these compatibility issues by implementing custom serialization/deserialization logic that handles the new CellAgent architecture and Mesa's internal agent-tracking structures.

Bug / Issue

Fixes #289

The example was broken with Mesa 3.x due to fundamental changes in how Mesa handles agents, grids, and positioning:

What was expected: The example should record simulation states to a cache file and replay them deterministically.

What actually happened:

  • TypeError: 'AgentSet' object does not support item assignment
  • AttributeError: 'OrthogonalMooreGrid' object has no attribute 'get_cell'
  • Agent duplication during replay (114 agents instead of 57)
  • TypeError: AgentSet.__init__() missing required positional argument: 'agents'
  • Agents had pos = None (CellAgents use cell.coordinate, not pos)
  • Visualization crashed with IndexError due to missing position data

Root causes:

  1. Mesa 3.x CellAgents use cell.coordinate for positioning, not pos (which is always None)
  2. Default mesa-replay serialization couldn't handle Mesa 3.x's internal agent tracking dictionaries
  3. Mesa 3.x uses multiple agent tracking structures (_agents, _agents_by_type, _all_agents) that weren't being cleared during deserialization
  4. Grid API changed from get_cell() to direct _cells dictionary access
  5. Visualization used mesa.visualization.ModularServer, which was removed in Mesa 3.x in favor of Solara-based visualization
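
Root cause 1 can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins, not Mesa's real Cell or CellAgent; only the attribute names pos and cell.coordinate come from the description above. The helper agent_coordinate is likewise an illustrative name, not part of this PR.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Coordinate = Tuple[int, int]


@dataclass
class StubCell:
    """Hypothetical stand-in for a Mesa 3.x grid cell."""
    coordinate: Coordinate


@dataclass
class StubCellAgent:
    """Hypothetical stand-in for a CellAgent: pos stays None, cell holds the location."""
    unique_id: int
    cell: Optional[StubCell] = None
    pos: Optional[Coordinate] = None


def agent_coordinate(agent) -> Optional[Coordinate]:
    """Return the agent's location, preferring cell.coordinate over the legacy pos."""
    cell = getattr(agent, "cell", None)
    if cell is not None:
        return tuple(cell.coordinate)
    return getattr(agent, "pos", None)


agent = StubCellAgent(unique_id=1, cell=StubCell(coordinate=(3, 4)))
print(agent.pos)                # None: the legacy attribute carries no position
print(agent_coordinate(agent))  # (3, 4)
```

Serializing agent.pos directly would store None for every agent, which matches the missing position data listed under "What actually happened".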

Implementation

Core Changes:

  1. Custom Serialization (_serialize_state)

    • Serialize agent positions using agent.cell.coordinate instead of agent.pos
    • Save grid state as coordinate → agent ID mapping
    • Properly serialize DataCollector state
    • Store random number generator state
    • Save agent ID counter for consistent restoration
  2. Custom Deserialization (_deserialize_state)

    • Clear all Mesa 3.x agent tracking dictionaries before restoration to prevent duplicates
    • Recreate agents with correct cell relationships via agent.cell = cell
    • Restore grid structure using Mesa 3.x's _cells dictionary
    • Properly restore DataCollector data without double-counting
    • Restore agent ID counter for consistent ID generation
  3. Visualization Migration

    • Old (removed in Mesa 3.x): Used mesa.visualization.ModularServer with manual component setup:
        server = mesa.visualization.ModularServer(
            model_cls=CacheableSchelling,
            visualization_elements=[canvas_element, happy_chart, ...],
            name="Schelling Segregation Model",
            model_params=model_params,
        )
    • New (Mesa 3.x): Migrated to Solara-based SolaraViz with the component factory pattern:
        from mesa.visualization import SolaraViz, make_plot_component, make_space_component

        # Build the grid and chart components from factory helpers
        space_component = make_space_component(agent_portrayal)
        happy_chart = make_plot_component("happy")

        Page = SolaraViz(
            model,
            components=[space_component, happy_chart, get_happy_agents, get_cache_file_status],
            model_params=model_params,
            name="Schelling Segregation Model (Cacheable)",
        )
    • Use AgentPortrayalStyle for consistent agent rendering
    • Added cache status display showing mode and file size
    • Simplified UI with clear record/replay controls
  4. Error Handling

    • Added try-except for the IndexError raised when the grid is full (the agent cannot move)
    • Handle missing cache files gracefully by falling back to record mode
    • Added clear, optional logging for debugging (verbose parameter)
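
The serialization and deserialization steps listed under Core Changes can be sketched roughly as follows. This is not the code from cacheablemodel.py: StubAgent, StubModel, and the snapshot layout are hypothetical stand-ins chosen so the example runs with the standard library only. It shows the coordinate → agent ID mapping, the saved ID counter and RNG state, and the clearing of tracking structures before restoration.

```python
import pickle
import random
from dataclasses import dataclass, field


@dataclass
class StubAgent:
    unique_id: int
    agent_type: int
    coordinate: tuple  # stands in for agent.cell.coordinate


@dataclass
class StubModel:
    """Hypothetical model with two tracking structures, loosely mirroring Mesa 3.x."""
    agents: dict = field(default_factory=dict)          # unique_id -> StubAgent
    agents_by_type: dict = field(default_factory=dict)  # agent_type -> set of ids
    next_id: int = 0
    rng: random.Random = field(default_factory=lambda: random.Random(42))

    def add_agent(self, agent_type, coordinate):
        agent = StubAgent(self.next_id, agent_type, coordinate)
        self.agents[agent.unique_id] = agent
        self.agents_by_type.setdefault(agent_type, set()).add(agent.unique_id)
        self.next_id += 1
        return agent


def serialize_state(model):
    """Capture grid occupancy, agent types, the ID counter, and the RNG state."""
    snapshot = {
        "grid": {a.coordinate: a.unique_id for a in model.agents.values()},
        "types": {a.unique_id: a.agent_type for a in model.agents.values()},
        "next_id": model.next_id,
        "rng_state": model.rng.getstate(),
    }
    return pickle.dumps(snapshot)


def deserialize_state(model, payload):
    """Restore a snapshot, clearing every tracking structure first to avoid duplicates."""
    snapshot = pickle.loads(payload)
    model.agents.clear()
    model.agents_by_type.clear()
    for coordinate, unique_id in snapshot["grid"].items():
        agent_type = snapshot["types"][unique_id]
        model.agents[unique_id] = StubAgent(unique_id, agent_type, coordinate)
        model.agents_by_type.setdefault(agent_type, set()).add(unique_id)
    model.next_id = snapshot["next_id"]
    model.rng.setstate(snapshot["rng_state"])


model = StubModel()
model.add_agent(0, (0, 0))
model.add_agent(1, (2, 1))
payload = serialize_state(model)
expected_draw = model.rng.random()  # advances the RNG past the snapshot
model.add_agent(0, (5, 5))          # state drifts after the snapshot
deserialize_state(model, payload)
print(len(model.agents))            # 2
```

After restoration the next random draw equals expected_draw, because the RNG state is rewound together with the agents. Skipping the two clear() calls would leave the third agent in place, which is the kind of duplication described in the bug list.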

Files Modified:

  • cacheablemodel.py - Complete rewrite with Mesa 3.x-compatible serialization
  • run.py - Updated visualization with replay controls and cache status
  • server.py - Mesa 3.x visualization components
  • model.py - Added error handling for full grid
  • README.md - Updated documentation with clearer instructions

Testing

Comprehensive testing performed:

Cache Recording

  • Successfully records simulations with all agent states preserved
  • Cache file creation verified (proper compression and file size)
  • Multiple runs with different parameters tested

Replay Determinism

  • Agent counts match exactly (57 agents → 57 agents)
  • Agent positions match 100% across all steps
  • Happy agent counts match datacollector records
  • Grid state preserved accurately
  • Random number generation consistent
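
A determinism check of this kind could look like the sketch below. It is an assumption about how such a comparison might be structured, not the test code used for this PR: run_walk is a hypothetical toy model, and a second seeded run stands in for a replay from the cache file.

```python
import random


def run_walk(seed, steps, n_agents=5, size=10):
    """Hypothetical toy model: return per-step {agent_id: (x, y)} snapshots."""
    rng = random.Random(seed)
    positions = {i: (rng.randrange(size), rng.randrange(size)) for i in range(n_agents)}
    history = [dict(positions)]
    for _ in range(steps):
        for agent_id in positions:
            positions[agent_id] = (rng.randrange(size), rng.randrange(size))
        history.append(dict(positions))
    return history


def first_mismatch(recorded, replayed):
    """Return the first step index where two histories differ, or None if identical."""
    if len(recorded) != len(replayed):
        return min(len(recorded), len(replayed))
    for step, (a, b) in enumerate(zip(recorded, replayed)):
        if a != b:
            return step
    return None


recorded = run_walk(seed=42, steps=20)
replayed = run_walk(seed=42, steps=20)  # stand-in for reading states back from a cache
print(first_mismatch(recorded, replayed))  # None

# Tampering with one position shows that the check reports the offending step
tampered = [dict(snapshot) for snapshot in recorded]
tampered[3][0] = (-1, -1)
print(first_mismatch(recorded, tampered))  # 3
```

Comparing whole per-step snapshots covers agent counts and positions in one pass; a missing or duplicated agent changes the dictionary and is reported at the step where it first occurs.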

Mesa 3.x Compliance

  • CellAgent positioning works correctly (using cell.coordinate)
  • Grid API uses proper _cells dictionary access
  • Agent tracking dictionaries properly managed
  • AgentSet initialization follows Mesa 3.x patterns
  • DataCollector integration verified

Visualization

  • Grid rendering works without errors
  • Agent colors display correctly (red/blue for types)
  • Charts update properly during replay
  • Cache status display shows correct information
  • UI controls function as expected

Edge Cases

  • Full grid scenario handled (agents can't move)
  • Missing cache file handled gracefully
  • Custom cache file paths work
  • Manual simulation stop tested
  • Multiple record/replay cycles verified
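
The missing-cache fallback can be sketched as a small mode-resolution helper. CacheMode and resolve_mode are hypothetical names introduced for illustration; they are not the mesa-replay API and may not match how run.py implements the fallback.

```python
import tempfile
from enum import Enum
from pathlib import Path


class CacheMode(Enum):
    """Hypothetical mode flag; not the enum used by mesa-replay."""
    RECORD = "record"
    REPLAY = "replay"


def resolve_mode(requested, cache_path):
    """Fall back to RECORD when a replay is requested but no cache file exists."""
    if requested is CacheMode.REPLAY and not Path(cache_path).is_file():
        return CacheMode.RECORD
    return requested


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = Path(tmp_dir) / "run.cache"
    mode_without_file = resolve_mode(CacheMode.REPLAY, cache_path)
    cache_path.write_bytes(b"placeholder")  # simulate a previously recorded run
    mode_with_file = resolve_mode(CacheMode.REPLAY, cache_path)

print(mode_without_file)  # CacheMode.RECORD
print(mode_with_file)     # CacheMode.REPLAY
```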

Test Results:

All agent counts match: Pass
All positions match: Pass
All happy counts match: Pass
DataCollector synchronized: Pass
Replay determinism: 100%

No Regressions:

  • Verified that base Schelling model tests pass
  • Confirmed other mesa-examples tests unaffected
  • Test suite shows no new failures related to these changes

Additional Notes

Technical Decisions:

  1. Why custom serialization? Mesa 3.x's internal structures (WeakKeyDictionary for _agents, CellAgent architecture) don't serialize well with default dill serialization. Custom implementation ensures reliability.

  2. Agent dict clearing: Mesa 3.x maintains multiple agent tracking dictionaries. All must be cleared before deserialization to prevent agent duplication.

  3. CellAgent positioning: Mesa 3.x CellAgents have pos = None by design. They use cell.coordinate instead, which is what we serialize.

  4. Incremental cache writing: In RECORD mode the cache is written after each step so that a partial run persists; the complete cache is written again when the simulation finishes.
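
Decision 4 can be sketched as below, assuming a gzip-compressed pickle as the on-disk format. StepCache is a hypothetical class for illustration; the actual file format and write path in cacheablemodel.py are not shown in this description and may differ.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


class StepCache:
    """Hypothetical recorder that rewrites a compressed cache file after each step."""

    def __init__(self, path):
        self.path = Path(path)
        self.states = []

    def record(self, state):
        """Append one step's state and persist everything recorded so far."""
        self.states.append(state)
        self.flush()

    def flush(self):
        # Write to a temporary file and rename it, so an interrupted run
        # never leaves a half-written cache behind.
        tmp_path = self.path.with_name(self.path.name + ".tmp")
        with gzip.open(tmp_path, "wb") as handle:
            pickle.dump(self.states, handle)
        tmp_path.replace(self.path)

    @staticmethod
    def load(path):
        with gzip.open(path, "rb") as handle:
            return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp_dir:
    cache = StepCache(Path(tmp_dir) / "run.cache")
    for step in range(3):
        cache.record({"step": step, "happy": step * 2})
    restored = StepCache.load(cache.path)

print(len(restored))  # 3
```

Rewriting the whole file on every step costs more as the run grows, which is one reason a cache_step_rate control (listed under Future Improvements) could help for large simulations.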

Compatibility:

  • Tested with Mesa 3.x (latest)
  • Python 3.10+ required (for mesa-replay)
  • No breaking changes to mesa-replay API
  • Follows Mesa example standards

Future Improvements:

  • Could add cache_step_rate parameter to UI for large simulations
  • Could add cache compression options
  • Could add replay speed controls

Documentation:

  • README simplified with clear record/replay instructions
  • Docstrings follow Mesa standards
  • No verbose technical explanations

Fixes mesa#289

- Implement custom serialization/deserialization for CacheableModel to handle Mesa 3.x CellAgent positioning
- Fix agent restoration by properly clearing all agent tracking dictionaries
- Use cell.coordinate instead of pos for CellAgent position serialization
- Update visualization to use Mesa 3.x patterns (make_space_component, make_plot_component)
- Add cache status display to UI
- Handle IndexError when no empty cells available for agent movement
- Ensure deterministic replay of model states
- Simplify docstrings and comments to be more natural and concise
- Follow Mesa documentation standards
- Make error messages more user-friendly
- Streamline README with clearer, less verbose instructions
- Remove overly technical explanations

Copilot AI review requested due to automatic review settings on February 12, 2026 at 12:29
Copilot AI left a comment

Pull request overview

Updates the caching_and_replay example to work on Mesa 3.x by replacing deprecated visualization APIs and adding custom cache (de)serialization compatible with CellAgents and Mesa’s new internal agent tracking.

Changes:

  • Migrates the example UI from ModularServer to Solara (SolaraViz, make_space_component, make_plot_component) and adds cache/replay status UI.
  • Reworks caching logic with custom _serialize_state / _deserialize_state to store CellAgent positions via cell.coordinate and avoid Mesa 3.x serialization pitfalls.
  • Adds small robustness improvements (e.g., handling “full grid” moves) and updates README instructions.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| examples/caching_and_replay/cacheablemodel.py | Implements custom serialization/deserialization and record/replay orchestration for Mesa 3.x. |
| examples/caching_and_replay/run.py | New SolaraViz-based app for the cacheable model + cache status UI. |
| examples/caching_and_replay/server.py | Migrates the non-cacheable visualization components to Solara. |
| examples/caching_and_replay/model.py | Adjusts agent movement to handle “no empty cell” cases. |
| examples/caching_and_replay/README.md | Updates run instructions and usage notes for the Solara-based app. |

Jayantparashar10 and others added 2 commits February 12, 2026 19:28
- Combine nested if statements for better readability
- Add noqa comment for pickle usage in cache serialization
- Use contextlib.suppress for cleaner exception handling
- Remove unnecessary imports and optimize code organization
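
The contextlib.suppress change combined with the full-grid handling might look like the following sketch. move_to_empty and its arguments are hypothetical; the real move logic lives in model.py and is not reproduced here. It relies on random.Random.choice raising IndexError for an empty sequence.

```python
import contextlib
import random


def move_to_empty(agent_positions, agent_id, empty_cells, rng):
    """Move an agent to a random empty cell, or leave it in place if none exist."""
    with contextlib.suppress(IndexError):
        target = rng.choice(empty_cells)  # raises IndexError when the grid is full
        empty_cells.remove(target)
        empty_cells.append(agent_positions[agent_id])  # the old cell becomes empty
        agent_positions[agent_id] = target
    return agent_positions[agent_id]


rng = random.Random(0)
positions = {1: (0, 0)}

stayed_at = move_to_empty(positions, 1, [], rng)  # full grid: no move
free_cells = [(1, 1)]
moved_to = move_to_empty(positions, 1, free_cells, rng)

print(stayed_at)   # (0, 0)
print(moved_to)    # (1, 1)
print(free_cells)  # [(0, 0)]
```

Because the suppressed block exits at the failing choice call, none of the bookkeeping lines run when the grid is full, so the agent and the empty-cell list stay consistent.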
@Jayantparashar10
Contributor Author

Hi @EwoutH , could you please take a look at this PR when you have a chance?
It fixes the caching_and_replay example for Mesa 3.x compatibility (fixes #289).

Development

Successfully merging this pull request may close these issues.

caching_and_replay example is broken/needs to be updated to work with Mesa 3.0
