@BenjaminCharmes BenjaminCharmes commented Dec 12, 2025

Closes #1469

Linked to Add prefers_async flag to insitu blocks #76

Summary

Implements asynchronous data block processing using APScheduler for computationally intensive blocks, with a unified task management system for all async operations.

Changes

Backend

Task Management Refactoring:

  • Unified Task model replacing separate BlockTask and ExportTask models
  • Single tasks MongoDB collection with type field to distinguish export vs block processing tasks
  • Consolidated status enums - TaskStatus replaces ExportStatus and BlockProcessingStatus
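
For reviewers, a minimal sketch of the unified shape described above, using only the standard library. The field names (`task_id`, `created_at`), the enum values, and the `tasks_of_type` helper are assumptions for illustration and not the actual definitions in `pydatalab/models/tasks.py`; only the `type` field and the four status names come from this description.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class TaskType(str, Enum):
    # Hypothetical values; the PR only states that a `type` field exists.
    EXPORT = "export"
    BLOCK_PROCESSING = "block_processing"


class TaskStatus(str, Enum):
    # Status names taken from the PR description; string values are assumed.
    PENDING = "pending"
    PROCESSING = "processing"
    READY = "ready"
    ERROR = "error"


@dataclass
class Task:
    task_id: str
    type: TaskType
    status: TaskStatus = TaskStatus.PENDING
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def tasks_of_type(tasks, task_type):
    """Return only tasks of one type, mimicking a collection query filtered on `type`."""
    return [t for t in tasks if t.type == task_type]


tasks = [
    Task("t1", TaskType.EXPORT),
    Task("t2", TaskType.BLOCK_PROCESSING),
    Task("t3", TaskType.BLOCK_PROCESSING, status=TaskStatus.READY),
]
block_tasks = tasks_of_type(tasks, TaskType.BLOCK_PROCESSING)
```

Filtering on `type` at query time is what lets both kinds of task share one collection without an export poller ever seeing a block-processing task.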

Async Block Processing:

  • Added Task model to track async processing status (PENDING/PROCESSING/READY/ERROR)
  • Added TaskStage model to log timestamped processing stages with severity levels (info/warning/error)
  • Modified update_block() to support async processing via prefers_async attribute
  • Added /blocks/<task_id>/status endpoint for status polling, including stage history
  • Reused existing export_scheduler infrastructure for background job execution
  • Added AsyncCanaryBlock for testing async processing workflow
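
A rough sketch of how a background job might record stages and move a task through its statuses. To keep it self-contained it uses `concurrent.futures.ThreadPoolExecutor` as a stand-in for the APScheduler-based `export_scheduler`; the `log_stage` and `run_block_task` names and the dataclass fields are assumptions, not the code in this PR.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TaskStage:
    message: str
    level: str = "info"  # one of "info", "warning", "error"
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Task:
    task_id: str
    status: str = "PENDING"
    stages: list = field(default_factory=list)

    def log_stage(self, message, level="info"):
        # Each stage is timestamped when it is appended.
        self.stages.append(TaskStage(message, level))


def run_block_task(task, process):
    """Run `process` and record the outcome on the task (hypothetical helper)."""
    task.status = "PROCESSING"
    task.log_stage("starting")
    try:
        process()
        task.log_stage("completion")
        task.status = "READY"
    except Exception as exc:
        task.log_stage(f"failed: {exc}", level="error")
        task.status = "ERROR"
    return task


def failing_process():
    raise ValueError("no data")


# The thread pool stands in for the scheduler that runs jobs in the background.
with ThreadPoolExecutor(max_workers=2) as pool:
    ok = pool.submit(run_block_task, Task("ok"), lambda: None).result()
    bad = pool.submit(run_block_task, Task("bad"), failing_process).result()
```

Keeping the stage history on the task itself means a status endpoint can return it alongside the status in a single lookup.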

Frontend

  • Added pollBlockStatus() to poll task status every 2 seconds
  • Modified updateBlockFromServer() to handle async responses
  • Added real-time progress display showing current processing stage in UI
  • Updated insitu block components (NMR & UV-Vis) to trigger async processing only when all required data is present
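
The polling behaviour can be illustrated in Python (the real `pollBlockStatus()` lives in the frontend JavaScript and is not reproduced here). In this sketch `fetch_status` is a hypothetical callable standing in for a request to the status endpoint, and `max_polls` is an assumed safety limit that the PR does not mention.

```python
import time


def poll_task_status(fetch_status, interval=2.0, max_polls=30, sleep=time.sleep):
    """Call `fetch_status` every `interval` seconds until the task is terminal."""
    for _ in range(max_polls):
        status = fetch_status()
        if status["status"] in ("READY", "ERROR"):
            return status
        sleep(interval)
    raise TimeoutError("task did not finish within the polling budget")


# Canned responses stand in for successive replies from the status endpoint.
responses = iter([
    {"status": "PENDING"},
    {"status": "PROCESSING", "stage": "loading"},
    {"status": "READY"},
])

# A no-op sleep keeps the example instantaneous.
result = poll_task_status(lambda: next(responses), sleep=lambda _seconds: None)
```

Injecting `sleep` makes the loop testable without waiting for real two-second intervals.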

Implementation Details

  • Blocks set prefers_async = True to opt in to async processing
  • UI sends trigger_async: true in event_data when ready to process
  • Processing stages logged at each step (starting, loading, processing, generating viz, saving, completion)
  • bokeh_plot_data is regenerated on-demand via block.to_web() in status endpoint
  • Async tasks are created only once all folders/parameters are selected
  • Stage information is displayed as a blue info alert during processing and cleared automatically on completion
  • Task type filtering ensures export and block processing tasks remain isolated despite sharing infrastructure
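
The gating logic in the points above could look roughly like the following. The `should_run_async` helper, the stand-in block classes, and the field names in `required` are all hypothetical; only `prefers_async` and `trigger_async` come from this PR.

```python
def should_run_async(block, event_data, block_data, required_fields):
    """Decide whether an update should be handed to the background scheduler."""
    if not getattr(block, "prefers_async", False):
        return False  # the block did not opt in
    if not event_data.get("trigger_async", False):
        return False  # the UI has not asked for processing yet
    # Only create a task once every required selection is present.
    return all(block_data.get(name) for name in required_fields)


class FakeSyncBlock:
    prefers_async = False


class FakeAsyncBlock:
    prefers_async = True


# Hypothetical field names for an insitu-style block.
required = ["folder_name", "time_series_file"]
complete = {"folder_name": "run_01", "time_series_file": "echem.txt"}
partial = {"folder_name": "run_01"}

decisions = {
    "sync block": should_run_async(FakeSyncBlock(), {"trigger_async": True}, complete, required),
    "no trigger": should_run_async(FakeAsyncBlock(), {}, complete, required),
    "missing data": should_run_async(FakeAsyncBlock(), {"trigger_async": True}, partial, required),
    "ready": should_run_async(FakeAsyncBlock(), {"trigger_async": True}, complete, required),
}
```

Only the last case would create a task; every other update falls through to the existing synchronous path.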

BenjaminCharmes and others added 16 commits October 28, 2025 10:30
First attempt export as ELN
Use APScheduler

Co-authored-by: Matthew Evans <git@ml-evs.science>
Co-authored-by: Diana Aliabieva <dianaaliabieva@Dianas-MacBook-Pro.local>
First attempt to sample ELN export button
Add itemGraph in ExportModal and depth control for related items

Use slider to control itemGraph depth

Add cypress component tests

Add asynchronous data block processing with APScheduler


codecov bot commented Dec 12, 2025

Codecov Report

❌ Patch coverage is 59.18919% with 151 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.77%. Comparing base (22f70a4) to head (c1af4df).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pydatalab/src/pydatalab/routes/v0_1/blocks.py | 22.61% | 65 Missing ⚠️ |
| pydatalab/src/pydatalab/routes/v0_1/export.py | 56.00% | 44 Missing ⚠️ |
| pydatalab/src/pydatalab/export_utils.py | 76.13% | 21 Missing ⚠️ |
| pydatalab/src/pydatalab/scheduler.py | 44.11% | 19 Missing ⚠️ |
| pydatalab/src/pydatalab/apps/__init__.py | 50.00% | 1 Missing ⚠️ |
| pydatalab/src/pydatalab/routes/v0_1/graphs.py | 95.45% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1491      +/-   ##
==========================================
- Coverage   79.13%   77.77%   -1.36%     
==========================================
  Files          70       74       +4     
  Lines        5219     5567     +348     
==========================================
+ Hits         4130     4330     +200     
- Misses       1089     1237     +148     

| Files with missing lines | Coverage Δ |
|---|---|
| pydatalab/src/pydatalab/apps/_canary/__init__.py | 100.00% <100.00%> (ø) |
| pydatalab/src/pydatalab/models/tasks.py | 100.00% <100.00%> (ø) |
| pydatalab/src/pydatalab/mongo.py | 82.27% <100.00%> (+1.45%) ⬆️ |
| pydatalab/src/pydatalab/routes/v0_1/__init__.py | 100.00% <100.00%> (ø) |
| pydatalab/src/pydatalab/apps/__init__.py | 70.66% <50.00%> (-0.96%) ⬇️ |
| pydatalab/src/pydatalab/routes/v0_1/graphs.py | 97.70% <95.45%> (-1.01%) ⬇️ |
| pydatalab/src/pydatalab/scheduler.py | 44.11% <44.11%> (ø) |
| pydatalab/src/pydatalab/export_utils.py | 76.13% <76.13%> (ø) |
| pydatalab/src/pydatalab/routes/v0_1/export.py | 56.00% <56.00%> (ø) |
| pydatalab/src/pydatalab/routes/v0_1/blocks.py | 43.63% <22.61%> (-21.92%) ⬇️ |


cypress bot commented Dec 12, 2025

datalab    Run #4342

| Run Properties | Passed #4342 |
|---|---|
| Project | datalab |
| Branch Review | bc/async-block |
| Run status | Passed #4342 |
| Run duration | 08m 48s |
| Commit | ebcaf22d05: Merge c1af4dfd3df9caa8e8baf623e8bf2c8ab553584f into 724368141faf51559b14f8a4421c... |
| Committer | Ben Charmes |

| Test results | Count |
|---|---|
| Failures | 0 |
| Flaky | 0 |
| Pending (skipped via .skip annotation) | 0 |
| Skipped (failure in a mocha hook) | 0 |
| Passing | 458 |

@ml-evs ml-evs left a comment

Excited to try this @BenjaminCharmes! Let me know when you think it's ready to play around with! I would also suggest extending the CanaryBlock to make an async version we can use in testing.

@BenjaminCharmes BenjaminCharmes marked this pull request as ready for review December 15, 2025 16:50
@@ -0,0 +1,43 @@
from datetime import datetime, timezone

Could we combine the block tasks and export task models and MongoDB collection?

@ml-evs ml-evs added this to the v0.7.x milestone Dec 16, 2025
Development

Successfully merging this pull request may close these issues.

Asynchronous data block processing

4 participants