feat: triggers suspended mode #7297

hugocasa · 2025-12-04T13:37:20Z · edited by rubenfiszel

No description provided.

cloudflare-workers-and-pages · 2025-12-04T13:44:34Z

Deploying windmill with Cloudflare Pages

Latest commit:	`d432b5a`
Status:	🚫 Build failed.

claude · 2025-12-05T14:39:04Z

Claude finished @hugocasa's task

PR Review: Triggers Suspended Mode ✅

I've completed a comprehensive review of this PR. This is a substantial and well-architected feature that adds a "suspended mode" to triggers, allowing jobs to be queued without automatic execution. Here's my detailed analysis:

🎯 Overview

This PR introduces a third state for triggers beyond enabled/disabled: suspended mode. When a trigger is suspended, it continues accepting payloads and creating jobs, but those jobs are marked with special "unassigned" job kinds (UnassignedScript, UnassignedFlow, UnassignedSinglestepFlow) and won't execute automatically. Users can then review and either resume or cancel these jobs.

Scope: ~5,200 additions, ~2,300 deletions across 100 files
Impact: All trigger types (HTTP, Websocket, Kafka, NATS, MQTT, SQS, GCP, Postgres, Email)
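As a rough illustration of the three-state model described in the overview, the sketch below maps a trigger mode and a target type to the kind of job that would be created. The `TriggerMode` and `Unassigned*` names come from the PR; `Runnable`, `job_kind_for`, the reduced `JobKind` enum, and the assumption that a disabled trigger creates no job at all are hypothetical simplifications, not Windmill's actual code.

```rust
/// Trigger state. The PR replaces the old boolean `enabled` with this enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TriggerMode {
    Enabled,
    Disabled,
    Suspended,
}

/// Hypothetical stand-in for what a trigger points at.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Runnable {
    Script,
    Flow,
    SingleStepFlow,
}

/// Reduced, illustrative set of job kinds. The `Unassigned*` variants are
/// the ones named in the PR; the others are assumed counterparts.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobKind {
    Script,
    Flow,
    SingleStepFlow,
    UnassignedScript,
    UnassignedFlow,
    UnassignedSinglestepFlow,
}

/// Returns the kind of job an incoming event would produce, or `None`
/// when the trigger is disabled (assumption: disabled triggers drop events).
pub fn job_kind_for(mode: TriggerMode, runnable: Runnable) -> Option<JobKind> {
    match (mode, runnable) {
        (TriggerMode::Disabled, _) => None,
        (TriggerMode::Enabled, Runnable::Script) => Some(JobKind::Script),
        (TriggerMode::Enabled, Runnable::Flow) => Some(JobKind::Flow),
        (TriggerMode::Enabled, Runnable::SingleStepFlow) => Some(JobKind::SingleStepFlow),
        // Suspended: the job is still created, but with a kind that is not auto-executed.
        (TriggerMode::Suspended, Runnable::Script) => Some(JobKind::UnassignedScript),
        (TriggerMode::Suspended, Runnable::Flow) => Some(JobKind::UnassignedFlow),
        (TriggerMode::Suspended, Runnable::SingleStepFlow) => {
            Some(JobKind::UnassignedSinglestepFlow)
        }
    }
}
```

The point of the sketch is only that suspension changes the job kind rather than preventing job creation.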

✅ Strengths

1. Clean Database Migration

The migration from boolean enabled to TRIGGER_MODE enum is well-structured:

Creates enum type correctly
Migrates data safely (CASE WHEN enabled = true THEN 'enabled' ELSE 'disabled')
Includes proper down migration for rollback
Covers all 9 trigger tables consistently

2. Good Abstraction with TriggerCrud Trait

The TriggerCrud trait provides excellent abstraction:

// backend/windmill-api/src/triggers/handler.rs:209-276
async fn set_trigger_mode(&self, ...) -> Result<bool>

Centralizes mode toggling logic
Handles both server-state and non-server-state triggers
Includes extensibility hook via set_trigger_mode_extra_action

3. Backward Compatibility

The deprecated enabled field handling shows thoughtful backward compatibility:

// backend/windmill-api/src/triggers/mod.rs:126-142
pub fn mode(&self) -> &TriggerMode {
    self.mode.as_ref().unwrap_or(
        #[allow(deprecated)]
        if self.enabled.unwrap_or(true) { &TriggerMode::Enabled }
        else { &TriggerMode::Disabled }
    )
}
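The fallback order in that snippet can be checked in isolation. Below is a minimal standalone sketch, assuming both fields arrive as `Option`s as in the quoted code; `resolve_mode` is a hypothetical free function written for illustration and is not part of the PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TriggerMode {
    Enabled,
    Disabled,
    Suspended,
}

/// Resolves the effective mode from the new `mode` field and the
/// deprecated `enabled` flag. An explicit `mode` always wins; otherwise
/// `enabled` decides, defaulting to enabled when both are absent.
pub fn resolve_mode(mode: Option<TriggerMode>, enabled: Option<bool>) -> TriggerMode {
    match (mode, enabled) {
        (Some(m), _) => m,
        (None, Some(false)) => TriggerMode::Disabled,
        (None, Some(true)) | (None, None) => TriggerMode::Enabled,
    }
}
```

Under this reading, an older client that only sends `enabled: false` still gets a disabled trigger, while a client that sends both fields has `enabled` ignored.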

4. Smart Job Reassignment Logic

The resume_suspended_trigger_jobs function intelligently handles two scenarios (backend/windmill-api/src/triggers/global_handler.rs:169-255):

Jobs created AFTER trigger edit: Delete and recreate with new config
Jobs created BEFORE trigger edit: Simple update to unsuspend
This prevents stale configuration from being executed.

5. Listener Architecture

The listener system correctly fetches both enabled AND suspended triggers:

// backend/windmill-api/src/triggers/listener.rs:81
.and_where("(mode = 'enabled'::TRIGGER_MODE OR mode = 'suspended'::TRIGGER_MODE)")

6. OpenAPI Updated

Good practice - the OpenAPI spec was properly updated with new endpoints and the TriggerMode enum.

🐛 Issues Found

CRITICAL: Potential Data Loss in Down Migration

Location: backend/migrations/20251127104100_add_suspended_mode_to_triggers.down.sql:6

Issue: The down migration converts suspended triggers to disabled, potentially losing important state. See inline comment for details and recommendations.

Impact: If users roll back while triggers are in suspended mode with queued jobs:

Suspended triggers become disabled
Unassigned jobs remain in limbo (no cleanup)
Users lose track of which triggers were suspended
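To make the loss concrete, the two `CASE` expressions from the up and down migrations can be modeled as pure functions. This is only a sketch: `migrate_up`, `migrate_down`, and `survives_round_trip` are hypothetical names, and NULL handling in the real SQL is ignored.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TriggerMode {
    Enabled,
    Disabled,
    Suspended,
}

/// Models the up migration:
/// CASE WHEN enabled = true THEN 'enabled' ELSE 'disabled'
pub fn migrate_up(enabled: bool) -> TriggerMode {
    if enabled {
        TriggerMode::Enabled
    } else {
        TriggerMode::Disabled
    }
}

/// Models the down migration:
/// CASE WHEN mode = 'enabled' THEN true ELSE false
pub fn migrate_down(mode: TriggerMode) -> bool {
    mode == TriggerMode::Enabled
}

/// True when rolling back and then re-applying the migration
/// yields the mode the trigger started with.
pub fn survives_round_trip(mode: TriggerMode) -> bool {
    migrate_up(migrate_down(mode)) == mode
}
```

Only `Suspended` fails the round trip: it collapses to `false` on the way down and comes back as `Disabled`.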

HIGH: Missing Error Context in Transaction

Location: backend/windmill-api/src/triggers/global_handler.rs:228-254

Issue: Multiple DELETE operations in job cleanup lack error context. See inline comment for recommendations.

Impact: If a delete fails midway, debugging will be difficult without proper error messages.

MEDIUM: SQL Injection Defense

Locations:

backend/windmill-api/src/triggers/global_handler.rs:51
Similar patterns in handler.rs and listener.rs

Issue: Table names constructed from enum values. While currently safe, defense-in-depth would improve security. See inline comment for explicit validation pattern.

MEDIUM: Inconsistent Error Reporting

Location: backend/windmill-api/src/triggers/listener.rs:456-465

Issue: The disable_with_error function has inconsistent error reporting between trigger mode (uses report_critical_error) and capture mode (uses tracing::error!).

Impact: Critical errors in capture mode might not alert operators properly.

Recommendation: Unify error reporting to use report_critical_error for both paths.

LOW: Potential Race Condition in Mode Toggle

Location: backend/windmill-api/src/triggers/handler.rs:213-276

Issue: set_trigger_mode updates trigger mode and then calls set_trigger_mode_extra_action, but there's a window where the mode is updated but extra action hasn't run yet.

Impact: If a listener picks up the trigger between these two operations, it might process with inconsistent state.

Recommendation:

Consider making set_trigger_mode_extra_action run BEFORE the mode update
OR wrap both in a CTE/single query if possible
Document the ordering requirement

LOW: Missing Index Consideration

Location: Database migration

Issue: Queries frequently filter by mode but no indexes added. See inline comment on listener.rs:81 for recommendation.

🎨 Code Quality Observations

Good Practices

✅ Explicit column selection in queries (avoiding SELECT *)
✅ Using prepared statements with bind parameters
✅ Consistent error types via Result<T, Error>
✅ Comprehensive logging with tracing::
✅ Feature flags for enterprise/optional features

Minor Improvements

Verbose logging in production: Consider changing tracing::info!("SQL: {}", sql) to debug! level
Magic number (listener.rs:194):

tokio::time::sleep(tokio::time::Duration::from_secs(5)).await;

Extract to constant: const PING_INTERVAL_SECS: u64 = 5;

Duplicate code: The pattern of checking trigger mode appears in multiple places - consider a shared helper (function or constant)
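The last two suggestions could look roughly like the following. This is a sketch under assumptions: the method names `is_listening` and `auto_executes` and the `PING_INTERVAL` constant are hypothetical, and the review only proposes `PING_INTERVAL_SECS`.

```rust
use std::time::Duration;

/// Named replacement for the literal `5` passed to `Duration::from_secs`.
pub const PING_INTERVAL_SECS: u64 = 5;
pub const PING_INTERVAL: Duration = Duration::from_secs(PING_INTERVAL_SECS);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TriggerMode {
    Enabled,
    Disabled,
    Suspended,
}

impl TriggerMode {
    /// Mirrors the listener filter quoted earlier
    /// (`mode = 'enabled' OR mode = 'suspended'`): both modes keep
    /// receiving events.
    pub const fn is_listening(self) -> bool {
        matches!(self, TriggerMode::Enabled | TriggerMode::Suspended)
    }

    /// Only enabled triggers produce jobs that run without manual review.
    pub const fn auto_executes(self) -> bool {
        matches!(self, TriggerMode::Enabled)
    }
}
```

Call sites would then read `if mode.is_listening() { ... }` instead of repeating the two-variant comparison.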

🔒 Security Considerations

Access Control: ✅ Properly checks scopes before mode changes
Job Cancellation: ✅ Uses existing cancel_jobs which includes proper authorization
SQL Injection: ⚠️ Addressed in inline comments - enum-based table names are mostly safe but could be more defensive

🚀 Performance Considerations

Batch Operations: The resume_suspended_trigger_jobs processes jobs one at a time. For workspaces with hundreds of suspended jobs, consider batch updates.
Lock Contention: Lines 645-668 in listener.rs use row-level locks. With many triggers, this could cause contention.
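One way to picture the batching suggestion is to group job IDs into fixed-size chunks so that each chunk could be handled by a single statement instead of one statement per job. The sketch below is hypothetical: `batch_job_ids` is not a Windmill function, and plain `u64` IDs stand in for the real job identifiers.

```rust
/// Splits job IDs into batches of at most `batch_size` elements,
/// preserving order. A `batch_size` of 0 is treated as 1 so the
/// function never panics.
pub fn batch_job_ids(ids: &[u64], batch_size: usize) -> Vec<Vec<u64>> {
    ids.chunks(batch_size.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

Each inner vector could then be bound as one array parameter, so a few hundred suspended jobs would need a handful of round trips rather than one per job.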

📝 Documentation & Testing

Needs Documentation

Migration guide for users: How to handle existing enabled/disabled triggers
API documentation for new endpoints (beyond OpenAPI)
Behavioral documentation: What happens to suspended jobs when trigger is deleted?

Test Coverage Recommendations:

Suspended → Enabled with queued jobs
Suspended → Disabled with queued jobs
Trigger edited while jobs suspended (reassignment logic)
Concurrent mode changes
Race conditions in listener startup
Down migration with suspended triggers in database

📋 Testing Instructions

To manually test this feature through the app interface:

Navigate to any trigger page (e.g., HTTP Routes under Triggers section). Create or select an existing trigger. In the trigger editor toolbar, you'll see a toggle for enable/disable along with a dropdown menu. Click the dropdown and select "Suspend job execution" to put the trigger in suspended mode. The toggle will change to show three states: Disabled, Suspended, and Enabled.

Once suspended, trigger the endpoint/event that would normally execute the trigger. Go to the Runs page and add a filter for job status. You should see jobs with "Unassigned" status appearing - these are suspended jobs that won't execute automatically. Click on any suspended job to view its details. Notice the job shows as queued but won't start execution.

Back on the trigger page, if there are suspended jobs, you'll see an alert/modal appear when trying to change the trigger mode. The modal allows you to review all suspended jobs and choose to either resume them (which will execute them with the current trigger configuration) or cancel them. You can select individual jobs or batch process all suspended jobs. After resuming, jobs should transition to normal execution. After canceling, jobs should be marked as canceled in the runs page.

Test edge cases: edit the trigger configuration while jobs are suspended, then resume - jobs should use the NEW configuration. Delete a trigger with suspended jobs - verify appropriate cleanup. Switch between suspended and enabled/disabled modes multiple times to ensure state transitions work correctly.

Summary

This is a well-designed and thoughtfully implemented feature with good architecture and backward compatibility. The main concerns are:

Critical: Down migration data loss (recommend fixing before merge)
High: Transaction cleanup could be more robust
Medium: SQL injection defense could be stronger
Medium: Inconsistent error reporting between trigger/capture modes
Low: Minor race conditions and performance optimizations

The code follows Windmill's patterns well, maintains consistency across trigger types, and provides a valuable feature for users who need manual review before job execution.

Recommendation: Fix the critical down migration issue and high-priority items before merge. Medium and low priority items can be addressed in follow-up PRs if needed.

Inline comments added: 5 specific code locations with detailed recommendations

claude · 2025-12-05T14:43:54Z

backend/migrations/20251127104100_add_suspended_mode_to_triggers.down.sql

+ALTER TABLE email_trigger
+ADD COLUMN enabled BOOLEAN;
+
+UPDATE email_trigger SET enabled = CASE WHEN mode = 'enabled'::TRIGGER_MODE THEN true ELSE false END;


Critical Issue: Potential Data Loss

When rolling back, triggers in suspended mode will be converted to disabled, but any unassigned jobs will remain orphaned in the database. This could lead to:

Jobs stuck in limbo that users can't see or manage

Loss of user intent (which triggers were suspended vs disabled)

Recommendation: Add cleanup logic before the migration:

-- Clean up unassigned jobs before reverting mode column
-- Option A: Cancel all unassigned jobs
DELETE FROM v2_job
WHERE kind IN ('unassigned_script', 'unassigned_flow', 'unassigned_singlestepflow');

-- OR Option B: Auto-execute them by converting to regular jobs
UPDATE v2_job
SET kind = CASE
    WHEN kind = 'unassigned_script' THEN 'script'::JOB_KIND
    WHEN kind = 'unassigned_flow' THEN 'flow'::JOB_KIND
    WHEN kind = 'unassigned_singlestepflow' THEN 'singlescriptflow'::JOB_KIND
    ELSE kind
END
WHERE kind IN ('unassigned_script', 'unassigned_flow', 'unassigned_singlestepflow');

Document which option is chosen and why.

claude · 2025-12-05T14:44:03Z

backend/windmill-api/src/triggers/global_handler.rs

+        _ => {}
+    }
+
+    let table_name = format!("{}_trigger", trigger_kind.to_string());


Medium Priority: SQL Injection Defense

While trigger_kind is an enum (which provides safety), constructing table names dynamically could become a vulnerability if the enum handling ever changes.

Recommendation: Add explicit validation:

let table_name = match trigger_kind {
    JobTriggerKind::Http => "http_trigger",
    JobTriggerKind::Kafka => "kafka_trigger",
    JobTriggerKind::Websocket => "websocket_trigger",
    JobTriggerKind::Mqtt => "mqtt_trigger",
    JobTriggerKind::Nats => "nats_trigger",
    JobTriggerKind::Postgres => "postgres_trigger",
    JobTriggerKind::Sqs => "sqs_trigger",
    JobTriggerKind::Gcp => "gcp_trigger",
    JobTriggerKind::Email => "email_trigger",
    JobTriggerKind::Webhook | JobTriggerKind::Schedule | _ => {
        return Err(Error::BadRequest(format!(
            "{} triggers do not support this operation",
            trigger_kind
        )));
    }
};

This provides defense-in-depth and makes supported trigger types explicit.

claude · 2025-12-05T14:44:11Z

backend/windmill-api/src/triggers/global_handler.rs

+            };
+
+            // Delete the unassigned job from all related tables
+            sqlx::query!("DELETE FROM v2_job_queue WHERE id = $1", job.id)


High Priority: Improve Error Context

These cascading deletes clean up orphaned data after job reassignment. While they're in a transaction, consider adding error context for better debugging:

sqlx::query!("DELETE FROM v2_job_queue WHERE id = $1", job.id)
    .execute(&mut *tx)
    .await
    .context("Failed to delete job from queue during reassignment")?;

sqlx::query!("DELETE FROM v2_job_runtime WHERE id = $1", job.id)
    .execute(&mut *tx)
    .await
    .context("Failed to delete job runtime during reassignment")?;

// ... etc

Also consider whether cascade deletes via foreign keys could simplify this (delete only from v2_job and let the cascade handle the rest).

claude · 2025-12-05T14:44:34Z

backend/windmill-api/src/triggers/mod.rs

+}
+
+impl BaseTriggerData {
+    pub fn mode(&self) -> &TriggerMode {


Good Practice: Backward Compatibility

Nice handling of the deprecated enabled field! This ensures older API clients continue to work during the transition period. Consider documenting when this can be removed (e.g., "Remove in v2.0" or after a certain number of releases).

claude · 2025-12-05T14:44:43Z

backend/windmill-api/src/triggers/global_handler.rs

+    for job in jobs {
+        // If job was created before trigger was edited, simply update it to unsuspend
+        // instead of deleting and repushing
+        if job.created_at > trigger.edited_at {


Great Logic: Smart Job Reassignment

Excellent decision to handle two scenarios differently:

Jobs created AFTER trigger edit: Get new config (delete & recreate)

Jobs created BEFORE trigger edit: Keep original config (simple update)

This prevents the confusing scenario where a user edits a trigger and then wonders why old suspended jobs are executing with the new config. The timestamp comparison is a smart way to determine user intent.

claude · 2025-12-05T14:44:53Z

backend/windmill-api/src/triggers/listener.rs

-            "(last_server_ping IS NULL OR last_server_ping < now() - interval '15 seconds')",
-        );
+        sqlb.fields(&fields)
+            .and_where("(mode = 'enabled'::TRIGGER_MODE OR mode = 'suspended'::TRIGGER_MODE)")


Good Practice: Listener Includes Suspended Triggers

Correct implementation - listeners need to pick up both enabled and suspended triggers since suspended mode still processes events (just doesn't auto-execute jobs). This ensures the trigger continues to queue jobs even in suspended state.

Minor optimization consideration: If you have many triggers, consider adding an index on mode column for better query performance:

CREATE INDEX CONCURRENTLY idx_trigger_mode ON {table_name}(workspace_id, mode) WHERE mode IN ('enabled', 'suspended');

ellipsis-dev · 2025-12-05T14:46:28Z

backend/windmill-queue/src/jobs.rs

            JobKind::AIAgent => "jobs.run.ai_agent",
+            JobKind::UnassignedScript => "jobs.run.unassigned_script",
+            JobKind::UnassignedFlow => "jobs.run.unassigned_flow",
+            JobKind::UnassignedSinglestepFlow => "jobs.run.unassigned_singlestepflow",


Typographical note: The enum variant UnassignedSinglestepFlow appears to be using a non-standard compound word format. Consider renaming it to UnassignedSingleStepFlow (and updating the corresponding mapping to "jobs.run.unassigned_single_step_flow") for improved readability and consistency.

dieriba and others added 30 commits November 7, 2025 17:31

first commit

7e5d201

Merge branch 'main' into dieri/triggers-queue-mode

0140022

Merge branch 'main' into dieri/triggers-queue-mode

d98f0e8

base

727c55a

ok

7dc02ab

Merge branch 'main' into dieri/triggers-queue-mode

c39aa69

remove print

6a4c94b

rm packe json

22b5dc2

fix discard

d022ba7

Merge branch 'main' into dieri/triggers-queue-mode

2854290

fix

50f39c5

update .sqlx

ba535b5

update

e637d85

update ref

9af0ad7

jobs update

94faed5

nits

f576e20

update ref

bd3f3fd

big

a4f3b4f

Merge branch 'main' into dieri/triggers-queue-mode

a713151

update sqlx

4989392

fix

596821b

update

453031c

nits and fix

7d91fa2

Merge branch 'main' into dieri/triggers-queue-mode

cac1c91

nits

1f538ce

ok

9912734

sqlx + fix import

62d0150

better

e74e04b

Merge branch 'main' into dieri/triggers-queue-mode

d58e9cd

nit

c1c2588

hugocasa added 5 commits December 2, 2025 11:40

Merge remote-tracking branch 'origin/main' into dieri/triggers-queue-…

07cbe3c

…mode

done for http triggers

cb1b406

feat: better retry/error handler check

0399f0a

generalize to all triggers

0dfefef

Merge remote-tracking branch 'origin/main' into hc/triggers-suspended…

7737cb5

…-mode

hugocasa added 5 commits December 4, 2025 14:50

fix merge conflicts

af98324

fix listener

08acca1

nits

44e21f4

Merge remote-tracking branch 'origin/main' into hc/triggers-suspended…

a816775

…-mode

nit

5166171

hugocasa changed the title from "feat: triggers suspended mode" to "[merge ee first] feat: triggers suspended mode" Dec 5, 2025

hugocasa marked this pull request as ready for review December 5, 2025 14:38

hugocasa requested review from alpetric and rubenfiszel as code owners December 5, 2025 14:38

claude bot reviewed Dec 5, 2025

ellipsis-dev bot reviewed Dec 5, 2025

hugocasa added 3 commits December 8, 2025 10:29

Merge remote-tracking branch 'origin/main' into hc/triggers-suspended…

b729a0e

…-mode

fix merge conflict

16e1c50

update ee ref

d432b5a

hugocasa changed the title from "[merge ee first] feat: triggers suspended mode" to "feat: triggers suspended mode" Dec 8, 2025

rubenfiszel merged commit eb284df into main Dec 8, 2025
8 of 9 checks passed

rubenfiszel deleted the hc/triggers-suspended-mode branch December 8, 2025 16:20

github-actions bot locked and limited conversation to collaborators Dec 8, 2025

4 participants
