Skip to content

fix(reports): add network timeouts so schedules can't hang forever#41250

Open
rusackas wants to merge 2 commits into
masterfrom
fix/40047-report-network-timeouts
Open

fix(reports): add network timeouts so schedules can't hang forever#41250
rusackas wants to merge 2 commits into
masterfrom
fix/40047-report-network-timeouts

Conversation

@rusackas

Copy link
Copy Markdown
Member

SUMMARY

Alert/report execution makes several network calls with no socket timeout, so the underlying Python socket blocks indefinitely if the remote endpoint becomes unreachable. When that happens mid-execution, the report schedule is left in the WORKING state forever — every subsequent scheduler tick then raises ReportSchedulePreviousWorkingError ("Report Schedule is still working, refusing to re-compute"), and manually resetting the state via SQL only causes the next run to wedge the same way.

This matches the symptoms in #40047 exactly: reports stuck in "sending" across all formats (CSV/PNG/PDF), affecting both new and existing reports, with no logs after the run starts, appearing after an environment change (e.g. SMTP host or WEBDRIVER_BASEURL becoming unreachable) without any image rebuild. Because the hang is in a blocking C-level socket read, Celery's soft_time_limit often can't interrupt it cleanly, and the working_timeout sweep only fires on a later tick — so the schedule stays wedged.

Three previously-unbounded calls are now bounded by configurable timeouts:

Call File New config (default)
smtplib.SMTP / SMTP_SSL email send superset/utils/core.py SMTP_TIMEOUT (30s)
urllib chart-data fetch for CSV/dataframe attachments superset/utils/csv.py ALERT_REPORTS_CSV_REQUEST_TIMEOUT (60s)
Selenium driver.get() navigation superset/utils/webdriver.py SCREENSHOT_PAGE_LOAD_WAIT (120s) via set_page_load_timeout

With a finite timeout the failing call now raises instead of hanging; the report state machine transitions the schedule to ERROR, the failure is surfaced/retried, and the worker is freed. The SMTP/CSV timeouts fall back to their defaults for custom configs that predate the new keys. All defaults can be set to None to restore the old unbounded behavior.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A (backend reliability fix).

TESTING INSTRUCTIONS

  • Unit: pytest tests/unit_tests/utils/csv_tests.py tests/unit_tests/utils/webdriver_test.py
  • Integration: pytest tests/integration_tests/email_tests.py -k send_mime
  • Manual: point SMTP_HOST (or WEBDRIVER_BASEURL) at an unroutable address (e.g. a blackhole IP) and trigger a report. Before this change the worker hangs and the schedule sticks in WORKING; after, the call times out, the report moves to ERROR, and the error is logged.

New tests cover: the SMTP timeout being passed (and defaulting when the key is absent), the CSV fetch forwarding the timeout to opener.open, and the Selenium driver applying / skipping set_page_load_timeout.

ADDITIONAL INFORMATION

🤖 Generated with Claude Code

Alert/report execution made several network calls with no socket
timeout, so a Python socket would block indefinitely if the remote
endpoint became unreachable. When that happened mid-execution the report
schedule was left in the WORKING state permanently: every subsequent
tick then hit ReportSchedulePreviousWorkingError ("still working,
refusing to re-compute"), and manually resetting the state only led to
the next run wedging the same way. This explains reports getting stuck
in "sending" across all formats (CSV/PNG/PDF) after an environment
change, with no logs after the run starts.

Three unbounded calls are now bounded by configurable timeouts:

- SMTP send (superset/utils/core.py): smtplib.SMTP/SMTP_SSL get
  SMTP_TIMEOUT (default 30s). Affects every email-based notification.
- Chart data fetch (superset/utils/csv.py): urllib opener.open() gets
  ALERT_REPORTS_CSV_REQUEST_TIMEOUT (default 60s). Used by CSV/dataframe
  report attachments.
- Selenium navigation (superset/utils/webdriver.py): driver.get() is
  bounded by SCREENSHOT_PAGE_LOAD_WAIT (default 120s) via
  set_page_load_timeout, so an unreachable page raises TimeoutException
  instead of blocking.

With a finite timeout the failing call now raises, the state machine
transitions the schedule to ERROR, and the failure is surfaced/retried
instead of silently wedging the worker.

Fixes #40047

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@dosubot dosubot Bot added the alert-reports Namespace | Anything related to the Alert & Reports feature label Jun 19, 2026
@bito-code-review

bito-code-review Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #8e6513

Actionable Suggestions - 0
Review Details
  • Files reviewed - 8 · Commit Range: 479fc68..479fc68
    • superset/commands/report/execute.py
    • superset/config.py
    • superset/utils/core.py
    • superset/utils/csv.py
    • superset/utils/webdriver.py
    • tests/integration_tests/email_tests.py
    • tests/unit_tests/utils/csv_tests.py
    • tests/unit_tests/utils/webdriver_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread tests/unit_tests/utils/webdriver_test.py Outdated
@bito-code-review

Copy link
Copy Markdown
Contributor

The flagged issue regarding missing type hints in the newly added test method is correct. To resolve this, you should update the method signature to include explicit type hints for all parameters and the return type (e.g., -> None).

Since the file tests/unit_tests/utils/webdriver_test.py is not present in the provided PR diff, I cannot implement the fix directly. Please apply the following pattern to the test method in that file:

def test_your_method_name(patched_arg: PatchedType) -> None:
    # implementation

Regarding other comments, there are no additional review comments available in the provided PR context to check.

Comment thread tests/integration_tests/email_tests.py
Comment thread tests/integration_tests/email_tests.py
Comment thread tests/unit_tests/utils/csv_tests.py Outdated
Comment thread tests/unit_tests/utils/csv_tests.py Outdated
Comment thread tests/unit_tests/utils/csv_tests.py Outdated
Comment thread tests/unit_tests/utils/csv_tests.py Outdated
Comment thread tests/unit_tests/utils/csv_tests.py Outdated
@netlify

netlify Bot commented Jun 19, 2026

Copy link
Copy Markdown

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 479fc68
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/6a3579d516184e0009d1537e
😎 Deploy Preview https://deploy-preview-41250--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 64.35%. Comparing base (79cfe4d) to head (07a3244).
⚠️ Report is 26 commits behind head on master.

Files with missing lines Patch % Lines
superset/utils/webdriver.py 66.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #41250      +/-   ##
==========================================
- Coverage   64.36%   64.35%   -0.01%     
==========================================
  Files        2651     2652       +1     
  Lines      144812   144869      +57     
  Branches    33417    33425       +8     
==========================================
+ Hits        93208    93237      +29     
- Misses      49935    49958      +23     
- Partials     1669     1674       +5     
Flag Coverage Δ
hive 39.33% <27.27%> (+<0.01%) ⬆️
mysql 58.05% <90.90%> (+<0.01%) ⬆️
postgres 58.12% <90.90%> (-0.01%) ⬇️
presto 40.91% <27.27%> (+<0.01%) ⬆️
python 59.56% <90.90%> (-0.01%) ⬇️
sqlite 57.78% <90.90%> (+<0.01%) ⬆️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bito-code-review

bito-code-review Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #9a56c9

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • tests/unit_tests/utils/csv_tests.py - 1
  • tests/integration_tests/email_tests.py - 1
Review Details
  • Files reviewed - 3 · Commit Range: 479fc68..07a3244
    • tests/integration_tests/email_tests.py
    • tests/unit_tests/utils/csv_tests.py
    • tests/unit_tests/utils/webdriver_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

alert-reports Namespace | Anything related to the Alert & Reports feature size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

v5 - Reports stuck in sending state

2 participants