Conversation

@sammyjoyce

Summary

  • Add FIRECRAWL_CRAWL_TIMEOUT and FIRECRAWL_CRAWL_POLL_INTERVAL environment variables to control crawl job behavior
  • Prevents the MCP server from hanging indefinitely on long-running crawls

Problem

The firecrawl_crawl tool calls client.crawl(), which polls indefinitely until the crawl job completes. For large sites or slow self-hosted instances, this can make the MCP server appear frozen, rendering the tool unusable.

Solution

  • Default timeout: 120 seconds; the crawl fails gracefully if it does not complete within this time
  • Configurable: Set FIRECRAWL_CRAWL_TIMEOUT to adjust (in seconds)
  • Disable timeout: Set FIRECRAWL_CRAWL_TIMEOUT=0 to restore the original indefinite wait behavior
  • Poll interval: Configurable via FIRECRAWL_CRAWL_POLL_INTERVAL (default: 2 seconds)
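A minimal sketch of how the configuration above could be read in src/index.ts. The environment variable names come from this PR; the `envNumber` helper and its exact fallback behavior are illustrative assumptions, not the PR's literal code.

```typescript
// Hypothetical helper (not from the PR): read a non-negative number from an
// environment variable, falling back to a default when unset or invalid.
function envNumber(name: string, defaultValue: number): number {
  const raw = process.env[name];
  if (raw === undefined || raw === '') return defaultValue;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : defaultValue;
}

// Timeout in seconds; 0 disables the timeout (original indefinite wait).
const crawlTimeoutSeconds = envNumber('FIRECRAWL_CRAWL_TIMEOUT', 120);
// Poll interval in seconds.
const crawlPollIntervalSeconds = envNumber('FIRECRAWL_CRAWL_POLL_INTERVAL', 2);
```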

Changes

  • src/index.ts: Parse timeout/poll interval from environment variables, pass to client.crawl()
  • README.md: Document new configuration options

Fixes #103

Add FIRECRAWL_CRAWL_TIMEOUT and FIRECRAWL_CRAWL_POLL_INTERVAL environment
variables to control crawl job behavior.

Previously, the crawl tool would wait indefinitely for a crawl job to
complete, which could cause the MCP server to hang for long-running crawls.

Changes:
- Default timeout: 120 seconds (configurable via FIRECRAWL_CRAWL_TIMEOUT)
- Set FIRECRAWL_CRAWL_TIMEOUT=0 to disable timeout (wait indefinitely)
- Default poll interval: 2 seconds (configurable via FIRECRAWL_CRAWL_POLL_INTERVAL)
- Updated tool description to document timeout behavior
- Updated README with new configuration options

Fixes firecrawl#103

The SDK's getCrawlStatus() has autoPaginate=true by default, which
causes an infinite loop on self-hosted Firecrawl instances where the
'next' URL in pagination responses always points to the same URL
(e.g., ?skip=0 never increments).

Changed crawl implementation to:
1. Use startCrawl() + manual polling loop instead of crawl()
2. Pass autoPaginate: false to getCrawlStatus()
3. Implement proper timeout checking in the polling loop

This ensures crawls either complete or timeout gracefully, rather
than hanging indefinitely.

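The three steps above can be sketched as a single polling function. `startCrawl()` and `getCrawlStatus()` are the Firecrawl SDK calls named in the commit, but their signatures, the `CrawlStatus` shape, and the `crawlWithTimeout` wrapper are simplified assumptions for illustration.

```typescript
// Simplified stand-in for the SDK's crawl status payload (assumption).
interface CrawlStatus {
  status: 'scraping' | 'completed' | 'failed';
}

// Hypothetical wrapper implementing startCrawl() + manual polling with a
// timeout, in place of the SDK's auto-paginating crawl() helper.
async function crawlWithTimeout(
  client: {
    startCrawl(url: string): Promise<{ id: string }>;
    getCrawlStatus(id: string, opts: { autoPaginate: boolean }): Promise<CrawlStatus>;
  },
  url: string,
  timeoutSeconds: number, // 0 means wait indefinitely
  pollSeconds: number
): Promise<CrawlStatus> {
  const { id } = await client.startCrawl(url);
  const deadline =
    timeoutSeconds > 0 ? Date.now() + timeoutSeconds * 1000 : Infinity;

  while (true) {
    // autoPaginate: false avoids the self-hosted pagination loop where the
    // 'next' URL never advances past ?skip=0.
    const status = await client.getCrawlStatus(id, { autoPaginate: false });
    if (status.status === 'completed' || status.status === 'failed') {
      return status;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Crawl ${id} timed out after ${timeoutSeconds}s`);
    }
    await new Promise((resolve) => setTimeout(resolve, pollSeconds * 1000));
  }
}
```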
- Simplified tool API with clear, focused descriptions
- Added batch scrape, map, cancel, and status tools
- Improved async crawl with proper polling and timeout handling
- Cleaner code structure with shared schemas
- Better documentation for each tool's use case

Development

Successfully merging this pull request may close these issues.

[Feature Request] Scrape timeouts and other timeout defaults configuration via environment variable