doc-scraper

A Python tool to scrape website content for LLM context.

Installation

pip install doc-scraper

Post-Installation Setup

This tool uses crawl4ai, which relies on Playwright. After installing the package, install the Playwright browser binaries:

playwright install

To check the setup afterwards, you can also run crawl4ai's diagnostic command:

crawl4ai-doctor
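
If a command above fails with "command not found", it can help to confirm which entry points are visible on your PATH. The following is a minimal sketch using only the Python standard library. The helper name `missing_commands` is hypothetical and not part of doc-scraper, and the check only covers whether the commands can be found, not whether the browser binaries were downloaded.

```python
import shutil


def missing_commands(names):
    """Return the commands from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Command names taken from this README; adjust if your setup differs.
    expected = ["doc-scraper", "playwright", "crawl4ai-doctor"]
    missing = missing_commands(expected)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All expected commands are on PATH.")
```

A missing command usually means the package was installed into a different environment than the one currently active.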

Usage

doc-scraper <url>

Example:

doc-scraper https://docs.reducto.ai/

Follow the interactive prompts to configure the output directory and other settings.