SpudScout

An Agentic Visual Web Scraper capable of complex automation without the "Selector Hell" of modern, obfuscated web apps. By leveraging Canny Edge Detection and Visual LMs, SpudScout navigates the web like a human: by looking at the interface, not just the source code.

Why SpudScout?

Traditional scrapers break when the DOM structure changes. SpudScout maintains resilience by prioritizing visual landmarks over brittle CSS selectors.

Theme-Agnostic: Grayscale + Canny processing ensures UI landmarks are identified regardless of Dark/Light mode transitions.
Privacy-First: 100% local execution. No screenshots or data ever leave your machine for third-party API processing. (On-Hold)
Resource Lean: Architected for CPU-only environments using GGUF quantization for local inference.
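
The theme-agnostic claim rests on a property of edge detection: inverting pixel intensities (roughly what a light-to-dark theme switch does) flips the sign of each intensity gradient but not its magnitude. The sketch below illustrates this with a plain central-difference gradient threshold, used as a simplified stand-in for Canny and not as SpudScout's actual pipeline. The names `edge_map`, `invert`, and `make_button`, the threshold of 50, and the synthetic image are all assumptions made for illustration.

```python
def edge_map(gray, threshold=50):
    """Binary edge map from a 2D list of 0-255 grayscale values.

    Simplified stand-in for Canny: thresholds the gradient magnitude
    (|gx| + |gy|) computed with central differences. Border pixels are
    left as 0.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


def invert(gray):
    """Invert intensities, a rough model of a light/dark theme switch."""
    return [[255 - value for value in row] for row in gray]


def make_button(width=12, height=8, background=240, button=30):
    """Synthetic 'light mode' image: a dark button on a light background."""
    image = [[background] * width for _ in range(height)]
    for y in range(2, height - 2):
        for x in range(3, width - 3):
            image[y][x] = button
    return image


light = make_button()
dark = invert(light)

# The button outline is found at the same pixels in both themes.
assert edge_map(light) == edge_map(dark)
```

Real dark themes are rarely exact inversions of their light counterparts, so in practice the two edge maps should be similar, not necessarily identical.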

Tech Stack & Constraints

We intentionally limit our scope to master the fundamentals of Computer Vision (CV) and Browser Automation.

Logic: Python 3.11+
Automation: Playwright (Synchronous) — Chosen for predictable state management.
Vision: OpenCV (Grayscale + Canny) & NumPy.
OCR: Tesseract (fallback for text-region validation).
Brain: Ollama (GGUF models) for CPU-optimized local inference.

Installation (Arch/EndeavourOS)

1. System Dependencies

sudo pacman -S tesseract tesseract-data-eng opencv hdf5

2. Environment Setup

python -m venv venv
source venv/bin/activate
pip install playwright opencv-python numpy pytesseract
playwright install chromium

Core Values

1. Coordinate Math & Scaling. We do not trust raw coordinate values, so SpudScout calculates the Device Scale Factor (DSF) to map screenshot pixels to viewport points.

**Constraint:** Always verify that ViewportWidth × DSF == ScreenshotWidth before mapping any coordinates.
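
As a rough illustration of the scaling check, the sketch below derives the DSF from the two widths, verifies the constraint, and converts a screenshot pixel into viewport coordinates. The function names and the one-pixel rounding tolerance are assumptions for this example, not SpudScout's actual API.

```python
def device_scale_factor(screenshot_width: int, viewport_width: int) -> float:
    """DSF = screenshot pixels per viewport point along the x axis."""
    if viewport_width <= 0:
        raise ValueError("viewport_width must be positive")
    return screenshot_width / viewport_width


def scale_is_consistent(viewport_width: int, dsf: float,
                        screenshot_width: int, tolerance: float = 1.0) -> bool:
    """Check ViewportWidth x DSF == ScreenshotWidth.

    The tolerance (an assumption) allows for rounding when the DSF is
    fractional, e.g. 1.25.
    """
    return abs(viewport_width * dsf - screenshot_width) <= tolerance


def screenshot_to_viewport(x_px: float, y_px: float, dsf: float) -> tuple:
    """Map a point found in the screenshot to viewport coordinates."""
    return (x_px / dsf, y_px / dsf)


# Example: a 1280-point-wide viewport captured as a 2560-pixel-wide screenshot.
dsf = device_scale_factor(screenshot_width=2560, viewport_width=1280)
assert scale_is_consistent(1280, dsf, 2560)
click_x, click_y = screenshot_to_viewport(500, 300, dsf)  # (250.0, 150.0)
```

A button detected at pixel (500, 300) in the screenshot would therefore be clicked at viewport point (250, 150).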

2. Humanity-First Scraping. Since we are guests on the web, SpudScout enforces the following rules:

Jittered Latency: No "inhuman" clicking speeds.
Robots.txt Respect: Automatic parsing and adherence.
Custom User-Agents: Transparent identification.
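
The three rules above could be sketched with the Python standard library alone, as shown below. The bot name, the contact URL, the delay values, and the helper names `jittered_delay` and `is_allowed` are hypothetical; SpudScout's real implementation may differ.

```python
import random
from urllib.robotparser import RobotFileParser

# Hypothetical identification values, for illustration only.
BOT_NAME = "SpudScoutBot"
USER_AGENT_HEADER = f"{BOT_NAME}/0.1 (+https://example.com/spudscout)"


def jittered_delay(base: float = 1.0, jitter: float = 0.5, rng=random) -> float:
    """Seconds to wait before the next action: base plus a random offset."""
    return base + rng.uniform(0.0, jitter)


def is_allowed(robots_txt: str, url: str, bot_name: str = BOT_NAME) -> bool:
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot_name, url)


robots = "User-agent: *\nDisallow: /private/\n"
allowed = is_allowed(robots, "https://example.com/public")        # True
blocked = is_allowed(robots, "https://example.com/private/data")  # False
pause = jittered_delay()  # a value between 1.0 and 1.5 seconds
```

In a scraping loop, the pause would be passed to `time.sleep` before each click, and any URL for which `is_allowed` returns `False` would be skipped.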

Roadmap

The project is planned in three phases:

Phase 1: CV-based button detection (Canny edge detection).
Phase 2: GGUF-integrated intent parsing (Ollama).
Phase 3: Autonomous "Spud-Loops" for multi-page navigation.

Contributing

This is a "Professional Grade" lab. We value Deep Work over "Quick Fixes" in our codebase. If you are submitting a PR, expect a deep review. "It works" is not enough; we want to know why your change is a better use of resources for the task it completes.

Important Notes & Acknowledgements

This project is created by human developers with the help of AI assistance.
