This document serves as the single source of truth for testing CodeGraphContext. It consolidates usage instructions, architectural philosophy, and future roadmap items.
We provide a helper script, `tests/run_tests.sh`, to simplify test execution.
| Suite | Command | Use Case |
|---|---|---|
| All Tests | `./tests/run_tests.sh all` | Full CI/CD verification (includes E2E). |
| Fast Tests | `./tests/run_tests.sh fast` | Recommended for local dev. Runs Unit + Integration. |
| Unit Tests | `./tests/run_tests.sh unit` | Focus on individual components (parsers, database logic). |
| User Journeys | `./tests/run_tests.sh e2e` | Validate full end-to-end user workflows. |
The test suite follows the Testing Pyramid principle, ensuring a balanced mix of speed and confidence.
- Speed: Very Fast (< 100ms)
- Scope: Isolated classes and functions.
- Mocking: Heavy mocking of external dependencies (Neo4j, FileSystem).
- Content:
  - `core/`: `DatabaseManager`, `JobManager`, `FileWatcher`.
  - `parsers/`: Output verification for `TreeSitterParser` (Python, JS, etc.).
  - `tools/`: `GraphBuilder` logic, `CodeFinder` query generation.
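As a minimal sketch of the heavy-mocking style described above, the test below replaces the file system with `unittest.mock.mock_open`, so nothing is read from disk. `count_definitions` is a hypothetical helper written only for this illustration; it is not a CodeGraphContext function.

```python
from unittest.mock import mock_open, patch


def count_definitions(path):
    """Hypothetical helper: count lines that start a Python function."""
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    return sum(1 for line in lines if line.lstrip().startswith("def "))


def test_count_definitions_without_touching_disk():
    fake_source = "def a():\n    pass\n\ndef b():\n    pass\n"
    # builtins.open is patched, so the path below never has to exist.
    with patch("builtins.open", mock_open(read_data=fake_source)) as fake_open:
        assert count_definitions("ignored.py") == 2
    fake_open.assert_called_once_with("ignored.py", encoding="utf-8")
```

Because every external dependency is replaced, a test like this should stay well within the sub-100ms budget.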
- Speed: Fast (~1s)
- Scope: Interaction between 2+ components.
- Mocking: Partial (e.g., mock the database connection but run the real `GraphBuilder` logic).
- Content:
  - `cli/`: Typer command execution, argument parsing, error handling.
  - `mcp/`: Server routing, tool call validation, JSON protocol adherence.
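Partial mocking can be sketched as follows: the logic under test runs for real, and only the database boundary is a mock. `EdgeBuilder` and its `write_edge` connection method are hypothetical stand-ins chosen for this example; the real `GraphBuilder` interface may look quite different.

```python
from unittest.mock import MagicMock


class EdgeBuilder:
    """Hypothetical stand-in for real graph-building logic."""

    def __init__(self, connection):
        self.connection = connection

    def add_calls(self, calls):
        # Real logic under test: deduplicate and order edges before writing.
        unique = sorted(set(calls))
        for caller, callee in unique:
            self.connection.write_edge(caller, callee)
        return len(unique)


def test_builder_writes_each_edge_once():
    connection = MagicMock()  # only the database boundary is mocked
    builder = EdgeBuilder(connection)

    written = builder.add_calls([("a", "b"), ("a", "b"), ("b", "c")])

    assert written == 2
    edges = [call.args for call in connection.write_edge.call_args_list]
    assert edges == [("a", "b"), ("b", "c")]
```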
- Speed: Slow (> 10s)
- Scope: Full system as seen by the user.
- Mocking: Minimal/None (uses real file interactions, simulated sub-processes).
- Content:
  - `test_user_journeys.py`: "User initializes repo", "User queries function callers", "User exports bundle".
- Scope: Benchmarks for large codebase indexing and complex query latency.
We simulate real-world usage to ensure the product solves user problems.
- First-time Setup: Initialization -> Indexing -> Verifying output.
- Daily Dev: Watching for file changes -> Auto-updating graph.
- Code Search: Finding functions by name, argument, or decorator.
Every command in `cgc --help` is tested.
- `index`: Argument validation, force flags.
- `find` / `analyze`: Query construction and output formatting.
- `mcp`: Server startup and tool exposure.
We verify that our Tree-sitter parsers correctly extract:
- Structure: Classes, Functions, Modules.
- Relationships: Calls, Inheritance, Imports, Dependencies.
- Languages Covered: Python, JavaScript/TypeScript, Java, C++, Go, Rust, Ruby, PHP, Dart, Perl, and more.
- Create `tests/unit/parsers/test_<lang>_parser.py`.
- Use the `get_tree_sitter_manager()` singleton.
- Feed it a sample code string.
- Assert the structure of the returned definition dict.
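The shape of such a parser test can be sketched as below. To keep the example self-contained, it uses Python's standard `ast` module as a stand-in for the Tree-sitter parser; in the real suite the parser would come from `get_tree_sitter_manager()`. The dict keys (`functions`, `classes`, `name`, `line_number`) are assumptions made for illustration and may not match the actual schema.

```python
import ast


def parse_definitions(source):
    """Stand-in parser built on `ast`, returning a definition dict."""
    result = {"functions": [], "classes": []}
    for node in ast.walk(ast.parse(source)):
        entry = {"name": getattr(node, "name", None),
                 "line_number": getattr(node, "lineno", None)}
        if isinstance(node, ast.FunctionDef):
            result["functions"].append(entry)
        elif isinstance(node, ast.ClassDef):
            result["classes"].append(entry)
    return result


SAMPLE = '''
class Greeter:
    def greet(self):
        return "hi"

def main():
    Greeter().greet()
'''


def test_python_parser_extracts_classes_and_functions():
    parsed = parse_definitions(SAMPLE)
    assert [c["name"] for c in parsed["classes"]] == ["Greeter"]
    assert sorted(f["name"] for f in parsed["functions"]) == ["greet", "main"]
    # Line numbers are 1-based; the sample begins with a blank line.
    assert parsed["classes"][0]["line_number"] == 2
```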
- Create/Edit `tests/integration/cli/test_cli_commands.py`.
- Mock the underlying service (e.g., `GraphBuilder` or `CodeFinder`).
- Use `CliRunner` from `typer.testing` to invoke the command.
- Assert `result.exit_code == 0` and check `result.stdout`.
- Identify the user workflow that triggers the bug.
- Create a test case in `tests/e2e/test_user_journeys.py` that reproduces it.
- Fix the bug and verify the test passes.
These ideas are consolidated from the legacy `IDEAL_TEST_PLAN.md`.
- Idea: Test indexing on > 100k LoC repositories (e.g., use an archived version of React or Django as a fixture).
- Metric: Indexing time per 1,000 lines and peak memory usage.
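One possible way to collect both metrics with the standard library is sketched here. `measure_indexing` and `fake_index` are hypothetical names; a real benchmark would pass the actual indexer instead of the placeholder workload. Note that `tracemalloc` only tracks memory allocated by Python, so it can under-report memory used by native extensions or an external database.

```python
import time
import tracemalloc


def measure_indexing(index_fn, sources):
    """Return seconds per 1,000 lines and peak traced memory in bytes."""
    total_lines = sum(src.count("\n") + 1 for src in sources)
    if total_lines == 0:
        raise ValueError("sources must contain at least one file")
    tracemalloc.start()
    started = time.perf_counter()
    for src in sources:
        index_fn(src)
    elapsed = time.perf_counter() - started
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "total_lines": total_lines,
        "seconds_per_kloc": elapsed / total_lines * 1000,
        "peak_bytes": peak_bytes,
    }


def fake_index(source):
    # Placeholder workload; a real benchmark would call the indexer here.
    return [line.split() for line in source.splitlines()]


if __name__ == "__main__":
    corpus = ["def f(x):\n    return x + 1"] * 5000
    print(measure_indexing(fake_index, corpus))
```

Tracking the per-1,000-line figure rather than total time keeps results comparable across fixtures of different sizes.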
- Idea: Use a tool like `mutmut` to introduce small deliberate mutations (bugs) into the code and verify that the test suite catches them.
- Idea: Instead of asserting specific keys, verify the entire JSON output of a parser against a stored "snapshot". This makes updating parser tests much faster when the schema changes.
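A minimal sketch of the snapshot idea, using only the standard library, might look like this. `assert_matches_snapshot` and its `update` flag are hypothetical and not an existing helper in the repository; dedicated pytest snapshot plugins offer the same workflow with more features.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_snapshot(data, snapshot_path, update=False):
    """Compare `data` with a stored JSON snapshot.

    The snapshot is written when it is missing or when update=True.
    """
    snapshot_path = Path(snapshot_path)
    rendered = json.dumps(data, indent=2, sort_keys=True)
    if update or not snapshot_path.exists():
        snapshot_path.parent.mkdir(parents=True, exist_ok=True)
        snapshot_path.write_text(rendered, encoding="utf-8")
        return
    expected = snapshot_path.read_text(encoding="utf-8")
    assert rendered == expected, f"Output differs from snapshot {snapshot_path}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "snapshots" / "python_parser.json"
        parsed = {"functions": [{"name": "main"}], "classes": []}
        assert_matches_snapshot(parsed, path)  # first run records the snapshot
        assert_matches_snapshot(parsed, path)  # later runs compare against it
```

When the schema changes intentionally, rerunning with `update=True` regenerates the stored file, which is what makes this approach quicker to maintain than key-by-key assertions.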
- Idea: In E2E tests, spin up a real Docker container for Neo4j/FalkorDB to guarantee 100% accurate database behavior, removing all mocks.
- Idea: E2E tests simulating a user querying calls across two different repositories (microservices scenario).