
feat(playground): add GitHub URL validation endpoint (#124) #130

Merged
DevanshuNEU merged 2 commits into OpenCodeIntel:main from DevanshuNEU:feat/124-validate-github-url on Dec 25, 2025

@DevanshuNEU

Collaborator

Summary

Implements a GitHub URL validation endpoint for anonymous indexing (Epic #114).

Changes

New Endpoint

POST /api/v1/playground/validate-repo

Request

{
  "github_url": "https://github.com/user/repo"
}

Response Cases

Valid & Can Index:

{
  "valid": true,
  "repo_name": "repo",
  "owner": "user",
  "is_public": true,
  "default_branch": "main",
  "file_count": 150,
  "size_kb": 2048,
  "language": "Python",
  "stars": 1234,
  "can_index": true,
  "message": "Ready to index"
}

Too Large (>200 files):

{
  "valid": true,
  "can_index": false,
  "reason": "too_large",
  "message": "Repository has 2,500 code files. Anonymous limit is 200."
}

Private Repository:

{
  "valid": true,
  "is_public": false,
  "can_index": false,
  "reason": "private"
}

Not Found:

{
  "valid": false,
  "reason": "not_found"
}
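As a rough illustration of how the four response cases above relate to each other, here is a minimal sketch of the decision logic. The function name `build_validation_response` is hypothetical and not taken from the PR; the input is assumed to be the JSON returned by GitHub's "get a repository" REST endpoint (fields `private`, `name`, `owner.login`, `default_branch`, `size`, `language`, `stargazers_count`), with `None` standing in for a 404.

```python
from typing import Any, Dict, Optional

ANONYMOUS_FILE_LIMIT = 200  # limit stated in the PR description


def build_validation_response(
    repo: Optional[Dict[str, Any]], file_count: int
) -> Dict[str, Any]:
    """Map repo metadata and a code-file count to one of the response cases.

    Hypothetical helper: `repo` is assumed to be GitHub's repository JSON,
    or None when the repository was not found.
    """
    if repo is None:
        return {"valid": False, "reason": "not_found"}

    if repo.get("private", False):
        return {
            "valid": True,
            "is_public": False,
            "can_index": False,
            "reason": "private",
        }

    if file_count > ANONYMOUS_FILE_LIMIT:
        return {
            "valid": True,
            "can_index": False,
            "reason": "too_large",
            "message": (
                f"Repository has {file_count:,} code files. "
                f"Anonymous limit is {ANONYMOUS_FILE_LIMIT}."
            ),
        }

    return {
        "valid": True,
        "repo_name": repo["name"],
        "owner": repo["owner"]["login"],
        "is_public": True,
        "default_branch": repo["default_branch"],
        "file_count": file_count,
        "size_kb": repo["size"],  # GitHub reports repository size in KB
        "language": repo["language"],
        "stars": repo["stargazers_count"],
        "can_index": True,
        "message": "Ready to index",
    }
```

The order of the checks is a design assumption: not-found and private are decided from metadata alone, so the file count only matters for public repositories.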

Implementation Details

  • URL parsing with regex (extracts owner/repo)
  • GitHub API calls for repo metadata and file tree
  • Code file filtering using RepoValidator.CODE_EXTENSIONS
  • Skips node_modules, .git, etc.
  • Handles truncated trees (estimates from repo size)
  • 5-minute cache for validation results
  • Proper error handling (404, 403 rate limit, timeouts)
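For readers who want a concrete picture of the URL parsing step, the sketch below shows one way owner/repo extraction with a regex could work. The pattern and the name `parse_github_url` are assumptions for illustration; the actual regex in `backend/routes/playground.py` is not shown in this PR description and may accept a different set of URL shapes.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: accepts http(s) github.com URLs of the form
# /owner/repo, with an optional ".git" suffix and trailing slash.
_GITHUB_URL = re.compile(
    r"^https?://(?:www\.)?github\.com/"
    r"(?P<owner>[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)/"
    r"(?P<repo>[A-Za-z0-9._-]+?)(?:\.git)?/?$"
)


def parse_github_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (owner, repo) for a recognized GitHub URL, else None."""
    match = _GITHUB_URL.match(url.strip())
    if match is None:
        return None
    return match.group("owner"), match.group("repo")
```

Returning `None` rather than raising keeps the caller free to translate a parse failure into whatever error response the endpoint uses.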

Testing

  • 25 test cases covering:
    • URL parsing (valid/invalid formats)
    • Request model validation
    • GitHub API mocking (404, 403, success, timeout)
    • File counting logic
    • Directory exclusions
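To make the file counting and directory exclusion cases more concrete, here is a minimal sketch of a counting helper of the kind those tests would exercise. It assumes entries shaped like the `tree` array from GitHub's Git Trees API (each with `path` and `type`). `CODE_EXTENSIONS` here is a small stand-in for `RepoValidator.CODE_EXTENSIONS`, whose real contents are not listed in this PR, and `SKIP_DIRS` contains only the two directories the description names.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, Iterable

# Stand-in subset; the real set lives in RepoValidator.CODE_EXTENSIONS.
CODE_EXTENSIONS = {".py", ".js", ".ts", ".go", ".rs", ".java"}
# Only the directories named in the PR description; the real list is longer.
SKIP_DIRS = {"node_modules", ".git"}


def count_code_files(tree: Iterable[Dict[str, Any]]) -> int:
    """Count blobs with a code extension that are not under a skipped dir."""
    count = 0
    for entry in tree:
        if entry.get("type") != "blob":
            continue  # "tree" entries are directories, not files
        path = PurePosixPath(entry["path"])
        # Check every parent directory component, not just the first one.
        if any(part in SKIP_DIRS for part in path.parts[:-1]):
            continue
        if path.suffix.lower() in CODE_EXTENSIONS:
            count += 1
    return count
```

This sketch does not cover the truncated-tree case; the size-based estimate mentioned under Implementation Details would need its own logic.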

Files Changed

  • backend/routes/playground.py - Added endpoint + helpers
  • backend/tests/test_validate_repo.py - New test file

Closes #124

- Add POST /api/v1/playground/validate-repo endpoint
- Parse and validate GitHub URLs (owner/repo extraction)
- Fetch repo metadata from GitHub API (public/private, stars, language)
- Count code files using tree API (filters by CODE_EXTENSIONS)
- Handle edge cases: 404, rate limits, truncated trees, timeouts
- Cache validation results (5 min TTL)
- Enforce 200-file limit for anonymous indexing
- Add 25 comprehensive tests

Closes OpenCodeIntel#124
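The 5-minute cache for validation results could be as simple as the in-memory sketch below. This is an assumption for illustration only: the PR does not say whether the cache is in-process, Redis-backed, or something else, and the class name `TTLCache` is hypothetical.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 5 * 60  # 5 min TTL, as stated in the commit message


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed TTL."""

    def __init__(
        self,
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry lazily on read
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
```

Keying on a normalized `owner/repo` string rather than the raw URL would let equivalent URLs share one cache entry, though that is a design choice the PR does not confirm.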

vercel Bot commented Dec 25, 2025


@DevanshuNEU is attempting to deploy a commit to the Dev's projects Team on Vercel.

A member of the Team first needs to authorize it.

- Fix repo_manager mocking to use patch.object on instance
- Add early patching in conftest.py to prevent import-time errors
- Fix auth module reload cleanup in DevApiKeySecurity tests
- Remove module-level sys.modules pollution from test_validate_repo.py
- Clean up lint issues (unused imports, line length, whitespace)

All 139 tests now pass consistently.

vercel Bot commented Dec 25, 2025


The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| opencodeintel | Ignored | Preview | Dec 25, 2025 10:46pm |

DevanshuNEU merged commit 01c18f0 into OpenCodeIntel:main on Dec 25, 2025
6 checks passed

Development

Successfully merging this pull request may close these issues.

feat(backend): Repository validation endpoint for anonymous indexing