@dstengle
Owner

No description provided.

Add a simple FastAPI-based web application that provides:

Backend Features:
- REST API for processing markdown content (text or file upload)
- Entity query endpoints with filtering by type
- Property-based search across all entities
- Graph statistics and metrics
- RDF export in multiple formats (Turtle, JSON-LD, XML)
- In-memory graph storage for quick queries
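As a rough illustration of the in-memory storage with type filtering and property search listed above, here is a minimal stdlib-only sketch. The `Entity` and `InMemoryGraph` names, their fields, and the case-insensitive substring matching are assumptions for illustration; they are not the Processor's or the webapp's actual classes.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity record: an id, a type name, and free-form properties."""
    id: str
    type: str
    properties: dict = field(default_factory=dict)


class InMemoryGraph:
    """Toy stand-in for the in-memory graph storage (not the project's class)."""

    def __init__(self):
        self._entities = {}                 # entity id -> Entity
        self._by_type = defaultdict(list)   # type name -> list of entity ids

    def add(self, entity):
        # Replace any existing entity with the same id so the type index stays consistent.
        old = self._entities.get(entity.id)
        if old is not None:
            self._by_type[old.type].remove(old.id)
        self._entities[entity.id] = entity
        self._by_type[entity.type].append(entity.id)

    def entities(self, entity_type=None):
        """Return all entities, or only those of the given type."""
        if entity_type is None:
            return list(self._entities.values())
        return [self._entities[i] for i in self._by_type.get(entity_type, [])]

    def types(self):
        """Sorted names of the types that currently have at least one entity."""
        return sorted(t for t, ids in self._by_type.items() if ids)

    def search(self, prop, value=None):
        """Entities that have `prop`; if `value` is given, match it as a
        case-insensitive substring of the property value."""
        hits = []
        for entity in self._entities.values():
            if prop not in entity.properties:
                continue
            if value is None or value.lower() in str(entity.properties[prop]).lower():
                hits.append(entity)
        return hits

    def stats(self):
        return {"entities": len(self._entities), "types": len(self.types())}

    def clear(self):
        self._entities.clear()
        self._by_type.clear()
```

Keeping a per-type index makes the type filter and the type listing cheap, while property search stays a linear scan, which is reasonable for a graph that lives only in memory.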

Frontend Features:
- Bootstrap-based responsive UI
- Markdown input via textarea or file upload
- Real-time statistics dashboard (triples, entities, types)
- Entity type filtering with dynamic dropdown
- Property-based search interface
- Results display with entity cards showing all properties
- Graph export and clear functionality
- Example markdown content loader

API Endpoints:
- POST /api/process - Process markdown text
- POST /api/process/file - Upload and process file
- GET /api/entities - Query entities with optional type filter
- GET /api/entities/types - Get all entity types in graph
- GET /api/entities/search - Search by property name/value
- GET /api/graph/export - Export graph in various formats
- GET /api/stats - Get graph statistics
- DELETE /api/graph - Clear current graph
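For anyone scripting against these endpoints, the sketch below builds request URLs with the standard library. The endpoint paths come from the list above; the base URL and the query parameter names (`type`, `property`, `value`, `format`) are assumptions, so check `webapp/backend/main.py` for the real ones.

```python
from urllib.parse import urlencode

# Assumed base URL; the host and port depend on how the webapp is started.
BASE_URL = "http://localhost:8000"


def _build(path, query=None):
    """Join the base URL, a path, and an optional query dict (None values dropped)."""
    params = {k: v for k, v in (query or {}).items() if v is not None}
    return f"{BASE_URL}{path}" + (f"?{urlencode(params)}" if params else "")


def entities_url(entity_type=None):
    # "type" is a hypothetical parameter name for the optional type filter.
    return _build("/api/entities", {"type": entity_type})


def search_url(prop, value=None):
    # "property" and "value" are hypothetical parameter names.
    return _build("/api/entities/search", {"property": prop, "value": value})


def export_url(fmt="turtle"):
    # "format" is a hypothetical parameter name for the export format.
    return _build("/api/graph/export", {"format": fmt})
```

For example, `entities_url("TodoItem")` yields a URL for the type-filtered query, and `entities_url()` yields the unfiltered one.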

The webapp integrates directly with the Processor class and supports
all entity types: Documents, TodoItems, Headings, Sections, Lists,
Tables, CodeBlocks, Blockquotes, WikiLinks, and Named Entities.

Files:
- webapp/backend/main.py - FastAPI application
- webapp/frontend/index.html - Web interface
- webapp/frontend/static/app.js - Frontend JavaScript
- webapp/requirements.txt - Python dependencies
- webapp/README.md - Documentation and usage guide
- webapp/start.sh - Startup script
- webapp/test_webapp.py - Backend validation tests

Add comprehensive Docker support and automated publishing to Google
Container Registry via GitHub Actions.

Docker Implementation:
- Multi-stage Dockerfile with Python 3.12-slim base
- Optimized layer caching and minimal image size
- Health check endpoint configured
- Exposes port 8000 for webapp access
- Production-ready configuration

Docker Compose:
- Simple single-service configuration
- Port mapping and health checks
- Volume support for development mode
- Named network for future extensibility

GitHub Actions Workflow:
- Automated builds on push to main and claude/** branches
- Publishes to Google Container Registry (GCR)
- Multi-architecture support (amd64, arm64)
- Multiple image tagging strategies:
  * latest/stable for main branch
  * commit SHA for traceability
  * timestamp for version tracking
  * branch-specific tags
- Image verification step
- Detailed deployment summaries
- Build caching for faster CI/CD
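The tagging strategies above can be sketched as a small function. The exact tag formats are assumptions: a 7-character short SHA, a `YYYYMMDD-HHMMSS` UTC timestamp, and branch tags applied only to non-main branches. The workflow file is the authority on the real formats.

```python
import re
from datetime import datetime, timezone


def image_tags(branch, sha, now=None):
    """Return the list of tags for one build (hypothetical formats)."""
    now = now or datetime.now(timezone.utc)
    tags = []
    if branch == "main":
        tags += ["latest", "stable"]
    else:
        # Docker tags may only contain letters, digits, "_", "." and "-",
        # so characters such as "/" in branch names are replaced.
        tags.append(re.sub(r"[^A-Za-z0-9_.-]", "-", branch))
    tags.append(sha[:7])                          # short commit SHA for traceability
    tags.append(now.strftime("%Y%m%d-%H%M%S"))    # build timestamp
    return tags
```

Passing `now` explicitly keeps the function deterministic, which helps when checking the tag list in a test.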

Configuration Files:
- .dockerignore for optimized build context
- GCP_SETUP.md with detailed setup instructions
- Updated README.md with Docker usage guide

GCP Setup Documentation:
- Service account creation steps
- IAM role configuration
- GitHub secrets setup
- Local testing instructions
- Troubleshooting guide
- Security best practices
- Cost estimates

Usage:
```bash
# With Docker Compose
docker-compose up

# Direct Docker run
docker build -t kb-processor-webapp -f webapp/Dockerfile .
docker run -p 8000:8000 kb-processor-webapp

# Pull from GCR
docker pull gcr.io/PROJECT_ID/kb-processor-webapp:latest
docker run -p 8000:8000 gcr.io/PROJECT_ID/kb-processor-webapp:latest
```

Required GitHub Secrets:
- GCP_PROJECT_ID: Your GCP project ID
- GCP_SA_KEY: Service account JSON key with GCR push permissions

The workflow triggers on:
- Pushes to main or claude/** branches
- Changes to webapp/** or knowledgebase_processor/**
- Manual workflow dispatch with custom tags
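To make the trigger rules concrete, here is a rough approximation in Python. It assumes the branch and path conditions are combined the way GitHub Actions combines `branches` and `paths` filters on a push trigger (both must match), and it uses `fnmatch` as a stand-in for GitHub's glob matching, which is not identical.

```python
from fnmatch import fnmatchcase

BRANCH_PATTERNS = ["main", "claude/**"]
PATH_PATTERNS = ["webapp/**", "knowledgebase_processor/**"]


def should_build(branch, changed_paths, manual=False):
    """Approximate whether the workflow would run for a push.

    A manual workflow dispatch always runs. Otherwise the branch must match
    one branch pattern and at least one changed file must match a path pattern.
    """
    if manual:
        return True
    branch_ok = any(fnmatchcase(branch, p) for p in BRANCH_PATTERNS)
    paths_ok = any(
        fnmatchcase(path, p) for path in changed_paths for p in PATH_PATTERNS
    )
    return branch_ok and paths_ok
```

Under this reading, a push to `main` that only touches the top-level `README.md` would not start a build.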

Switch Docker image publishing from Google Container Registry (GCR)
to GitHub Container Registry (GHCR) for better integration with
GitHub-native workflows.

Changes:
- Replace GCR workflow with GHCR workflow
- Use GITHUB_TOKEN instead of GCP service account
- Update all documentation references from GCR to GHCR
- Simplify authentication (no external secrets needed)
- Update README with GHCR pull commands
- Create comprehensive GHCR_SETUP.md guide
- Remove GCP_SETUP.md documentation
- Update docker-compose.yml with GHCR image option

Benefits of GHCR over GCR:
✅ No external setup required - works automatically
✅ Uses built-in GITHUB_TOKEN (no secrets to manage)
✅ Free for public images (unlimited storage/bandwidth)
✅ Native GitHub integration
✅ Automatic authentication in GitHub Actions
✅ Simpler configuration and troubleshooting

GitHub Actions Workflow:
- Publishes to ghcr.io/owner/repo/kb-processor-webapp
- Authenticates with GITHUB_TOKEN automatically
- Multi-architecture builds (amd64, arm64)
- Same tagging strategy (latest, stable, SHA, timestamp, branch)
- No repository secrets required

Image URLs:
- Latest: ghcr.io/owner/repo/kb-processor-webapp:latest
- By SHA: ghcr.io/owner/repo/kb-processor-webapp:a1b2c3d
- By branch: ghcr.io/owner/repo/kb-processor-webapp:branch-name
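The image URL pattern above can be sketched as a helper. One detail worth noting: the repository part of a Docker image reference must be lowercase, so an owner or repo name containing uppercase letters needs lowercasing before use. The function name and defaults here are illustrative only.

```python
def ghcr_image(owner, repo, tag="latest", name="kb-processor-webapp"):
    """Build a GHCR image reference following the ghcr.io/owner/repo/name:tag pattern.

    Only the repository path is lowercased; tags are case-sensitive and kept as given.
    """
    repository = f"ghcr.io/{owner}/{repo}/{name}".lower()
    return f"{repository}:{tag}"
```

The result can be passed directly to `docker pull` or `docker run`.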

Documentation:
- GHCR_SETUP.md: Complete setup guide with authentication options
- README.md: Updated with GHCR commands and references
- docker-compose.yml: Shows both local build and GHCR image options

The workflow triggers automatically on pushes to main or claude/**
branches without any additional configuration.
@dstengle dstengle merged commit 63d7883 into main Nov 18, 2025
5 of 6 checks passed
@dstengle dstengle deleted the claude/test-webapp-processor-018uvEwV1epEBAKHfie6mvSY branch November 18, 2025 17:11