This document outlines the standard operating procedures for developing, maintaining, and deploying the News Synthesizer application. These procedures ensure consistent handling of RSS feeds, LLM inference, persona management, and system operations while maintaining data privacy and performance.
- Verify hardware requirements (GPU, RAM) for llama.cpp
- Clone project structure and initialize submodules
- Set up virtual environment for Python backend
- Download and place GGUF model file
- Configure RSS feed sources and persona files
- Initialize Git repository with departmental branch structure
- Pull latest changes and update dependencies
- Review open issues and prioritize tasks
- Test local model inference and RSS fetching
- Implement changes following branching strategy:
  - main → Stable production releases
  - dev → Development integration branch
  - feature/{component} → New features (rss-parsing, llm-inference, etc.)
  - hotfix/{issue} → Critical bug fixes
- Follow standards.md for code formatting and conventions
- Use type hints in Python (mypy compliance)
- Implement comprehensive error handling for RSS failures
- Validate LLM outputs against expected formats
- Verify GGUF file integrity before loading
- Test inference with sample prompts
- Monitor memory usage during initial inference
- Establish baseline performance metrics
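
The integrity check above could be sketched as follows. This is a minimal example, not the project's actual loader: the function names and the idea of storing an expected SHA-256 alongside the model are assumptions. It relies on the fact that GGUF files begin with the four magic bytes `GGUF` followed by a 32-bit version number, and it assumes a little-endian file.

```python
import hashlib
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path: Path) -> int:
    """Return the GGUF format version; raise ValueError on a bad header."""
    with path.open("rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # Assumption: little-endian file, so the version is a little-endian uint32.
    (version,) = struct.unpack("<I", header[4:8])
    return version


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large models are never read into RAM at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: Path, expected_sha256: str) -> bool:
    """Check the header first (cheap), then the full checksum (slow)."""
    read_gguf_version(path)
    return sha256_of(path) == expected_sha256.lower()
```

Running `verify_model` before handing the path to llama.cpp keeps a truncated or corrupted download from surfacing later as an opaque load failure.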
- Daily feed validation: Check feed URLs are accessible
- Content deduplication: Use hashing and cache checks
- Processing prioritization: Freshness, relevance, quality scores
- Error handling: Circuit breaker for failed feed fetches
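
As an illustration of the deduplication and circuit-breaker items above, here is a minimal sketch. The class names, the choice of title plus link as the fingerprint, and the threshold and cooldown defaults are all assumptions for the example, not values taken from the application.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Optional, Set


def article_fingerprint(title: str, link: str) -> str:
    """Hash a normalized title and link so trivially different copies collide."""
    normalized = f"{title.strip().lower()}|{link.strip()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class SeenCache:
    """In-memory cache of fingerprints already processed (hypothetical helper)."""
    _seen: Set[str] = field(default_factory=set)

    def is_new(self, title: str, link: str) -> bool:
        fp = article_fingerprint(title, link)
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True


@dataclass
class CircuitBreaker:
    """Stop fetching a feed after repeated failures, then retry after a cooldown."""
    failure_threshold: int = 3
    cooldown_seconds: float = 300.0
    failures: int = 0
    opened_at: Optional[float] = None

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True  # closed: fetches proceed normally
        # Open: block until the cooldown has elapsed, then allow a trial fetch.
        return now - self.opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # (re)open and restart the cooldown
```

One breaker per feed URL keeps a single failing source from delaying the rest; a fetch loop would call `allow()` before each request and report the outcome with `record_success()` or `record_failure()`.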
- Backup existing persona files before modification
- Test persona changes with sample compositions
- Validate YAML schema compliance
- Update frontend integration if new parameters are added
- Monitor chat request patterns for abuse detection
- Log persona adjustments with timestamps
- Backup chat session data for continuity
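
The backup-before-modification and timestamped-logging steps above might look like the sketch below. The function name, logger name, and backup file naming scheme are hypothetical and chosen only for illustration.

```python
import logging
import shutil
from datetime import datetime, timezone
from pathlib import Path

logger = logging.getLogger("persona")  # hypothetical logger name


def backup_persona(persona_path: Path, backup_dir: Path) -> Path:
    """Copy a persona file to backup_dir under a UTC-timestamped name."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{persona_path.stem}.{stamp}{persona_path.suffix}"
    shutil.copy2(persona_path, target)  # copy2 also preserves file metadata
    logger.info("Backed up %s to %s", persona_path, target)
    return target
```

Calling `backup_persona` immediately before writing an edited persona file gives each adjustment a restorable snapshot and a log line carrying the time of the change.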
- Performance: Participate in TTS generation
- Test audio output device compatibility
- Monitor TTS service availability and fallback options
- Configure voice settings for persona consistency
- Handle audio queuing and playback errors gracefully
- Confirm model loading and inference capability
- Test RSS feed connectivity and processing
- Verify TTS service and audio playback
- Check system resource usage (CPU, RAM, GPU)
- Update dependencies and patch security vulnerabilities
- Review performance logs and optimize bottlenecks
- Backup model files and configuration data
- Test disaster recovery procedures
- Analyze RSS processing success rates
- Review inference quality and accuracy metrics
- Assess persona usage and effectiveness
- Plan infrastructure upgrades based on growth
- Identify affected feeds and alternative sources
- Implement temporary fallbacks if available
- Notify users of processing delays
- Document root cause and implement fixes
- Switch to CPU-only inference if GPU fails
- Restart services and test model loading
- Check memory allocation and clear caches
- Revert to backup model if current model corrupted
- Implement rate limiting on RSS fetches
- Queue requests during high load periods
- Monitor resource usage with alerting
- Scale model instances if possible
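
One common way to implement the rate limiting mentioned above is a token bucket. The sketch below is an assumption about how it could be done, not a description of the application's code; the class name and parameters are illustrative.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: int,
                 now: Optional[float] = None) -> None:
        self.rate = rate_per_second
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start full
        self.updated = time.monotonic() if now is None else now

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; return False if the caller must wait."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A fetcher that receives `False` from `try_acquire()` can place the request on a queue and retry later, which also covers the queuing step for high-load periods.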
- Daily backups of processed articles database
- Weekly backups of model files and configurations
- Monthly checkpoints of complete system state
- Recovery procedures: Step-by-step restoration with validation tests
- Update relevant .md files after all changes
- Maintain checklist.md and ledger.md concurrently
- Document all operational changes and decisions
- Review documentation accuracy monthly
- SOP review completed monthly
- All procedures tested in staging environment
- Backup procedures verified
- Emergency contacts documented
| Procedure | Last Reviewed | Status | Notes |
|---|---|---|---|
| Daily Operations | 2025-10-28 | Active | GPU monitoring added |
| Model Management | 2025-10-28 | Active | GGUF integrity checks |
| Incident Response | 2025-10-28 | Planned | Templates for common issues |