# Self-Hosted Budget AI API

A beautiful, self-hosted AI assistant powered by Qwen2-0.5B-Instruct, with a modern React frontend and a secure Python FastAPI backend.
A quick example request (assuming the backend is running locally on port 8000):

```bash
curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-secure-api-key-here" \
  -d '{"prompt": "Test with valid key"}'
```
## Features

- Modern UI: Beautiful Tailwind CSS React frontend with animations and responsive design
- Self-Hosted AI: Uses Qwen2-0.5B-Instruct model for local AI inference
- Secure: API key authentication and IP whitelisting
- Production Ready: Includes deployment scripts with Fabric
- Real-time Chat: Interactive chat interface with loading animations
- Copy to Clipboard: Easy copying of AI responses
## Prerequisites

- Python 3.8+
- Node.js 16+
- Git
- CUDA-compatible GPU (optional, for faster inference)
## Installation

### Clone the Repository

```bash
git clone https://github.com/yourusername/self-hosted-budget-ai-api.git
cd self-hosted-budget-ai-api
```

### Backend Setup

```bash
cd backend

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# The model will be automatically downloaded on first run.
# This may take several minutes depending on your internet connection.
```

### Frontend Setup

```bash
cd ../frontend

# Install dependencies
npm install

# Build for production (optional)
npm run build
```

## Configuration

The backend uses environment variables and configuration files:
- Environment Variables (`.env` file):

  ```
  DEV_MODE=true
  API_KEYS_FILE=config/api_keys.txt
  WHITELIST_FILE=config/whitelist.txt
  MODEL_CACHE_DIR=models
  MAX_NEW_TOKENS=512
  TEMPERATURE=0.7
  HOST=0.0.0.0
  PORT=8000
  ```

- API Keys (`config/api_keys.txt`), one key per line:

  ```
  demo-key-12345
  your-secure-api-key-here
  ```

- IP Whitelist (`config/whitelist.txt`), one address or CIDR range per line:

  ```
  127.0.0.1
  ::1
  192.168.1.0/24
  10.0.0.0/8
  ```
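Since the whitelist mixes single addresses and CIDR ranges, matching needs to be network-aware. A sketch of how this can be done with the standard library (`ip_allowed` is a hypothetical helper name, not necessarily the backend's actual function):

```python
import ipaddress

def ip_allowed(client_ip: str, whitelist: list[str]) -> bool:
    """Return True if client_ip matches any whitelist entry (single IP or CIDR range)."""
    addr = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        entry = entry.strip()
        if not entry:
            continue
        # ip_network accepts both bare addresses ("127.0.0.1" -> /32) and CIDR
        # ranges ("10.0.0.0/8"); mixed IPv4/IPv6 containment checks simply
        # yield False rather than raising.
        if addr in ipaddress.ip_network(entry, strict=False):
            return True
    return False

whitelist = ["127.0.0.1", "::1", "192.168.1.0/24", "10.0.0.0/8"]
print(ip_allowed("192.168.1.42", whitelist))  # True
print(ip_allowed("8.8.8.8", whitelist))       # False
```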
The frontend automatically connects to http://localhost:8000 in development mode.
## Running the Application

- Start the Backend:

  ```bash
  cd backend
  source venv/bin/activate
  python -m app.main
  ```

- Start the Frontend (in a new terminal):

  ```bash
  cd frontend
  npm run dev
  ```

- Access the Application:
  - Frontend: http://localhost:5173
  - Backend API: http://localhost:8000
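With both servers up, the API can also be exercised from Python. A stdlib-only client sketch; the endpoint, header name, and demo key come from this README's config and API sections, while `build_request` and `ask` are illustrative names:

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/generate"

def build_request(prompt: str, api_key: str = "demo-key-12345") -> urllib.request.Request:
    """Build the authenticated POST request for /api/generate."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps({"prompt": prompt}).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )

def ask(prompt: str, api_key: str = "demo-key-12345") -> str:
    """Send the prompt and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return json.load(resp)["response"]
```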
## Deployment

Use the provided Fabric deployment scripts:

```bash
cd backend
fab setup --host=your-server.com --user=deploy
fab deploy --host=your-server.com --user=deploy
```

## API Reference

### POST /api/generate

Headers:

```
Content-Type: application/json
X-API-Key: your-api-key
```

Request Body:

```json
{
  "prompt": "Your question or prompt here"
}
```

Response:

```json
{
  "response": "AI generated response"
}
```

## Model Information

This application uses Qwen2-0.5B-Instruct, a compact yet powerful language model:
- Size: ~500MB
- Context Length: 32K tokens
- Languages: Multilingual support
- Performance: Optimized for efficiency and speed
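Qwen2-Instruct models expect prompts in the ChatML format, which is normally produced for you by the tokenizer's `apply_chat_template` method. A hand-rolled sketch of the single-turn layout, for illustration only:

```python
def build_chatml_prompt(user_prompt: str,
                        system_prompt: str = "You are a helpful assistant.") -> str:
    """Format a single-turn conversation in the ChatML style used by Qwen2-Instruct."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(build_chatml_prompt("What is FastAPI?"))
```

In production code, prefer `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` so the template always matches the model's own configuration.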
The model will be automatically downloaded on first run to the `models/` directory. This includes:
- Model weights
- Tokenizer files
- Configuration files
- Download Size: ~500MB
- Disk Space Required: ~1GB (including cache)
## Security Features

- API Key Authentication: All requests require valid API keys
- IP Whitelisting: Restrict access to specific IP addresses/ranges
- Development Mode: Automatic localhost access in dev mode
- CORS Protection: Configured for secure cross-origin requests
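File-backed key validation of this kind can be sketched as follows; the function names are illustrative, not the backend's actual ones, and the constant-time comparison is a standard hardening choice:

```python
import secrets

def load_api_keys(path: str) -> set[str]:
    """Read one API key per line, skipping blanks and comment lines."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f
                if line.strip() and not line.lstrip().startswith("#")}

def key_is_valid(candidate: str, keys: set[str]) -> bool:
    """Compare in constant time to avoid leaking key contents via timing."""
    return any(secrets.compare_digest(candidate, k) for k in keys)
```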
## Production Deployment

### Fabric Deployment

- Initial Server Setup:

  ```bash
  fab setup --host=your-server.com --user=deploy
  ```

- Deploy Application:

  ```bash
  fab deploy --host=your-server.com --user=deploy
  ```

- Available Commands:

  ```bash
  fab status         # Check application status
  fab logs           # View application logs
  fab rollback       # Rollback to previous version
  fab backup_config  # Backup configuration files
  ```

### Manual Deployment

- Server Requirements:
  - Ubuntu 20.04+ or similar
  - Python 3.8+
  - Node.js 16+
  - Nginx
  - PM2 (for process management)

- Setup Steps:
  1. Clone repository to `/var/www/self-hosted-budget-ai-api`
  2. Install dependencies
  3. Configure Nginx reverse proxy
  4. Start services with PM2
## Frontend Features

- Modern Design: Glassmorphism UI with gradient backgrounds
- Responsive: Works on desktop, tablet, and mobile
- Animations: Smooth transitions with Framer Motion
- Dark Theme: Beautiful dark theme with purple/cyan accents
- Real-time Chat: Interactive chat interface
- Loading States: Animated loading indicators
- Error Handling: User-friendly error messages
- Copy Functionality: One-click copying of AI responses
## Project Structure

```
self-hosted-budget-ai-api/
├── backend/
│   ├── app/
│   │   ├── main.py            # FastAPI application
│   │   ├── models.py          # AI model handling
│   │   ├── auth.py            # Authentication
│   │   └── config.py          # Configuration
│   ├── config/                # Configuration files
│   ├── deploy/                # Deployment scripts
│   └── requirements.txt       # Python dependencies
├── frontend/
│   ├── src/
│   │   ├── App.jsx            # Main React component
│   │   └── main.jsx           # React entry point
│   ├── package.json           # Node.js dependencies
│   └── tailwind.config.js     # Tailwind configuration
└── nginx/                     # Nginx configuration
```
## Development

- Backend: Add new endpoints in `app/main.py`
- Frontend: Modify `src/App.jsx` for UI changes
- Styling: Use Tailwind CSS classes
- Deployment: Update Fabric scripts as needed
## Troubleshooting

- Model Download Fails:
  - Check internet connection
  - Ensure sufficient disk space
  - Try clearing the `models/` directory

- CUDA Out of Memory:
  - Reduce `MAX_NEW_TOKENS` in `.env`
  - Use CPU inference by setting `CUDA_VISIBLE_DEVICES=""`

- Frontend Build Fails:
  - Clear `node_modules` and reinstall: `rm -rf node_modules && npm install`
  - Check Node.js version compatibility

- API Authentication Errors:
  - Verify API key in `config/api_keys.txt`
  - Check IP whitelist in `config/whitelist.txt`
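The `CUDA_VISIBLE_DEVICES=""` tip works because an empty value hides all GPUs from the process. A hedged sketch of how a backend might pick its inference device (`pick_device` is a hypothetical helper, not this project's actual code):

```python
import os

def pick_device() -> str:
    """Choose 'cuda' only when a GPU is visible and usable; otherwise 'cpu'."""
    if os.environ.get("CUDA_VISIBLE_DEVICES") == "":
        return "cpu"  # GPUs explicitly hidden, as in the troubleshooting tip
    try:
        import torch  # optional dependency in this sketch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"
```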
## Performance Tips

- GPU Acceleration: Ensure CUDA is properly installed
- Model Caching: Keep the `models/` directory for faster startup
- Memory Management: Monitor system resources during inference
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## Support

For issues and questions:
- Create an issue on GitHub
- Check the troubleshooting section
- Review the API documentation
## Acknowledgments

- Qwen Team: For the excellent Qwen2-0.5B-Instruct model
- Hugging Face: For the transformers library
- FastAPI: For the amazing web framework
- React & Tailwind: For the beautiful frontend stack