A powerful, multi-backend chatbot with an enhanced Gradio interface that connects to 5 different AI providers with 100+ free models.
⚠️ Important: AI model availability changes frequently. Provider APIs may add, remove, or rename models at any time. The model lists in this README are current as of December 2025 but may not reflect real-time availability.
- Ollama - Run models locally (100% free, private, no API keys)
- OpenRouter - 29 free models with `:free` suffix
- GitHub Models - 23 free tier models for prototyping
- Groq - 19 models with ultra-fast inference (14,400 free req/day)
- Gemini - 17 Google AI models (1,500 free req/day)
- Modern Gradio UI with custom theme
- Real-time API status indicators
- Model dropdown with 100+ options
- System prompt customization
- Temperature & max tokens controls
- Preset prompts (Code Expert, Writer, Analyst, Teacher)
- Export chat to Markdown
- Stop generation button
- Clear chat functionality
- Streaming responses for real-time interaction
- Conversation history support
- Empty choices safety checks (fixed streaming issues)
- Custom system prompts per conversation
- Temperature control for creativity adjustment
- Error handling with helpful suggestions
- Python 3.8+
- Jupyter Notebook or VS Code with Jupyter extension
- Internet connection (for cloud providers)
- API Keys (optional, depending on providers)
git clone https://github.com/M-F-Tushar/Multi-Backend-Chatbot-with-Gradio.git
cd Multi-Backend-Chatbot-with-Gradio
pip install gradio openai python-dotenv google-generativeai
Or run the first code cell in the notebook:
%pip install google-generativeai openai python-dotenv gradio -q
Create a .env file in the project directory:
# Copy the example template
cp .env.example .env
# Edit .env with your favorite text editor
notepad .env # Windows
nano .env     # Linux/Mac
Add your API keys (get them from the links below). Only add keys for providers you want to use:
# OpenRouter (29 free models)
OPENROUTER_API_KEY=sk-or-v1-your-key-here
# GitHub Models (23 free models)
GITHUB_TOKEN=ghp_your-token-here
# Groq (19 free models, ultra-fast)
GROQ_API_KEY=gsk_your-key-here
# Google Gemini (17 models)
GOOGLE_API_KEY=AIzaSy-your-key-here
Note: Ollama doesn't require an API key (local only).
Option A: Jupyter Notebook
jupyter notebook Chatbot.ipynb
Option B: VS Code
- Open Chatbot.ipynb in VS Code
- Select Python kernel
- Run all cells (Ctrl+Shift+Enter)
Option C: JupyterLab
jupyter lab
The interface will launch at http://127.0.0.1:7860
# Install Ollama
# Windows/Mac: Download from https://ollama.ai
# Linux:
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model
ollama pull llama3.2:1b
# Start Ollama (usually auto-starts)
ollama serve
Models Available: 13 models including Llama, Mistral, CodeLlama, Phi, Gemma, Qwen, DeepSeek
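Once `ollama serve` is running, the chatbot talks to it over HTTP on Ollama's default local port. The sketch below (our own illustrative helpers, using only the standard library) shows the request shape for Ollama's `/api/chat` endpoint; `ask_ollama` requires a running server, so only call it with Ollama up:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_ollama_request(model, messages, stream=False):
    """Build the JSON body Ollama's /api/chat endpoint expects."""
    return {"model": model, "messages": messages, "stream": stream}

def ask_ollama(model, prompt):
    """Send one chat turn to a locally running `ollama serve`."""
    body = build_ollama_request(model, [{"role": "user", "content": prompt}])
    req = request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

Because no key or network account is involved, this backend works fully offline.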
- Visit https://console.groq.com
- Sign up with GitHub/Google
- Go to "API Keys" → Create new key
- Copy key to .env as GROQ_API_KEY
Free Tier Limits:
- 30 requests/minute
- 14,400 requests/day
- 19 models including Llama 3.3 70B, Mixtral, Gemma, Qwen
- Generate a Personal Access Token:
- Go to https://github.com/settings/tokens
- Click "Generate new token (classic)"
- Select scopes: read:user, read:org
- Generate and copy token
- Add to .env as GITHUB_TOKEN
Free Tier Limits:
- 10-15 requests/minute (varies by model)
- 50-150 requests/day
- 23 models including GPT-4o, Llama, Phi, Mistral, DeepSeek
- Visit https://aistudio.google.com/app/apikey
- Sign in with Google account
- Click "Create API Key"
- Copy key to .env as GOOGLE_API_KEY
Free Tier Limits:
- 15 requests/minute
- 1,500 requests/day
- 17 models including Gemini 2.5, Gemini 3 Preview, Gemma
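Gemini is the one backend that is not OpenAI-compatible: the google-generativeai chat API uses the role name "model" (not "assistant") and wraps message text in a "parts" list, while system text is supplied separately. A small conversion helper (our own illustrative function) bridges the two formats:

```python
def to_gemini_history(openai_messages):
    """Convert OpenAI-style chat messages to the role/parts shape used by
    google-generativeai chat sessions ('assistant' becomes 'model')."""
    history = []
    for msg in openai_messages:
        if msg["role"] == "system":
            # Gemini takes system text via system_instruction, not history
            continue
        role = "model" if msg["role"] == "assistant" else "user"
        history.append({"role": role, "parts": [msg["content"]]})
    return history
```

The resulting list can seed a `start_chat(history=...)` session in google-generativeai.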
- Visit https://openrouter.ai
- Sign up (email/Google/GitHub)
- Go to "Keys" → Create new key
- Copy key to .env as OPENROUTER_API_KEY
Free Tier Notes:
- Must append :free to model IDs
- ~20 requests/minute
- ~50 requests/day per model
- 29 models including Llama 3.3 70B, Gemini, DeepSeek R1, Qwen, Gemma
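All of the cloud backends except Gemini speak the OpenAI chat-completions protocol, so switching backends is mostly a matter of swapping the base URL and key. The sketch below is illustrative: the Groq and OpenRouter base URLs are their documented OpenAI-compatible endpoints, and the GitHub Models URL is the commonly documented one (verify against current provider docs before relying on it):

```python
# backend -> (OpenAI-compatible base URL, env var holding the key)
PROVIDERS = {
    "Groq": ("https://api.groq.com/openai/v1", "GROQ_API_KEY"),
    "OpenRouter": ("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY"),
    "GitHub": ("https://models.inference.ai.azure.com", "GITHUB_TOKEN"),
}

def normalize_model_id(backend, model_id):
    """OpenRouter's free tier requires the ':free' suffix on model IDs;
    other backends use the ID as-is."""
    if backend == "OpenRouter" and not model_id.endswith(":free"):
        return model_id + ":free"
    return model_id
```

With this table, one `openai.OpenAI(base_url=..., api_key=...)` client constructor covers three providers.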
⚠️ Note: Model availability changes frequently. The lists below reflect models available as of December 2025. Some models may be added, removed, or renamed by providers. Always check the dropdown in the interface for the most current list.
llama3.2:1b, llama3.2:3b, llama3.1:8b, llama3.1:70b
mistral:7b, mixtral:8x7b, codellama:7b, codellama:34b
phi3:mini, phi3:medium, gemma2:9b, qwen2.5:7b, deepseek-r1:7b
meta-llama/llama-3.3-70b-instruct:free
google/gemini-2.0-flash-exp:free
tngtech/deepseek-r1t2-chimera:free
qwen/qwen3-coder:free
google/gemma-3-27b-it:free
nvidia/nemotron-nano-12b-v2-vl:free
... (24 more, see notebook for full list)
OpenAI: gpt-4o, gpt-4o-mini, o1-mini, o1-preview
Meta Llama: Llama-3.3-70B, Llama-3.2-90B-Vision, Llama-3.1-70B
Microsoft: Phi-4, Phi-3.5-MoE
Mistral: Mistral-Large-2407, Codestral-2501
DeepSeek: DeepSeek-V3, DeepSeek-R1
... (13 more models)
llama-3.3-70b-versatile, llama-3.1-8b-instant
openai/gpt-oss-120b, openai/gpt-oss-20b
gemma2-9b-it, mixtral-8x7b-32768
qwen-2.5-32b, deepseek-r1-distill-llama-70b
whisper-large-v3 (audio)
... (11 more models)
gemini-3-pro-preview (Preview)
gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
gemini-2.0-flash, gemini-2.0-flash-lite
gemma-3-27b-it, gemma-3-8b-it, gemma-2-9b-it
... (8 more models)
- Launch the notebook and run all cells
- Select Backend from dropdown (Ollama/OpenRouter/GitHub/Groq/Gemini)
- Select Model from auto-populated list
- Type your message and press Enter or click Send
Click "Advanced Settings" accordion to access:
- System Prompt - Customize AI behavior
  - Example: "You are a Python expert. Provide code with explanations."
- Temperature (0.0 - 2.0)
  - 0.0-0.5: Focused, deterministic responses
  - 0.7: Balanced (default)
  - 1.0-2.0: Creative, varied responses
- Max Tokens (0 - 4096)
  - 0: Unlimited (default)
  - 512: Short responses
  - 2048: Medium responses
  - 4096: Long responses
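The "0 means unlimited" convention above translates naturally into request parameters: when max tokens is 0, the parameter is simply omitted so the provider applies its own default. A sketch (hypothetical helper name, mirroring the UI sliders):

```python
def generation_kwargs(temperature=0.7, max_tokens=0):
    """Translate the UI sliders into chat-completion parameters.

    max_tokens == 0 means 'unlimited': the parameter is omitted entirely
    so the provider falls back to its own default limit.
    """
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    kwargs = {"temperature": temperature}
    if max_tokens > 0:
        kwargs["max_tokens"] = max_tokens
    return kwargs
```

The returned dict can be splatted straight into an OpenAI-style `chat.completions.create(**kwargs)` call.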
Quick-select specialized AI personas:
- Code Expert - For programming help
- Creative Writer - For storytelling and content
- Data Analyst - For data insights and analysis
- Teacher - For educational explanations
- Click "Export Chat" accordion
- Click "Export to Markdown"
- Copy the formatted conversation
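The export step boils down to flattening the chat history into a Markdown transcript. A minimal sketch, assuming the Gradio "messages" history format (a list of `{"role", "content"}` dicts); the function name and exact layout are ours, not the notebook's:

```python
def export_to_markdown(history, backend, model):
    """Render a chat history (list of {'role', 'content'} dicts)
    as a Markdown transcript with a small metadata header."""
    lines = ["# Chat Export", "",
             f"**Backend:** {backend}  ",
             f"**Model:** {model}", ""]
    for msg in history:
        speaker = "You" if msg["role"] == "user" else "Assistant"
        lines.append(f"**{speaker}:** {msg['content']}")
        lines.append("")  # blank line between turns
    return "\n".join(lines)
```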
✅ FIXED - Empty choices check added to streaming handler
# Make sure Ollama is running
ollama serve
# Check if models are installed
ollama list
# Pull a model if needed
ollama pull llama3.2:1b
- Check .env file exists in project directory
- Verify no extra spaces in API keys
- Restart Jupyter kernel after editing .env
- Run cell 2 again to reload environment
- OpenRouter: Add :free suffix to free models
- GitHub: Use exact model names (e.g., Llama-3.3-70B-Instruct)
- Groq: Use model IDs from the list above
- Gemini: Some models are preview/experimental
- Free tiers have limits (see API Keys section)
- Wait a few minutes or switch to another provider
- Consider upgrading to paid tier for higher limits
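When a provider starts returning rate-limit errors, a simple exponential backoff often rides them out without switching providers. The helpers below are an illustrative sketch (names and retry counts are our own choices, not from the notebook):

```python
import time

def backoff_delays(retries=4, base=2.0):
    """Exponential backoff schedule: 2s, 4s, 8s, 16s by default."""
    return [base * (2 ** i) for i in range(retries)]

def call_with_backoff(fn, retries=4, base=2.0, sleep=time.sleep):
    """Retry fn() on exceptions (e.g. HTTP 429), sleeping longer each time.

    The final attempt is made outside the loop so its error propagates
    to the caller instead of being swallowed.
    """
    for delay in backoff_delays(retries, base):
        try:
            return fn()
        except Exception:
            sleep(delay)
    return fn()
```

In practice you would catch only the provider's rate-limit exception rather than bare `Exception`.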
Multi-Backend-Chatbot-with-Gradio/
├── Chatbot.ipynb    # Main notebook with enhanced Gradio UI
├── README.md        # This documentation
├── .env.example     # Template for API keys (rename to .env and add your keys)
├── .env             # Your API keys (DO NOT COMMIT - protected by .gitignore)
├── .gitignore       # Git configuration to protect sensitive files
└── LICENSE          # MIT License
jupyter notebook Chatbot.ipynb
- Fork this repo to your GitHub account
- Create new Hugging Face Space
- Connect to your GitHub repo
- Add API keys to Space secrets
- Deploy!
This would require converting the Gradio UI to Streamlit, but it is feasible.
The app could also be containerized for deployment on cloud platforms (AWS, Azure, GCP, etc.).
- Frontend: Gradio 5.0+ with custom CSS
- Backend: OpenAI-compatible API clients
- Models: 100+ free models across 5 providers
- Streaming: Real-time token-by-token responses
- Fixed GitHub Models streaming - Added empty choices check
- Enhanced UI - Modern theme, better layout, status indicators
- Model management - Dynamic dropdowns with 100+ models
- Advanced controls - System prompts, temperature, max tokens
- Error handling - Helpful error messages with solutions
gradio>=5.0
openai>=1.0
python-dotenv>=1.0
google-generativeai>=0.3
Found a bug? Want to add features? Here's how to contribute:
git clone https://github.com/YOUR-USERNAME/Multi-Backend-Chatbot-with-Gradio.git
cd Multi-Backend-Chatbot-with-Gradio
git checkout -b feature/your-feature-name
- Add new features, fix bugs, improve documentation
- Test thoroughly with different models
- Update README if adding new capabilities
git add .
git commit -m "Add your descriptive message"
git push origin feature/your-feature-name
- Go to GitHub repo
- Create Pull Request from your fork
- Describe your changes
- Add more AI providers (Anthropic Claude, Mistral API, etc.)
- Vision model support (image inputs)
- Voice input/output support
- Chat history persistence (SQLite/MongoDB)
- Multi-user authentication
- Custom model fine-tuning
- API usage statistics & cost tracking
- Docker containerization
- Streamlit version
- Web server deployment guide
This project is open source and available under the MIT License.
- Gradio - Beautiful web UI framework
- OpenAI - API standard that most providers follow
- Meta, Google, Mistral, Microsoft - Open-source models
- OpenRouter, Groq, GitHub - Free API access
- Check existing issues
- Create a new issue with:
- Python version
- Which provider failed
- Error message (full traceback)
- Steps to reproduce
- Check the Troubleshooting section above
- Verify your API keys in .env
- Ensure all dependencies are up to date:
  pip install --upgrade gradio openai python-dotenv google-generativeai
- Restart Jupyter kernel after editing .env
- Check GitHub Discussions for community help
- Open a GitHub Discussion
- Or create an Issue with a [FEATURE REQUEST] label
- Vision model support (image inputs)
- Voice input/output
- Chat history persistence
- Multi-user support
- Custom model additions
- API usage statistics
Happy Chatting!
Last Updated: December 4, 2025
Repository: https://github.com/M-F-Tushar/Multi-Backend-Chatbot-with-Gradio