A no-code, end-to-end automated machine learning platform for training, evaluating, and deploying ML models—right from your browser.
Short Description: AutoML Studio Pro is an interactive, browser-based platform that automates the complete machine learning workflow—from data upload and exploratory analysis to model training, evaluation, and prediction—without writing a single line of code.
- Overview
- Key Features
- Tech Stack
- Getting Started
- CI & Security Checks
- Project Structure
- Usage
- Architecture
- Project Governance
- Suggested Repository Topics
- Contributing
- Support This Project
- License
- Contact
AutoML Studio Pro eliminates the complexity of building machine learning models. Upload a CSV dataset, select a target column, and let the platform handle the rest—preprocessing, model selection, training, evaluation, and export.
Built with Streamlit and Scikit-Learn, the application is designed for:
- Beginners who want to explore ML without writing code.
- Students looking for an educational tool with exportable Python scripts.
- Practitioners who need quick baseline models and batch-prediction capabilities.
| Feature | Description |
|---|---|
| Automatic Task Detection | Determines whether the problem is classification or regression based on the target column. |
| Automated Preprocessing | Handles missing values, encoding, and feature scaling via Scikit-Learn pipelines. |
| Intelligent Model Selection | Trains multiple models (RandomForest, GradientBoosting, XGBoost, etc.) and selects the best. |
| Imbalanced Data Handling | Applies SMOTE oversampling for skewed classification datasets. |
| Hyperparameter Tuning | Lightweight tuning with Optuna integration for advanced optimization. |
| Ensemble Models | Combine multiple models using voting or stacking ensembles. |
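Automatic task detection of this kind typically keys off the target column's dtype and cardinality. A minimal sketch of the idea (illustrative only — the app's actual heuristic may differ):

```python
import pandas as pd

def detect_task(target: pd.Series, max_classes: int = 20) -> str:
    """Guess whether a target column implies classification or regression.

    Non-numeric targets are treated as classification; numeric targets
    with few unique integer values (e.g. 0/1 labels) are too.
    """
    if not pd.api.types.is_numeric_dtype(target):
        return "classification"
    if target.nunique() <= max_classes and (target.dropna() % 1 == 0).all():
        return "classification"
    return "regression"

print(detect_task(pd.Series(["cat", "dog", "cat"])))  # classification
print(detect_task(pd.Series([1.2, 3.4, 5.6, 7.8])))   # regression
```

The `max_classes` threshold is an assumed cutoff; any real implementation needs a policy for numeric targets that are actually encoded class labels.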
| Feature | Description |
|---|---|
| Exploratory Data Analysis | Generates descriptive statistics, outlier detection, and correlation heatmaps. |
| Advanced EDA Analytics | Six comprehensive tabs: Statistics, Target Analysis, Correlations, Distributions, Data Quality, Variance. |
| Feature Importance (XAI) | Uses permutation importance and SHAP values to explain model predictions. |
| Performance Metrics | Displays confusion matrices, accuracy scores, R² scores, and prediction plots. |
| Cross-Validation Visualization | Bar charts and histograms showing CV scores across models. |
| Feature | Description |
|---|---|
| Polynomial Features | Automatically creates polynomial features to capture non-linear relationships. |
| Interaction Features | Generates feature interactions to discover combined effects. |
| Statistical Aggregations | Creates row-wise statistics (mean, std, min, max) for numeric features. |
| Missing Value Strategies | Choose from median, mean, most_frequent, or constant imputation. |
| Preprocessing Preview | Visual preview of all preprocessing steps before training. |
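The polynomial, interaction, and row-wise aggregation features above can be sketched with Scikit-Learn and NumPy. This is a hypothetical example, not the app's exact transform; `PolynomialFeatures` generates both power and interaction terms in one pass:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1.0, 2.0], [3.0, 4.0]])

# Degree-2 expansion: [x1, x2, x1^2, x1*x2, x2^2]
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

# Row-wise statistical aggregations appended as extra columns
row_stats = np.column_stack([X.mean(axis=1), X.std(axis=1),
                             X.min(axis=1), X.max(axis=1)])
X_aug = np.hstack([X_poly, row_stats])
print(X_aug.shape)  # (2, 9)
```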
| Feature | Description |
|---|---|
| SHAP Explainable AI | Model interpretability with SHAP values and feature explanations. |
| Optuna Optimization | Advanced hyperparameter optimization with configurable trials. |
| NLP/Text Classification | TF-IDF text preprocessing with configurable n-gram ranges. |
| Time Series Forecasting | ARIMA and Exponential Smoothing models for temporal data. |
| Data Versioning | Track and compare datasets across versions with MD5 hashing. |
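The MD5-based versioning idea can be illustrated in a few lines with the standard library's `hashlib` (a sketch of the concept, not the app's implementation):

```python
import hashlib

def dataset_fingerprint(csv_bytes: bytes) -> str:
    """MD5 hash of the raw CSV bytes; identical data -> identical version id."""
    return hashlib.md5(csv_bytes).hexdigest()

v1 = dataset_fingerprint(b"a,b\n1,2\n")
v2 = dataset_fingerprint(b"a,b\n1,3\n")
print(v1 == v2)  # False: any change yields a new fingerprint
```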
| Feature | Description |
|---|---|
| Built-in Datasets | Iris, Wine, Breast Cancer, and Diabetes datasets for quick demos. |
| One-Click Loading | Load sample datasets instantly without uploading files. |
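For reference, these sample datasets ship with Scikit-Learn, so the same data can be loaded outside the app in a couple of lines:

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load Iris as a DataFrame with the target appended as a column
iris = load_iris(as_frame=True)
df = pd.concat([iris.data, iris.target.rename("target")], axis=1)
print(df.shape)  # (150, 5)
```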
| Feature | Description |
|---|---|
| Model Export | Download trained models as .zip archives for reuse. |
| Python Code Export | Export a ready-to-run train_model.py script for learning and customization. |
| Report Export | Generate HTML reports with model details and metrics. |
| Batch Predictions | Upload CSV files to generate predictions at scale. |
| Dynamic Prediction Form | Auto-generated input form based on dataset schema for single predictions. |
| Model History | Track and compare previously trained models with visualizations. |
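An exported model can be reloaded outside the app with Joblib (the serialization library listed in the tech stack). A minimal sketch, using a locally trained stand-in model rather than an actual export:

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Serialize to disk, then reload -- the restored model predicts identically.
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, path)
restored = joblib.load(path)
print((restored.predict(X) == model.predict(X)).all())  # True
```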
| Feature | Description |
|---|---|
| Dark/Light Theme | Toggle between dark and light themes via sidebar. |
| Responsive Design | Optimized for desktop, tablet, and mobile devices. |
| Real-time Feedback | Live status updates during training with progress indicators. |
| Layer | Technology |
|---|---|
| Frontend / UI | Streamlit |
| Machine Learning | Scikit-Learn — RandomForest, GradientBoosting, XGBoost, Pipelines |
| Hyperparameter Optimization | Optuna |
| Explainable AI | SHAP |
| Time Series | Statsmodels — ARIMA, Exponential Smoothing |
| Data Processing | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Model Serialization | Joblib |
| Imbalance Handling | Imbalanced-Learn (SMOTE) |
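These layers compose naturally in a Scikit-Learn pipeline. The following is an illustrative sketch of a preprocessing-plus-model pipeline of the kind described above — not the app's actual pipeline, which lives in `automl_app/core/helpers.py` and is more elaborate:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Numeric columns: median imputation, then standard scaling.
numeric = Pipeline([("impute", SimpleImputer(strategy="median")),
                    ("scale", StandardScaler())])
# Categorical columns: mode imputation, then one-hot encoding.
categorical = Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                        ("encode", OneHotEncoder(handle_unknown="ignore"))])

preprocess = ColumnTransformer([("num", numeric, ["age", "income"]),
                                ("cat", categorical, ["city"])])
model = Pipeline([("prep", preprocess),
                  ("clf", RandomForestClassifier(n_estimators=20, random_state=0))])

# Tiny hypothetical dataset with missing values in every column type
df = pd.DataFrame({"age": [25, np.nan, 40, 33],
                   "income": [50.0, 60.0, np.nan, 45.0],
                   "city": ["NY", "LA", np.nan, "NY"]})
y = [0, 1, 0, 1]
preds = model.fit(df, y).predict(df)
print(len(preds))  # 4
```

Wrapping imputation, encoding, and scaling in one pipeline guarantees the same transforms are applied at training and prediction time.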
- Python 3.9 or higher
- pip (Python package manager)
```bash
# 1. Clone the repository
git clone https://github.com/himanshu231204/AutoML-Studio-Pro-.git
cd AutoML-Studio-Pro-

# 2. (Recommended) Create and activate a virtual environment
python -m venv venv

# Windows
venv\Scripts\activate

# macOS / Linux
source venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Launch the application
streamlit run app.py
```

The application will open automatically in your default browser at http://localhost:8501.
If you prefer running the app in a container, use one of the following options.
Prerequisites:
- Docker Desktop installed and running
Option 1: Docker Compose (recommended)
```bash
# Build and run in detached mode
docker compose up -d --build

# Open the app at http://localhost:8501

# Stop containers
docker compose down
```

Option 2: Docker CLI
```bash
# Build image
docker build -t automl-studio-pro .

# Run container
docker run -d -p 8501:8501 --name automl_studio_pro automl-studio-pro

# Stop and remove container
docker stop automl_studio_pro
docker rm automl_studio_pro
```

Notes:
- The app is exposed on port `8501`.
- `requirements.txt` is UTF-8 encoded for Linux container compatibility.
Troubleshooting:
- Docker command not found: install Docker Desktop, restart your terminal, then run `docker --version`.
- Docker daemon is not running: start Docker Desktop, wait until the status shows "Engine running", then retry.
- Port 8501 already in use: run with a different host port, for example `docker run -d -p 8502:8501 --name automl_studio_pro automl-studio-pro`.
- Container exits immediately: check the logs with `docker logs automl_studio_pro`.
- Dependency changes are not reflected: rebuild the image with `docker compose up -d --build` or `docker build --no-cache -t automl-studio-pro .`.
This repository uses GitHub Actions for linting, tests, and dependency security scanning.
- CI uses Safety to scan dependency files directly:
  - `requirements.txt`
  - `requirements-dev.txt`
- This avoids false failures from transient environment/toolchain packages and keeps scans focused on declared project dependencies.
```bash
ruff check .
python -m compileall app.py automl_app tests
pytest -q
safety check -r requirements.txt
safety check -r requirements-dev.txt
```

```text
├── .github/                      # CI/CD and community files
│   ├── ISSUE_TEMPLATE/           # Bug report and feature request templates
│   └── workflows/                # GitHub Actions pipelines (ci.yml, cd.yml, tests.yml)
├── .streamlit/                   # Streamlit Cloud configuration
│   └── config.toml               # Server and theme settings
├── artifacts/                    # Auto-generated models & schema files
├── assets/
│   └── images/
│       ├── badges/               # Local badge assets (optional)
│       └── screenshots/          # UI screenshots for README/docs
├── automl_app/
│   ├── core/                     # Shared config and helper utilities
│   │   ├── config.py             # Page setup, theming, CSS
│   │   └── helpers.py            # ML utilities (preprocessing, model selection, etc.)
│   └── ui/                       # Reusable UI components and tab modules
│       ├── tabs/                 # Streamlit tab modules
│       │   ├── train.py          # Training tab with all Phase 1 & 2 features
│       │   ├── analysis.py       # Advanced EDA with 6 sub-tabs
│       │   ├── prediction.py     # Single and batch predictions
│       │   ├── manual.py         # User guide
│       │   └── developer.py      # Developer info
│       └── footer.py             # Shared footer component
├── docs/
│   ├── api/                      # API/exported interface docs
│   └── guides/                   # User and developer guides
├── tests/                        # Unit tests
│   ├── test_phase1_features.py   # Phase 1 feature tests
│   ├── test_helpers.py           # Helper function tests
│   └── test_train_utils.py       # Training utility tests
├── app.py                        # Main Streamlit application
├── Dockerfile                    # Multi-stage Docker build
├── docker-compose.yml            # Docker Compose configuration
├── requirements.txt              # Production dependencies
├── requirements-dev.txt          # Development dependencies
├── FEATURES_ROADMAP.md           # Feature roadmap and status
├── CHANGELOG.md                  # Version history
└── README.md                     # Project documentation
```
- `app.py` initializes the page and routes each Streamlit tab to its dedicated module.
- `automl_app/core` holds reusable configuration and ML utility functions.
- `automl_app/ui/tabs` keeps each product area isolated for easier maintenance.
- `automl_app/ui` contains shared UI components used across the app.
Upload a CSV file or load a sample dataset, then configure your training:
- Select Target Column — Choose the column to predict
- Training Mode — Fast or High Accuracy mode
- Missing Value Strategy — Choose imputation method (median/mean/most_frequent/constant)
- Feature Engineering — Enable polynomial features, interactions, or aggregations
- Advanced AutoML — Enable Optuna for hyperparameter optimization
- Ensemble Models — Combine multiple models with voting/stacking
- NLP/Text — Enable TF-IDF for text classification
- Time Series — Enable ARIMA/ETS for temporal forecasting
Click Start Training to run the AutoML pipeline. View results including:
- Model leaderboard with cross-validation scores
- Feature importance and SHAP explanations
- Confusion matrices and prediction plots
- Model history and comparison dashboard
Download the trained model or export the equivalent Python code.
Explore the uploaded dataset through six comprehensive analysis tabs:
- Statistics — Skewness, kurtosis, and detailed statistical summaries
- Target Analysis — Class balance detection and distribution visualization
- Correlations — Feature correlations with heatmap visualization
- Distributions — Histograms, KDE, and Q-Q plots
- Data Quality — Completeness, uniqueness, and quality scores
- Variance Analysis — Feature variance contribution analysis
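The skewness and kurtosis figures in the Statistics tab are the standard Pandas sample statistics. For intuition, here is a sketch on hypothetical right-skewed data, where both statistics come out clearly positive:

```python
import numpy as np
import pandas as pd

# Exponential data is right-skewed: long tail of large values.
s = pd.Series(np.random.default_rng(42).exponential(size=2000))
print(f"skew={s.skew():.2f}, kurtosis={s.kurt():.2f}")
```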
Load a previously trained model or use the current session model:
- Single Predictions — Dynamic form based on the dataset schema
- Batch Predictions — Upload a CSV for bulk inference
- Model Export — Download as a `.zip` archive
- Report Export — Generate an HTML report with metrics
For technical design details, see ARCHITECTURE.md.
- Contribution workflow: CONTRIBUTING.md
- Community standards: CODE_OF_CONDUCT.md
automl · machine-learning · no-code · streamlit · scikit-learn · data-science · python · automated-machine-learning · eda · classification · regression · model-training · feature-importance · smote · gradient-boosting · shap · optuna · time-series · nlp · feature-engineering · explainable-ai · hyperparameter-optimization
Contributions are welcome. Please read CONTRIBUTING.md before opening a PR.
Quick start:
- Fork the repository.
- Create a feature branch (`git checkout -b feature/your-feature`).
- Commit your changes (`git commit -m "Add your feature"`).
- Push to the branch (`git push origin feature/your-feature`).
- Open a Pull Request.
When you open a PR, GitHub will auto-load the pull request template to keep reviews consistent.
Please use the GitHub issue templates for bug reports and feature requests:
- Bug report template for reproducible defects.
- Feature request template for enhancements and roadmap ideas.
You can open a new issue here: Issues.
If this project helped you, consider supporting my work!
Every contribution helps me:
- ⏰ Spend more time on open-source
- 🆓 Keep all tools free for everyone
- 📚 Create more tutorials and guides
- 🚀 Build new developer tools
⭐ Star this repo if you find it useful — it means a lot!
This project is licensed under the MIT License. See the LICENSE file for details.
| Platform | Link |
|---|---|
| GitHub | himanshu231204 |
| himanshu231204 | |
| X (Twitter) | himanshu231204 |