A Machine Learning project that predicts student academic performance using academic, demographic, and behavioral attributes. The goal is to identify at-risk students early and support educators in making data-driven academic decisions.
Student performance depends on multiple factors such as study habits, attendance, previous grades, and demographic background. Educational institutions often have difficulty identifying struggling students early.
This project uses Machine Learning models to analyze student data and predict academic performance so that timely interventions can be made.
The dataset contains 1000+ student records with attributes that may affect academic performance, including:
- Study Hours
- Attendance
- Previous Grades
- Demographic Information
- Behavioral Attributes
These features help the model learn patterns that influence student academic outcomes.
The project follows a standard Machine Learning workflow:
- Imported dataset containing student academic records.
- Data cleaning
- Handling missing values
- Feature selection
- Data transformation
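
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`study_hours`, `attendance`, `gender`, `final_score`), the toy records, and the choice of median/mode imputation are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records; the real columns in student_data.csv may differ.
raw = pd.DataFrame({
    "study_hours": [5.0, np.nan, 2.0, 8.0],
    "attendance": [90.0, 75.0, np.nan, 95.0],
    "gender": ["F", "M", None, "F"],
    "final_score": [78.0, 64.0, 51.0, 88.0],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates, fill missing values, and one-hot encode categoricals."""
    df = df.drop_duplicates().copy()
    numeric_cols = df.select_dtypes(include="number").columns
    categorical_cols = df.columns.difference(numeric_cols)

    # Numeric gaps: fill with the column median (assumed strategy).
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

    # Categorical gaps: fill with the most frequent value (assumed strategy).
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # Data transformation: one-hot encode the categorical columns.
    return pd.get_dummies(df, columns=list(categorical_cols))


clean = preprocess(raw)
print(clean)
```

Other imputation or encoding strategies may suit the real dataset better; the point is only to show where each step in the list fits.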
EDA was performed using:
- Pandas
- NumPy
- Matplotlib
- Seaborn
This helped identify relationships between study habits, attendance, and exam performance.
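
As a rough sketch of this kind of analysis, the snippet below computes summary statistics and a correlation matrix with Pandas. The data is synthetic and generated only for illustration, and the column names are assumed; it does not reproduce the project's dataset or findings.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; not the project's dataset.
rng = np.random.default_rng(0)
n = 200
study_hours = rng.uniform(0, 10, n)
attendance = rng.uniform(50, 100, n)
# The score is constructed to depend on both features plus noise.
final_score = 30 + 4 * study_hours + 0.3 * attendance + rng.normal(0, 5, n)

df = pd.DataFrame({
    "study_hours": study_hours,
    "attendance": attendance,
    "final_score": final_score,
})

# Summary statistics for each column.
print(df.describe())

# Pairwise Pearson correlations; inspect how each feature relates to the score.
corr = df.corr()
print(corr["final_score"].sort_values(ascending=False))
```

With Seaborn installed, `sns.heatmap(corr, annot=True)` is one way to visualize the same matrix.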
Created relevant variables to improve model learning and prediction capability.
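
The README does not list which variables were created, so the following is only a sketch of what derived features could look like. Every feature name here (`effective_study`, `attendance_band`, `low_previous_grade`), the bin edges, and the threshold of 50 are hypothetical.

```python
import pandas as pd

# Hypothetical toy records with assumed column names.
df = pd.DataFrame({
    "study_hours": [2.0, 6.0, 9.0],
    "attendance": [55.0, 80.0, 96.0],
    "previous_grade": [48.0, 70.0, 91.0],
})

# Interaction feature: study hours scaled by attendance percentage.
df["effective_study"] = df["study_hours"] * df["attendance"] / 100

# Binned feature: group attendance into coarse bands (bin edges are assumed).
df["attendance_band"] = pd.cut(
    df["attendance"],
    bins=[0, 60, 85, 100],
    labels=["low", "medium", "high"],
)

# Flag feature: 1 if the previous grade is below an assumed threshold of 50.
df["low_previous_grade"] = (df["previous_grade"] < 50).astype(int)

print(df)
```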
Implemented Machine Learning models using Scikit-Learn:
- Linear Regression
- Random Forest
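
A minimal sketch of training both models with Scikit-learn is shown below. The feature matrix is synthetic, and the hyperparameters (`n_estimators=100`, an 80/20 split, fixed random seeds) are assumptions rather than the project's actual settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic features for illustration: study hours, attendance, previous grade.
rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.uniform(0, 10, n),
    rng.uniform(50, 100, n),
    rng.uniform(40, 100, n),
])
y = 10 + 3 * X[:, 0] + 0.2 * X[:, 1] + 0.4 * X[:, 2] + rng.normal(0, 5, n)

# Hold out 20% of the records for testing (assumed split ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Fit each model on the training split and predict on the held-out split.
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
```

Because this toy target is generated from a linear formula, any comparison between the two models here says nothing about how they rank on the real student data.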
Model performance was evaluated using:
- R² Score
- Mean Absolute Error (MAE)
- Root Mean Square Error (RMSE)
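
The three metrics above can be computed with Scikit-learn as in this sketch. The `evaluate` helper and the four example scores are hypothetical and exist only to show the calls; RMSE is taken as the square root of the mean squared error.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def evaluate(y_true, y_pred) -> dict:
    """Return R², MAE, and RMSE for a set of predictions."""
    return {
        "r2": r2_score(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
        # RMSE is the square root of the mean squared error.
        "rmse": float(np.sqrt(mean_squared_error(y_true, y_pred))),
    }


# Made-up example values, not results from the project.
y_true = np.array([70.0, 80.0, 90.0, 60.0])
y_pred = np.array([72.0, 78.0, 85.0, 65.0])

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Higher R² and lower MAE and RMSE indicate a better fit; MAE and RMSE are expressed in the same units as the predicted score.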
| Category | Tools |
|---|---|
| Programming Language | Python |
| Data Analysis | Pandas, NumPy |
| Data Visualization | Matplotlib, Seaborn |
| Machine Learning | Scikit-learn |
| Environment | Jupyter Notebook |
Key insights from the analysis:
- Students with higher attendance and study hours tend to perform better academically.
- Behavioral and demographic factors influence student performance patterns.
- The Random Forest model achieved better predictive performance than the baseline models.
Possible enhancements for the project:
- Implement advanced models such as XGBoost and other gradient boosting methods
- Deploy the model as a web application
- Create a dashboard for educators
- Integrate real-time academic data
Student-Academic-Performance-Prediction
│
├── dataset
│   └── student_data.csv
│
├── notebooks
│   └── analysis.ipynb
│
├── models
│   └── trained_model.pkl
│
├── README.md
└── requirements.txt
Neha Shit
Computer Science Engineering Student

- GitHub: https://github.com/Neha501
- LinkedIn: https://www.linkedin.com/in/neha-shit
⭐ If you found this project useful, consider giving it a star on GitHub.