| Team Member | Unity ID |
|---|---|
| Apurv Choudhari | apchoudh |
| Chinmay Singhania | csingha |
| Parth Kulkarni | pnkulka2 |
| Madhur Dixit | mvdixit |

This project predicts soccer match outcomes using different datasets and methodologies. Below is an overview of the files in this repository:

- `Match_Outcome_Prediction_Manual_Feature_Engineering.ipynb` (Main File)
  - Implements manual feature engineering techniques for predictive modeling.
  - Uses the `database.sqlite` dataset.
- `Match_Outcome_Prediction_Featuretools.ipynb`
  - Employs automated feature engineering using Featuretools.
  - Also uses the `database.sqlite` dataset.
- `Match_Outcome_Prediction_New_Dataset.ipynb`
  - Works with a simpler dataset derived from 10 CSV files, containing data for seasons 2015-16 to 2024-25.
  - Combines these files into a single dataset for predictions.
- `README.md`
  - Documentation for the project.

Our combined data (contains data from both sources): Combined Data

- Kaggle Dataset: European Soccer Database
  - Format: `database.sqlite`
  - Used in: `Match_Outcome_Prediction_Manual_Feature_Engineering.ipynb` and `Match_Outcome_Prediction_Featuretools.ipynb`.
- Football Data UK: England Matches Dataset
  - Format: 10 CSV files (2015-16 to 2024-25 seasons).
  - Used in: `Match_Outcome_Prediction_New_Dataset.ipynb`.
  - Text file explaining the meanings of the columns for the dataset: Text file with column meanings
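
To get a quick look at what the SQLite file contains before opening the notebooks, the tables and their row counts can be listed with Python's standard `sqlite3` module. This is a minimal sketch and not code taken from the notebooks: the helper name `list_tables` is hypothetical, and no particular table names are assumed.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file.

    Note: sqlite3.connect() creates an empty file if db_path does not exist,
    so check the path first when that matters.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        # Names come from the database's own catalog, so quoting is sufficient.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
```

With `database.sqlite` in the root folder, `list_tables("database.sqlite")` returns a dictionary that can be printed or inspected in a notebook cell.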
- Manual vs. Automated Feature Engineering:
  - `Manual_Feature_Engineering.ipynb` applies manual techniques.
  - `Featuretools.ipynb` uses automated feature extraction.
- Datasets:
  - `database.sqlite` contains a structured database of European soccer matches.
  - The CSV dataset combines English Premier League match results for 10 seasons.
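
To illustrate what a manually engineered feature can look like, the sketch below computes a simple "recent form" value: each team's average goals scored over its previous few matches. It is an illustrative example only and is not taken from the notebooks; the function name `add_recent_form` and the dictionary keys `home`, `away`, `home_goals`, and `away_goals` are hypothetical.

```python
from collections import defaultdict, deque


def _average(values):
    """Mean of a sequence, or None when there is no history yet."""
    return sum(values) / len(values) if values else None


def add_recent_form(matches, window=5):
    """Attach each side's average goals over its previous `window` matches.

    `matches` must be in chronological order. Each match is a dict with the
    (hypothetical) keys: home, away, home_goals, away_goals.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    rows = []
    for match in matches:
        # Compute features BEFORE recording this match, so the result being
        # predicted never leaks into its own features.
        rows.append(
            {
                **match,
                "home_form": _average(history[match["home"]]),
                "away_form": _average(history[match["away"]]),
            }
        )
        history[match["home"]].append(match["home_goals"])
        history[match["away"]].append(match["away_goals"])
    return rows
```

Automated tools such as Featuretools generate many aggregate features of this kind from the table relationships, whereas the manual approach relies on choosing them by hand.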
- Clone the repository.

      git clone <repository-url>
      cd soccer-match-outcome-predictions

- Install dependencies.

      pip install -r requirements.txt

- Ensure the datasets are placed in the correct directories:
  - Download the data files from Combined Data.
  - Place `database.sqlite` in the root folder.
  - Combine the 10 CSV files into one file (`combined_dataset.csv`) using `Match_Outcome_Prediction_New_Dataset.ipynb`.

- Launch Jupyter Notebook.

      jupyter notebook

- Open the main file: `Match_Outcome_Prediction_Manual_Feature_Engineering.ipynb`.
  - Follow the steps in the notebook to preprocess data, train models, and evaluate predictions.

- Optionally, explore other notebooks:
  - `Match_Outcome_Prediction_Featuretools.ipynb` for automated feature engineering.
  - `Match_Outcome_Prediction_New_Dataset.ipynb` for working with the simpler dataset.
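
The CSV combination step is handled by `Match_Outcome_Prediction_New_Dataset.ipynb`. For readers who want to see the idea in isolation, the sketch below shows one possible way to concatenate season files using only the standard library. It is not the notebook's code: the function name `combine_csvs` is hypothetical, and it assumes the files are UTF-8 encoded, sit together in one folder, and may have slightly different columns from season to season.

```python
import csv
from pathlib import Path


def combine_csvs(input_dir, output_path):
    """Concatenate every CSV in `input_dir` into one file; return the row count."""
    output_path = Path(output_path)
    rows, columns = [], []
    for path in sorted(Path(input_dir).glob("*.csv")):
        if path.resolve() == output_path.resolve():
            continue  # skip a combined file left over from an earlier run
        # utf-8-sig drops a leading byte-order mark if the file has one.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            reader = csv.DictReader(handle)
            # Keep the union of all headers, in first-seen order.
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            rows.extend(reader)
    with output_path.open("w", newline="", encoding="utf-8") as handle:
        # restval fills columns a season lacks; extrasaction drops stray
        # unnamed cells from ragged rows instead of raising an error.
        writer = csv.DictWriter(
            handle, fieldnames=columns, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For example, `combine_csvs("data", "combined_dataset.csv")` would merge every CSV in a hypothetical `data` folder; adjust the paths to wherever the season files were downloaded.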
- The notebooks are designed for educational purposes and might require GPU resources for larger datasets.
- Make sure Python 3.8 or later is installed.
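
The Python version requirement can be checked from a notebook cell or script before running anything else. This is a small sketch; the names `MIN_PYTHON` and `python_is_supported` are hypothetical and not part of the repository.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def python_is_supported(version_info=None):
    """Return True if the given (or current) interpreter meets MIN_PYTHON."""
    current = tuple((version_info or sys.version_info)[:2])
    return current >= MIN_PYTHON
```

Calling `python_is_supported()` with no arguments checks the interpreter that is currently running.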