
hgabrali/Advanced-Data-Wrangling-EDA-Automotive-Market-Analysis


🏆 Project Title

Advanced Data Wrangling & EDA: Automotive Market Analysis

Advanced Data Wrangling & EDA Colab Link

📄 Project Abstract (Executive Summary)

This project is a data science sprint focused on cleaning, transforming, and analyzing a real-world automotive dataset containing vehicle specifications, performance metrics, and market pricing. The primary objective is to turn raw data into actionable business intelligence by following established data-wrangling practices.

The core analysis begins by rigorously addressing data quality issues, including handling missing values, correcting critical data types (ensuring 'Year' and performance metrics are numeric), and standardizing text features. This phase is followed by Feature Engineering, where key new metrics like Total MPG and the efficiency ratio Price per HP are calculated to enrich the dataset.
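The cleaning and feature-engineering steps described above can be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the rows are toy values, the `highway MPG` column name is an assumption, missing rows are simply dropped, text is standardized by trimming and lower-casing, and Total MPG is assumed to be city plus highway MPG (the README does not define it).

```python
import pandas as pd

# Toy rows for illustration only; they are not taken from the project's dataset.
raw = pd.DataFrame({
    "Make": [" BMW", "audi ", "Ford", "Ford"],
    "Year": ["2011", "2015", "1992", "2016"],   # stored as text on purpose
    "Engine HP": ["300", "220", None, "150"],   # one missing value
    "city mpg": [19, 24, 17, 28],
    "highway MPG": [26, 31, 22, 36],            # assumed column name
    "MSRP": [46000, 35000, 2000, 21000],
})

NUMERIC_COLS = ["Year", "Engine HP", "city mpg", "highway MPG", "MSRP"]


def clean_and_engineer(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce types, standardize text, drop incomplete rows, add features."""
    out = df.copy()

    # Force numeric columns to numeric dtypes; unparseable values become NaN.
    for col in NUMERIC_COLS:
        out[col] = pd.to_numeric(out[col], errors="coerce")

    # Standardize a text feature (assumed convention: trimmed, lower case).
    out["Make"] = out["Make"].str.strip().str.lower()

    # Assumed missing-value strategy: drop any row with a missing field.
    out = out.dropna().reset_index(drop=True)
    out["Year"] = out["Year"].astype(int)

    # Engineered features. Total MPG as city + highway is an assumption.
    out["Total MPG"] = out["city mpg"] + out["highway MPG"]
    out["Price per HP"] = out["MSRP"] / out["Engine HP"]
    return out


cleaned = clean_and_engineer(raw)
print(cleaned)
```

Dropping rows is the simplest way to reach a dataset with no missing data; imputing values (for example, a median per segment) would be an alternative if too many rows are lost.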

The subsequent Exploratory Data Analysis (EDA) phase focuses on uncovering structural insights: investigating the distribution of fuel efficiency (city mpg), analyzing pricing trends across market segments (Vehicle Size), and determining the correlation between Engine HP, MSRP, and Popularity. Ultimately, the project provides clear answers on how drivetrain types impact price and how transmission types influence fuel economy trends, delivering a clean, fully vetted dataset ready for advanced machine learning model building.
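The EDA questions listed above map onto a handful of pandas aggregations. The sketch below uses toy values and assumes the column names `Transmission Type` and `Driven_Wheels`, which the README does not spell out; it shows the shape of the analysis rather than the project's results.

```python
import pandas as pd

# Toy rows for illustration only; they are not taken from the project's dataset.
cars = pd.DataFrame({
    "Vehicle Size": ["Compact", "Compact", "Midsize", "Midsize", "Large", "Large"],
    "Transmission Type": ["MANUAL", "AUTOMATIC", "AUTOMATIC", "MANUAL",
                          "AUTOMATIC", "AUTOMATIC"],           # assumed column name
    "Driven_Wheels": ["front wheel drive", "front wheel drive", "all wheel drive",
                      "rear wheel drive", "all wheel drive", "rear wheel drive"],  # assumed
    "Engine HP": [140, 160, 200, 250, 300, 400],
    "city mpg": [30, 28, 24, 20, 17, 14],
    "Popularity": [500, 800, 1200, 600, 900, 300],
    "MSRP": [18000, 22000, 30000, 38000, 45000, 70000],
})

# Distribution of fuel efficiency: summary statistics for city mpg.
city_mpg_summary = cars["city mpg"].describe()

# Pricing trends across market segments (Vehicle Size).
median_price_by_size = cars.groupby("Vehicle Size")["MSRP"].median()

# Pairwise Pearson correlation between Engine HP, MSRP, and Popularity.
corr = cars[["Engine HP", "MSRP", "Popularity"]].corr()

# Drivetrain vs. price, and transmission vs. fuel economy.
median_price_by_drivetrain = cars.groupby("Driven_Wheels")["MSRP"].median()
mean_city_mpg_by_transmission = cars.groupby("Transmission Type")["city mpg"].mean()

print(median_price_by_size)
print(corr.round(2))
print(mean_city_mpg_by_transmission)
```

The median is used for price here because MSRP distributions are often skewed by a few very expensive models; the mean would be a reasonable choice for less skewed columns.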


Project Goal:

To clean, transform, and analyze automotive data to identify structural pricing trends, performance relationships (Engine HP vs. MSRP), and market-segment-specific fuel efficiency patterns, thereby supporting informed business decisions in market positioning and product strategy.

📊 Key Deliverables Vetted in This Project:

  1. A fully cleaned and filtered dataset (Year 1995 and later) with no missing data and standardized data types.
  2. Two engineered features (Total MPG, Price per HP) for enhanced analysis.
  3. Five core visualizations detailing distribution, central tendency, and bivariate relationships.
  4. Correlation Analysis between key variables (Engine HP, MSRP, Popularity, MPG).
  5. A written summary outlining key trends discovered (e.g., pricing trends across vehicle-size segments, impact of Engine HP on MSRP).
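The first two deliverables can be spot-checked mechanically. The helper below is a hypothetical sketch (`check_deliverables` is not part of the project) that assumes the cleaned data is a pandas DataFrame with `Year`, `Total MPG`, and `Price per HP` columns.

```python
import pandas as pd


def check_deliverables(df: pd.DataFrame, min_year: int = 1995) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []

    # Deliverable 1: no missing data anywhere in the frame.
    if df.isna().any().any():
        problems.append("missing values present")

    # Deliverable 1: Year is numeric and filtered to min_year and later.
    if not pd.api.types.is_numeric_dtype(df["Year"]):
        problems.append("Year is not numeric")
    elif (df["Year"] < min_year).any():
        problems.append(f"rows earlier than {min_year} present")

    # Deliverable 2: both engineered features exist.
    for col in ("Total MPG", "Price per HP"):
        if col not in df.columns:
            problems.append(f"missing engineered column: {col}")

    return problems


# Toy rows for illustration only.
toy = pd.DataFrame({
    "Year": [1990, 2005, 2017],
    "Total MPG": [40, 50, 60],
    "Price per HP": [100.0, 150.0, 200.0],
})
filtered = toy[toy["Year"] >= 1995]

print(check_deliverables(toy))
print(check_deliverables(filtered))
```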

About

MasterSchool_Advanced Data Wrangling & EDA: Automotive Market Analysis
