Skip to content

KaoneData/bike-share-analysis-sql

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

Data Analysis Case Study: Bike Rental Operations & Growth

SQL Analysis of 15,000+ Rides to Optimize Fleet Logistics and User Strategy

Project Overview

This project provides a deep-dive analysis into a bike-sharing network's operational health and growth patterns. Using a dataset of 15,000 rides across 25 stations, I conducted a full data audit, identified peak demand bottlenecks, and mapped geographic bike migration to provide actionable business recommendations.

🛠️ Technical Stack

  • Database: MySQL 8.0
  • Advanced SQL Techniques:
    • CTEs (Common Table Expressions) for modular, readable logic.
    • Window Functions (LAG, OVER) for Month-over-Month (MoM) growth tracking.
    • Case Statements for custom user segmentation.
    • Time-Series Analysis using TIMESTAMPDIFF and DATE_FORMAT.
    • Net Flow Calculation to identify supply/demand imbalances.

Key Findings

1. The "Leisure-First" Revenue Driver

Analysis of 1,000 users revealed that Casual Riders are the primary drivers of the network's volume and bike utilization, contrary to typical commuter-heavy systems.

Segment Total Rides Avg. Distance Avg. Duration
Casual 10,676 (71%) 7.0 km 34.5 mins
Subscriber 4,324 (29%) 3.0 km 14.5 mins
  • Strategic Insight: Casual riders keep bikes in use 2.4x longer than subscribers, suggesting high demand for recreational routes.

2. Operational Peak: "The 3 PM Bottleneck"

A distribution analysis of trip start times identified a definitive peak at 15:00 (3 PM).

  • Impact: Over 1,600 rides occur in this single hour.
  • Bottleneck: With average rides lasting ~30 minutes, dock availability is at its lowest mid-afternoon, creating a "dock-poor" environment for arriving riders.

3. Geospatial "Tidal Migration" (Net Flow)

By calculating the difference between arrivals and departures, I identified a significant daily "bike migration" from residential hubs to scenic destinations.

Station Type Top Example Net Flow Action Required
Donor (Outflow) Jennifer Land St -66 Needs Morning Refills
Sink (Inflow) Amy Park St +66 Needs Afternoon Pickups

SQL Code Highlight: Growth & Retention

This snippet demonstrates the use of NULLIF to prevent division-by-zero errors when calculating growth percentages for the first month of data.

WITH monthly_signups AS (
    SELECT 
        DATE_FORMAT(created_at, '%Y-%m-01') AS signup_month,
        COUNT(user_id) AS new_user_count
    FROM users
    GROUP BY 1
)
SELECT 
    signup_month,
    new_user_count,
    ROUND((new_user_count - LAG(new_user_count) OVER (ORDER BY signup_month)) / 
    NULLIF(LAG(new_user_count) OVER (ORDER BY signup_month), 0) * 100, 2) AS mom_growth
FROM monthly_signups;

Business Recommendations

  1. Rebalancing Strategy: Deploy maintenance crews to relocate bikes from "Park" and "Summit" stations back to "Harbor" and "Manor" hubs between 6 PM and 8 PM daily.
  2. Conversion Marketing: Launch a "Frequent Rider" discount targeting Casual users who exceed 5km per trip to convert them into high-value Subscribers.
  3. Fleet Maintenance: Prioritize the Top 10 high-volume stations which handle nearly 45% of all mechanical wear-and-tear.

👤 Author

Kaone Edward

About

End-to-end SQL analysis of 15k+ rides to optimize fleet logistics and user growth strategies. Featuring CTEs, Window Functions, and Trend Analysis

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors