Tennis Point Leverage Analysis

A machine learning approach to quantifying point-level pressure in professional tennis. This project uses LSTM neural networks trained on 170,000 service games to measure how players perform under pressure compared to mathematical expectation.

Overview

Tennis scoring creates unequal point importance. While conventional wisdom suggests players "choke" under pressure, the data reveals a more nuanced story: players significantly outperform mathematical expectation when facing break points but underperform when ahead.

This project compares empirical hold probabilities (learned by an LSTM) against theoretical baselines (computed from independence assumptions) to isolate psychological effects.

Key Findings

Players outperform expectation by 2-5% when under pressure (facing break points)
Players underperform by 1-5% when comfortable (holding serve comfortably)
The first point of a service game has disproportionate impact: +25.6% hold rate difference
30-30 is the highest-leverage non-terminal point with 43% swing in hold probability
Small but measurable momentum effect exists: +2.4% win probability after winning previous point

Results

Point Leverage

Point leverage measures the swing in hold probability from winning vs. losing a point. Higher leverage indicates more critical points.

Clutch vs. Choke

Performance deviation compares empirical hold probabilities against theoretical expectation. Positive values indicate clutch performance, negative values indicate underperformance.

Match Prediction

The model provides real-time hold probability predictions as a match progresses, capturing momentum and psychological factors.

Installation

Requirements

Python 3.8 or higher
pip

Setup

Clone the repository:

git clone https://github.com/elementnl/tennis-point-model.git
cd tennis-point-model

Create and activate a virtual environment:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Usage

Training the Model

python src/train.py

Analyzing Point Leverage

python src/importance.py

Processing Match Data

python src/data.py

Model Architecture

2-layer LSTM with 32 hidden dimensions
Input: Point-by-point score sequences (server points, receiver points, point outcome)
Output: Probability that server holds the game
Training: 135,876 games, Validation: 33,970 games
Final validation accuracy: 99.2%
Total parameters: ~6,000

The model is intentionally small to learn generalizable patterns rather than memorize specific matches.

Data Source

This project uses data from the Tennis Abstract Match Charting Project by Jeff Sackmann. The dataset contains point-by-point data for professional tennis matches.

Tools and Technologies

PyTorch: Deep learning framework for LSTM implementation
NumPy: Numerical computation
Pandas: Data processing and analysis
Matplotlib: Visualization
Scikit-learn: Model evaluation metrics
Jupyter: Interactive development and analysis

Project Structure

tennis-point-model/
├── src/
│   ├── data.py          # Data loading and preprocessing
│   ├── dataset.py       # PyTorch dataset implementation
│   ├── model.py         # LSTM model architecture
│   ├── train.py         # Training loop
│   ├── game_model.py    # Game probability calculations
│   └── importance.py    # Point leverage analysis
├── figures/             # Generated visualizations
├── models/              # Saved model checkpoints
├── data/                # Raw data files
└── requirements.txt     # Python dependencies

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use this code or methodology in your research, please cite:

@misc{tennis-point-leverage,
  author = {Varun},
  title = {Quantifying Point Leverage in Tennis: An LSTM Approach},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/elementnl/tennis-point-model}
}

Acknowledgments

Data provided by Jeff Sackmann's Tennis Abstract Match Charting Project.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
docs		docs
figures		figures
models		models
scripts		scripts
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Tennis Point Leverage Analysis

Overview

Key Findings

Results

Point Leverage

Clutch vs. Choke

Match Prediction

Installation

Requirements

Setup

Usage

Training the Model

Analyzing Point Leverage

Processing Match Data

Model Architecture

Data Source

Tools and Technologies

Project Structure

License

Citation

Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Tennis Point Leverage Analysis

Overview

Key Findings

Results

Point Leverage

Clutch vs. Choke

Match Prediction

Installation

Requirements

Setup

Usage

Training the Model

Analyzing Point Leverage

Processing Match Data

Model Architecture

Data Source

Tools and Technologies

Project Structure

License

Citation

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages