@samuelwnaylor samuelwnaylor commented Dec 19, 2025

This PR migrates from pandas to polars for the Pre-Post analysis.

The speed improvements are as follows.

Overall Speed Improvements Reviewing GH Actions Times

One measure of the speed improvement is the time the GitHub Actions take to run: this PR cuts it by ~20%, from ~17.5 minutes to ~14 minutes. The GitHub Action named `tests`, run with Python 3.13, is ~25% faster.

Benchmark Results in the Test Suite

This PR adds benchmarks of the `_pre_post_pp_analysis_with_reversal` functions (both pandas and polars) to the test suite. Run locally, the polars version is ~35% faster than the previous pandas code.

| Name | Min (ms) | Max (ms) | Mean (ms) | StdDev (ms) | Median (ms) | IQR (ms) | Outliers | OPS | Rounds | Iterations |
|---|---|---|---|---|---|---|---|---|---|---|
| test_polars | 72.33 (1.0×) | 76.64 (1.0×) | 74.37 (1.0×) | 1.28 (1.0×) | 74.01 (1.0×) | 1.40 (1.0×) | 3;0 | 13.45 (1.0×) | 13 | 1 |
| test_pandas | 97.56 (1.35×) | 102.10 (1.33×) | 100.29 (1.35×) | 1.42 (1.11×) | 100.08 (1.35×) | 1.88 (1.34×) | 3;0 | 9.97 (0.74×) | 10 | 1 |

Summary: Polars implementation is ~35% faster than pandas (median: 74.01ms vs 100.08ms)
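
For anyone checking the summary figure, here is a minimal sketch of the arithmetic, using only the two median timings from the benchmark table. The variable names are illustrative and not part of the PR. It shows that "~35% faster" matches the run-time ratio (equivalently, the throughput gain), while the time saved per call is closer to 26%.

```python
# Median timings copied from the benchmark table (milliseconds).
polars_median_ms = 74.01
pandas_median_ms = 100.08

# Ratio of run times: how many times longer the pandas version takes per call.
slowdown_ratio = pandas_median_ms / polars_median_ms

# "X% faster" read as a throughput gain: calls per second rise by this fraction.
throughput_gain_pct = (slowdown_ratio - 1) * 100

# The same result read as a reduction in wall-clock time per call.
time_saved_pct = (1 - polars_median_ms / pandas_median_ms) * 100

print(f"pandas / polars ratio: {slowdown_ratio:.2f}x")
print(f"throughput gain: {throughput_gain_pct:.0f}%")
print(f"time saved per call: {time_saved_pct:.0f}%")
```

The two percentages describe the same measurement from different baselines, so it helps to say which one is meant when comparing against the GitHub Actions numbers, which are stated as a reduction in time.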

The benchmark in the test suite shows the function is ~30% faster
than the previous pandas implementation. This function is called many
times in `pre_post_pp_analysis_with_reversal_and_bootstrapping`, so
the overall performance improvement should be significant.
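
To put a rough scale on the repeated-call claim, the sketch below multiplies the per-call saving from the median timings by a number of calls. The call counts are hypothetical, since the number of bootstrap iterations is not stated here, and `estimated_saving_seconds` is an illustrative helper, not a function from the codebase. It also assumes every call costs about the same as in the local benchmark, which may not hold for other data sizes.

```python
def estimated_saving_seconds(
    n_calls: int, old_ms: float = 100.08, new_ms: float = 74.01
) -> float:
    """Wall-clock time saved over n_calls, assuming each call matches the median timings."""
    return n_calls * (old_ms - new_ms) / 1000.0


# HYPOTHETICAL call counts; the real number of bootstrap iterations is an assumption.
for n_calls in (100, 1_000, 10_000):
    saved = estimated_saving_seconds(n_calls)
    print(f"{n_calls:>6} calls: ~{saved:.1f} s saved")
```

Under these assumptions the saving grows linearly with the number of calls, so the benefit is largest for runs with many bootstrap iterations.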
@samuelwnaylor samuelwnaylor marked this pull request as ready for review December 19, 2025 19:24