Feature/asv workflow #41

Open
prasad-sawantdesai wants to merge 38 commits into iterorganization:develop from prasad-sawantdesai:feature/asv_workflow

Conversation

Contributor

@prasad-sawantdesai prasad-sawantdesai commented Feb 24, 2026

GitHub Actions workflow (asv.yml) added to run ASV performance benchmarks on public test data.

run_benchmarks.sh (Bamboo) refactored: adds a proper upstream remote pointing to iterorganization/IBEX, fetches develop/main from there, and returns to the triggering branch before running benchmarks.

configure_env.sh updated for environment-module support.

run_pytest.sh adds pytest-xdist and fixes -n=auto → -n auto for parallel test execution.

build_docs.sh simplifies by replacing explicit package installs with pip install -r ../docs/requirements.txt.

pyproject.toml fixes license metadata for setuptools <77 compatibility and pins sphinx>=7,<9 and sphinx-autosummary-accessors==2025.3.1 for reproducible doc builds.
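
As an illustration, the pins described above could look like this in pyproject.toml. This is a sketch only: the optional-dependency group name `docs` and the surrounding table layout are assumptions, not taken from the actual file.

```toml
# Illustrative sketch only; the real pyproject.toml may organize these differently.
[project.optional-dependencies]
docs = [
    "sphinx>=7,<9",                           # stay within the tested Sphinx majors
    "sphinx-autosummary-accessors==2025.3.1", # exact pin for reproducible doc builds
]
```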

Both the ITER CI plan on Bamboo and the GitHub workflow are up and running.

@maarten-ic

Hi @prasad-sawantdesai,

Please note that GitHub runners are virtual machines and the underlying hardware can vary widely between runs. Benchmark results from different runs are therefore not comparable: a run may have executed on faster or slower hardware, or on more or less heavily loaded physical machines, and you cannot know which. GitHub Actions is therefore not well suited to tracking software performance over time.

Performance comparisons against main or develop in PRs could be done, but then there should also be a process for following up on the results (e.g. fail the CI when performance has degraded "too much" -- and you'll need to decide what threshold you'd find acceptable). I doubt that anyone would notice a performance regression if the benchmark workflow is always green 😉

Hope this helps!
Maarten

…egression gating. Replace ITER-internal URIs with public Zenodo datasets, use asv continuous for same-runner comparisons, fail CI on >2x regression, post PR comments, and apply a CI-specific threshold at runtime.
@prasad-sawantdesai
Contributor Author

> Please note that GitHub runners are virtual machines and the underlying hardware can vary widely between runs. Benchmark results from different runs are therefore not comparable: a run may have executed on faster or slower hardware, or on more or less heavily loaded physical machines, and you cannot know which. GitHub Actions is therefore not well suited to tracking software performance over time.
>
> Performance comparisons against main or develop in PRs could be done, but then there should also be a process for following up on the results (e.g. fail the CI when performance has degraded "too much" -- and you'll need to decide what threshold you'd find acceptable). I doubt that anyone would notice a performance regression if the benchmark workflow is always green 😉
>
> Hope this helps! Maarten

Thanks Maarten. You are right about GitHub runners; we will also check with IT whether we can set up a few self-hosted runners for these specific cases. Instead of cross-run comparisons, we now use asv continuous, which runs both commits back-to-back on the same runner, so the hardware is identical within a single comparison. The added factor of 2.0 makes asv continuous fail if any benchmark is more than 2x slower.

Let's see how it goes. Meanwhile, we will also keep running the ITER CI to run the test cases against the MDSplus and HDF5 backends.
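
For illustration, the gate that asv continuous applies with a factor of 2.0 boils down to a simple ratio check, sketched below. The function and its name are mine for illustration, not ASV's internal API.

```python
def fails_factor_check(time_before: float, time_after: float,
                       factor: float = 2.0) -> bool:
    """Illustrative sketch of the asv continuous gate: flag a benchmark
    whose new timing exceeds `factor` times the baseline timing."""
    return time_after > factor * time_before

# A 2.5x slowdown trips the 2.0 gate; a 1.5x slowdown does not.
print(fails_factor_check(1.0, 2.5))  # True
print(fails_factor_check(1.0, 1.5))  # False
```

Because both timings come from the same back-to-back run on one machine, the ratio is meaningful even though absolute timings vary between runners.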


@deepakmaroo deepakmaroo left a comment


Looks good to me


4 participants