Merged
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Andrew Kern

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
28 changes: 26 additions & 2 deletions README.md
@@ -6,14 +6,38 @@ GPU-accelerated population genetics statistics using CuPy.

## Installation

-pg_gpu uses [pixi](https://pixi.sh) for environment management.
-Requires an NVIDIA GPU.
+pg_gpu requires a Linux x86_64 machine with an NVIDIA GPU and a CUDA 12 driver.
+Nothing else is needed -- the full GPU runtime, including the CUDA toolkit
+headers cupy uses to JIT-compile its kernels, is pulled from PyPI via the
+`cupy-cuda12x[ctk]` dependency.

### With pixi (recommended)

The pinned, reproducible environment is managed with [pixi](https://pixi.sh)
and is the recommended way to install pg_gpu:

```bash
pixi install
pixi shell
```

### Into an existing conda / venv environment

To use pg_gpu from your own workflow (Snakemake, Jupyter, an existing conda
env), install it with pip:

```bash
pip install "git+https://github.com/kr-colab/pg_gpu"
```

This pulls the full runtime stack (cupy-cuda12x with toolkit headers, bio2zarr,
kvikio, nvcomp) as declared in `pyproject.toml`. For development against a local
checkout, use an editable install:

```bash
pip install -e ".[dev]"
```
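
After either install path, a quick sanity check can confirm that the runtime stack is importable and that CuPy can compile a kernel. The following is only a sketch: the helper names `missing_modules` and `gpu_smoke_test` are hypothetical (not part of pg_gpu), and the import names `cupy`, `kvikio`, and `bio2zarr` are assumed from the PyPI distributions listed above.

```python
from importlib.util import find_spec

# Import names assumed from the PyPI distributions named above:
# cupy-cuda12x -> cupy, kvikio-cu12 -> kvikio, bio2zarr -> bio2zarr.
RUNTIME_MODULES = ("cupy", "kvikio", "bio2zarr")


def missing_modules(names=RUNTIME_MODULES):
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


def gpu_smoke_test():
    """Report whether CuPy can see a GPU and JIT-compile a simple kernel."""
    try:
        import cupy as cp
    except ImportError:
        return "cupy is not importable"
    try:
        if cp.cuda.runtime.getDeviceCount() == 0:
            return "no CUDA device visible"
        # cp.unique launches GPU kernels that CuPy compiles at runtime,
        # so it exercises the toolkit headers pulled in by the [ctk] extra.
        values = cp.unique(cp.asarray([3, 1, 3, 2]))
        return f"ok: {cp.asnumpy(values).tolist()}"
    except Exception as exc:  # driver, header, or compilation problems
        return f"GPU check failed: {exc!r}"


if __name__ == "__main__":
    print("missing modules:", missing_modules() or "none")
    print(gpu_smoke_test())
```

If `missing_modules()` returns an empty list and `gpu_smoke_test()` reports `ok`, the install is probably usable. A failure message that mentions missing headers suggests the `[ctk]` extra was not installed.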

## Quick Start

```python
46 changes: 43 additions & 3 deletions docs/source/installation.rst
@@ -7,16 +7,20 @@ For a high-level overview of what pg_gpu is and what it offers, see
Requirements
------------

-* A CUDA 12+ capable NVIDIA GPU
-* `pixi <https://pixi.sh>`_ for environment management
+* A Linux x86_64 machine with a CUDA 12+ capable NVIDIA GPU
+* `pixi <https://pixi.sh>`_ for the recommended environment (or ``pip``
+  into an environment you already manage -- see
+  `Installation into an Existing Environment`_)

Everything else (Python 3.12, CuPy, NumPy, SciPy, the matching CUDA
toolchain) is pinned and installed by ``pixi`` from ``pixi.lock``. We
-require pixi -- not out of caprice, but because building CuPy / CUDA
+recommend pixi -- not out of caprice, but because building CuPy / CUDA
extensions reproducibly is otherwise painful: pixi pulls a portable
NVIDIA toolchain into the project and removes the usual
"works-on-my-machine" tax. If you have never used pixi before, the
`installation page <https://pixi.sh/latest/#installation>`_ is a one-liner.
If you would rather not adopt pixi, pg_gpu is also a standard
pip-installable package; see `Installation into an Existing Environment`_.

Installation with Pixi
----------------------
@@ -75,6 +79,42 @@ environment that has both libraries installed:
See :doc:`tutorials/moments_integration` for the full
demographic-inference walk-through.

Installation into an Existing Environment
-----------------------------------------

If you already manage dependencies with conda, a virtualenv, or another
tool -- for example to call pg_gpu from a Snakemake rule or an existing
Jupyter kernel -- you can install it directly with ``pip`` instead of
adopting pixi:

.. code-block:: bash

   pip install "git+https://github.com/kr-colab/pg_gpu.git"

This pulls the full runtime stack declared in ``pyproject.toml``:
``cupy-cuda12x[ctk]``, ``kvikio`` / ``nvcomp``,
``bio2zarr``, and the usual scientific-Python libraries -- all from the
default PyPI index. The only system requirement is a Linux x86_64 machine
with an NVIDIA CUDA 12 driver; no separate conda or system-wide CUDA
toolkit is needed.

For development against a local checkout, use an editable install with the
``dev`` extra:

.. code-block:: bash

   git clone https://github.com/kr-colab/pg_gpu.git
   cd pg_gpu
   pip install -e ".[dev]"

The optional extras mirror the pixi environments: ``docs`` for the
documentation toolchain and ``moments`` for the moments LD integration
(e.g. ``pip install -e ".[dev,moments]"``).

Pixi remains the recommended, fully pinned environment (see above); the
pip path trades that reproducibility for fitting into an environment you
already control.

Running Tests
-------------

31 changes: 16 additions & 15 deletions pixi.toml
@@ -6,7 +6,7 @@ channels = ["conda-forge", "bioconda"]
platforms = ["linux-64"]

[dependencies]
-python = ">=3.12,<3.13"
+python = ">=3.12"
numpy = ">=2.0"
scipy = ">=1.12"
pandas = ">=2.0"
@@ -19,23 +19,24 @@ pandoc = ">=3.9.0.2,<4"
nbsphinx = ">=0.9.8,<0.10"

[pypi-dependencies]
+# The runtime stack (cupy-cuda12x, bio2zarr[vcf], and the kvikio/nvcomp
+# GPU-decompression libraries) is declared in pyproject.toml's
+# [project.dependencies] so that a plain `pip install` into a conda/venv
+# environment pulls everything. The editable install below brings those
+# transitive deps into the pixi environment too -- pyproject is the single
+# source of truth. All of them (including kvikio-cu12 / nvidia-nvcomp-cu12)
+# resolve from the default PyPI index now, so no extra index is needed.
 pg_gpu = { path = ".", editable = true }
-bio2zarr = { version = ">=0.1", extras = ["vcf"] }
-
-# Streaming-from-zarr uses kvikio + nvCOMP to decode store chunks on the
-# GPU when the codec is GPU-decodable (zstd / blosc / lz4 / deflate).
-# Hard dependency: pg_gpu is GPU-first and the streaming dispatch always
-# imports these unconditionally. Conda-forge's kvikio is cuda13/cp311
-# only at the moment, so we go through NVIDIA's PyPI index.
-"kvikio-cu12" = ">=25.0"
-"nvidia-nvcomp-cu12" = ">=4.0"
-
-[pypi-options]
-extra-index-urls = ["https://pypi.nvidia.com"]

 [feature.gpu.dependencies]
-cupy = ">=13.0"
+# cupy comes from pyproject's cupy-cuda12x pip wheel; cuda-version pins the
+# conda-side CUDA runtime so any conda GPU packages stay on the 12.x ABI.
+# cuda-cudart-dev supplies the toolkit headers (cuda_fp16.h, etc.): cupy's
+# runtime NVRTC kernel compilation discovers the conda CUDA include dir and
+# expects those headers there -- the pip wheel ships runtime libs but not the
+# full toolkit headers, so without this cp.unique/sort kernels fail to build.
 cuda-version = "12.*"
+cuda-cudart-dev = "12.*"
notebook = ">=7.5.5,<8"

[feature.gpu.system-requirements]
@@ -59,7 +60,7 @@ sphinx = ">=4.0"
sphinx-rtd-theme = ">=1.0"

[feature.lint.dependencies]
-python = ">=3.12,<3.13"
+python = ">=3.12"
ruff = ">=0.4"

[environments]
87 changes: 85 additions & 2 deletions pyproject.toml
@@ -2,18 +2,101 @@
requires = ["hatchling"]
build-backend = "hatchling.build"

# Restrict the source distribution to the package and its tests. Without an
# explicit allowlist hatchling sweeps in everything not git-ignored -- stray
# PDFs, scratch data, and local .claude/ config -- which should not ship to
# PyPI. pyproject.toml, README.md, and LICENSE are always included via project
# metadata.
[tool.hatch.build.targets.sdist]
include = [
"/pg_gpu",
"/tests",
]

[project]
name = "pg_gpu"
version = "0.1.0"
description = "GPU-accelerated population genetics statistics"
readme = "README.md"
requires-python = ">=3.12"
license = "MIT"
license-files = ["LICENSE"]
authors = [
-{ name = "Andrew Kern", email = "adk@uoregon.edu" },
+{ name = "Andrew Kern", email = "adkern@uoregon.edu" },
]
maintainers = [
{ name = "Andrew Kern", email = "adkern@uoregon.edu" },
]
keywords = [
"population genetics",
"popgen",
"genomics",
"bioinformatics",
"GPU",
"CUDA",
"CuPy",
]
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Science/Research",
"Topic :: Scientific/Engineering :: Bio-Informatics",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Operating System :: POSIX :: Linux",
"Environment :: GPU :: NVIDIA CUDA :: 12",
]

# pg_gpu is GPU-first: cupy and the kvikio/nvcomp GPU-decompression stack are
# hard runtime requirements, not optional extras. The cupy [ctk] extra pulls
# the CUDA toolkit headers (nvrtc + runtime) from PyPI -- cupy 14 stopped
# bundling them, and they are needed at runtime because cupy JIT-compiles its
# kernels with NVRTC. With [ctk], a plain `pip install` works given only an
# NVIDIA CUDA-12 driver; no separate conda/system CUDA toolkit is required.
# matplotlib/seaborn back the plotting module (imported from __init__); tskit
# is imported directly by HaplotypeMatrix (otherwise only transitive via msprime).
dependencies = [
"numpy>=2.0",
"scipy>=1.12",
"pandas>=2.0",
"matplotlib>=3.7",
"seaborn>=0.12",
"scikit-allel>=1.3",
"msprime>=1.0",
"tskit>=0.5",
"h5py>=3.0",
"tqdm>=4.0",
"zarr>=2.16",
"bio2zarr[vcf]>=0.1",
"cupy-cuda12x[ctk]>=13.0",
"kvikio-cu12>=25.0",
"nvidia-nvcomp-cu12>=4.0",
]

[project.optional-dependencies]
dev = [
"pytest>=7.0",
"pytest-xdist>=3.0",
"ipython>=8.0",
"ipykernel>=6.0",
"ruff>=0.4",
]
docs = [
"sphinx>=4.0",
"sphinx-rtd-theme>=1.0",
"nbsphinx>=0.9.8",
]
moments = [
"moments-popgen",
"demes",
"demesdraw",
]

[project.urls]
-Homepage = "https://github.com/andrewkern/pg_gpu"
+Homepage = "https://github.com/kr-colab/pg_gpu"
Documentation = "https://pg-gpu.readthedocs.io"
Repository = "https://github.com/kr-colab/pg_gpu"
Issues = "https://github.com/kr-colab/pg_gpu/issues"

[tool.pytest.ini_options]
pythonpath = [