157 changes: 157 additions & 0 deletions docs/api.md
@@ -0,0 +1,157 @@
---
title: "API Overview: PyTorch-Wildlife Framework"
description: "How to use the PyTorch-Wildlife API: load detection and classification models, run single-image and batch inference, and export results, with runnable code."
tags:
- PyTorch-Wildlife API
- wildlife AI framework
- conservation deep learning framework
- MegaDetector API
- batch inference
---

# API Overview

PyTorch-Wildlife exposes a small, predictable surface. Once you know the shape of one model, you know them all: detectors and classifiers share the same single-image and batch entry points, and a common set of utilities turns their output into annotated images, crops, and JSON. This page is a guided tour of that API with code you can run, rather than a raw symbol dump. For the full catalog of loadable models, see the [Wildlife Model Zoo](model_zoo.md).

## Package layout

The framework groups everything under a few namespaces:

```python
from PytorchWildlife.models import detection as pw_detection
from PytorchWildlife.models import classification as pw_classification
from PytorchWildlife.models import bioacoustics as pw_bioacoustics
from PytorchWildlife import utils as pw_utils
```

- `models.detection`: bounding-box and point detectors (MegaDetector, Deepfaune, HerdNet).
- `models.classification`: species classifiers for crops or whole images.
- `models.bioacoustics`: audio classifiers.
- `utils`: output helpers for saving images, crops, JSON, and processing video.

## Choosing a device

Every model constructor takes a `device` argument. Use a CUDA GPU when one is available; otherwise the model runs on CPU:

```python
import torch

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
```

If `torch.cuda.is_available()` returns `False` on a machine with an NVIDIA GPU, the [installation guide](installation.md#gpu-setup) explains how to install a CUDA-enabled PyTorch build.

## Detection

Detectors share two methods. `single_image_detection` runs one image; `batch_image_detection` runs a folder or dataloader. Both accept a `conf_thres` confidence threshold (default `0.2`).

```python
from PytorchWildlife.models import detection as pw_detection

detector = pw_detection.MegaDetectorV6(device=DEVICE, version="MDV6-yolov10-e")

# One image
result = detector.single_image_detection("path/to/image.jpg", conf_thres=0.2)

# A whole folder, batched
results = detector.batch_image_detection("path/to/folder", batch_size=16)
```

The returned dictionary carries the detections (boxes, confidences, and class IDs) alongside the image identifier. Class IDs map through `detector.CLASS_NAMES`.
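
As a rough illustration of reading that dictionary, the sketch below pairs each box with its confidence and class name. It assumes the `detections` entry exposes `xyxy`, `confidence`, and `class_id` sequences (as `supervision.Detections` does) and that `CLASS_NAMES` behaves like a dict keyed by class ID. The `FakeDetections` stand-in, the `summarize` helper, the `img_id` key name, and the sample values are all hypothetical, used only so the snippet runs without model weights.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDetections:
    """Hypothetical stand-in for the detections object (not part of the library)."""
    xyxy: list = field(default_factory=list)
    confidence: list = field(default_factory=list)
    class_id: list = field(default_factory=list)


def summarize(result, class_names, min_conf=0.2):
    """Return (class name, confidence, box) tuples at or above min_conf."""
    det = result["detections"]
    rows = []
    for box, conf, cid in zip(det.xyxy, det.confidence, det.class_id):
        if conf >= min_conf:
            rows.append((class_names.get(int(cid), "unknown"), float(conf), list(box)))
    return rows


# Made-up result, shaped like the dictionary described above.
sample = {
    "img_id": "path/to/image.jpg",  # assumed key name for the image identifier
    "detections": FakeDetections(
        xyxy=[[10, 20, 110, 220], [5, 5, 50, 50]],
        confidence=[0.91, 0.12],
        class_id=[0, 1],
    ),
}
names = {0: "animal", 1: "person", 2: "vehicle"}  # assumed class mapping

rows = summarize(sample, names)
for name, conf, box in rows:
    print(f"{name}: {conf:.2f} at {box}")
```

With a real detector, `detector.CLASS_NAMES` would presumably take the place of `names`, and the result of `single_image_detection` the place of `sample`.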

## Classification

Classifiers mirror the detection interface with `single_image_classification` and `batch_image_classification`. They are most often run on the crops a detector produces, which is the standard two-stage pattern in camera-trap analysis.

```python
from PytorchWildlife.models import classification as pw_classification

classifier = pw_classification.AI4GAmazonRainforest(device=DEVICE)
prediction = classifier.single_image_classification("path/to/crop.jpg")
# prediction["prediction"] holds the species label; prediction["confidence"] the score
```

## Detect, then classify

Because both stages share a consistent API, chaining them is straightforward: detect animals, crop each box, and classify the crop.

```python
import supervision as sv
from PytorchWildlife.models import detection as pw_detection
from PytorchWildlife.models import classification as pw_classification

detector = pw_detection.MegaDetectorV6(device=DEVICE, version="MDV6-yolov10-e")
classifier = pw_classification.AI4GOpossum(device=DEVICE)

image = "path/to/image.jpg"
det = detector.single_image_detection(image)

import numpy as np
from PIL import Image
frame = np.array(Image.open(image).convert("RGB"))

for box in det["detections"].xyxy:
    crop = sv.crop_image(image=frame, xyxy=box)
    label = classifier.single_image_classification(crop)
    print(label["prediction"], label["confidence"])
```
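
The control flow of the two-stage pattern can also be sketched independently of any model. The helper below is hypothetical (it is not part of PyTorch-Wildlife): it takes the detect, crop, and classify steps as plain callables and keeps only predictions at or above a confidence cutoff. The stubs stand in for the real detector and classifier; only the `prediction` and `confidence` keys come from the classifier output described above.

```python
def detect_then_classify(image_path, detect, crop, classify, min_conf=0.5):
    """Run detect -> crop -> classify and keep confident predictions.

    detect(path) returns a list of boxes, crop(path, box) returns a crop,
    classify(crop) returns {"prediction": str, "confidence": float}.
    """
    records = []
    for box in detect(image_path):
        label = classify(crop(image_path, box))
        if label["confidence"] >= min_conf:
            records.append({"box": box, **label})
    return records


# Stubs standing in for the real detector, cropper, and classifier.
fake_detect = lambda path: [(0, 0, 10, 10), (5, 5, 20, 20)]
fake_crop = lambda path, box: box
fake_classify = lambda c: {
    "prediction": "opossum" if c[0] == 0 else "other",
    "confidence": 0.9 if c[0] == 0 else 0.3,
}

records = detect_then_classify(
    "path/to/image.jpg", fake_detect, fake_crop, fake_classify
)
print(records)
```

Passing the steps in as callables keeps the filtering logic testable without a GPU or downloaded weights.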

## Saving results

The `utils` module turns raw detections into the artifacts conservation workflows expect. These are the functions the demo scripts use:

```python
from PytorchWildlife import utils as pw_utils

# Annotated images with boxes drawn on
pw_utils.save_detection_images(results, "annotated_output", overwrite=False)

# Cropped detections, one image per animal
pw_utils.save_crop_images(results, "crop_output", overwrite=False)

# Plain JSON
pw_utils.save_detection_json(results, "results.json",
                             categories=detector.CLASS_NAMES)

# Timelapse-compatible JSON for ecologists' existing tooling
pw_utils.save_detection_timelapse_json(results, "results_timelapse.json",
                                       categories=detector.CLASS_NAMES,
                                       info={"detector": "MegaDetectorV6"})
```

For point-based detectors such as HerdNet, the dot-style variants `save_detection_images_dots` and `save_detection_json_as_dots` render and export results as points instead of boxes.

## Video

The `process_video` helper runs any per-frame callback across a video and writes an annotated copy, with a progress bar and selectable codec:

```python
from PytorchWildlife import utils as pw_utils

pw_utils.process_video(
    source_path="input.mp4",
    target_path="output.mp4",
    callback=my_frame_callback,  # takes (frame, index), returns an annotated frame
    target_fps=1,
)
```
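
The callback contract is small enough to sketch without a model. The factory below is a hypothetical helper, not part of `pw_utils`: it wraps any `annotate(frame)` function into a `(frame, index)` callback and optionally records which frame indices it was called with. Strings stand in for real frames so the snippet runs on its own.

```python
def make_frame_callback(annotate, log=None):
    """Wrap annotate(frame) into a (frame, index) -> frame callback."""
    def callback(frame, index):
        if log is not None:
            log.append(index)  # remember which frames were processed
        return annotate(frame)
    return callback


seen = []
identity = lambda frame: frame  # placeholder for real detection and drawing
my_frame_callback = make_frame_callback(identity, log=seen)

# Simulate what a video helper would do: call the callback once per frame.
out = [my_frame_callback(f, i) for i, f in enumerate(["frame0", "frame1", "frame2"])]
print(seen, out)
```

In a real pipeline, `annotate` would run the detector on the frame and draw its boxes before returning the image.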

A complete video pipeline that detects and classifies every frame lives in `demo/video_demo.py`. See [inference examples](inference-examples.md) for the full walkthrough.

## Bioacoustics

Audio classification uses the same package, exposed through the `bioacoustics` namespace. The `ResNetClassifier` supports both binary and multiclass setups:

```python
from PytorchWildlife.models import bioacoustics as pw_bioacoustics

model = pw_bioacoustics.ResNetClassifier(num_classes=2)
```

The framework provides the runtime here; the trained audio models and end-to-end audio pipelines are documented at MegaDetector-Acoustic (documentation coming soon).

## Where to go next

- Browse every loadable model in the [Wildlife Model Zoo](model_zoo.md).
- Follow runnable end-to-end scripts on the [inference examples](inference-examples.md) page.
- Set up your environment with the [installation guide](installation.md).
2 changes: 1 addition & 1 deletion docs/build_mkdocs.md
@@ -57,7 +57,7 @@ The site is available at `http://127.0.0.1:8000/`.

## 4. Deploy to GitHub Pages

Push any change to `docs/**`, `mkdocs.yml`, or `docs-requirements.txt` on the `main` branch GitHub Actions deploys automatically.
Push any change to `docs/**`, `mkdocs.yml`, or `docs-requirements.txt` on the `main` branch. GitHub Actions deploys automatically.

To deploy manually:

81 changes: 44 additions & 37 deletions docs/index.md
@@ -1,79 +1,86 @@
---
description: "PyTorch-Wildlife: unified open-source AI framework from Microsoft AI for Good Lab for camera-trap detection, species classification, and wildlife monitoring."
title: "PyTorch-Wildlife: Conservation Deep Learning Framework"
description: "PyTorch-Wildlife is the open-source conservation deep learning framework and wildlife model zoo from the Microsoft AI for Good Lab. Runs MegaDetector fast."
tags:
- PyTorch-Wildlife
- wildlife AI framework
- conservation deep learning framework
- wildlife model zoo
- pytorchwildlife pip install
- MegaDetector
- wildlife AI
- camera trap detection
- species classification
- conservation AI
- Microsoft AI for Good
---

![PyTorch-Wildlife open-source AI framework for wildlife monitoring from the Microsoft AI for Good Lab](https://zenodo.org/records/15376499/files/Pytorch_Banner_transparentbk.png)
![PyTorch-Wildlife, the open-source conservation deep learning framework from the Microsoft AI for Good Lab](https://zenodo.org/records/15376499/files/Pytorch_Banner_transparentbk.png)

# PyTorch-Wildlife
# PyTorch-Wildlife: A Wildlife AI Framework

> [!TIP]
> PyTorch-Wildlife is part of the [microsoft/Biodiversity](https://github.com/microsoft/Biodiversity) umbrellathe hub for all AI for Good Lab wildlife tools. MegaDetector lives at [microsoft/MegaDetector](https://github.com/microsoft/MegaDetector).
> PyTorch-Wildlife is part of the [microsoft/Biodiversity](https://microsoft.github.io/Biodiversity/) umbrella, the hub for every AI for Good Lab wildlife tool. Looking for the camera-trap detection model on its own? See [MegaDetector](https://microsoft.github.io/MegaDetector/).

**PyTorch-Wildlife is the unified open-source AI framework from the [Microsoft AI for Good Lab](https://www.microsoft.com/en-us/ai/ai-for-good) for wildlife monitoring.** It hosts detection models, species classifiers, and the tools needed to run them — from single-image inference to large-scale batch processing across camera-trap datasets.
**PyTorch-Wildlife is the open-source conservation deep learning framework from the [Microsoft AI for Good Lab](https://www.microsoft.com/en-us/ai/ai-for-good).** One Python package gives you a tested wildlife model zoo, a consistent load-and-run API, and the data utilities that turn a folder of camera-trap images into structured detections. You write a few lines; the framework handles weight downloads, batching, and output formatting.

Our mission is to create a global community where conservation scientists can collaborate — sharing datasets and deep learning architectures for wildlife conservation. PyTorch-Wildlife provides the shared foundation that every project in our ecosystem builds on.
The goal is a shared foundation that conservation scientists can build on together: common model interfaces, reusable training and inference code, and a place to publish new architectures so the whole community benefits. Every modality-focused project in our ecosystem plugs into this framework rather than reinventing it.

## Why a framework, not just a model

A single detection model solves one problem. Real conservation pipelines need detection, classification, batch processing, video support, and exportable results that downstream tools can read. PyTorch-Wildlife packages all of that behind one import:

- **A unified model zoo.** Detection and classification models load with one line and fetch their own weights. Swap `MegaDetectorV6` for `MegaDetectorV5` or a different classifier without rewriting your pipeline.
- **A consistent inference API.** Every detector exposes `single_image_detection` and `batch_image_detection`; every classifier exposes `single_image_classification` and `batch_image_classification`. Learn it once.
- **Conservation-ready outputs.** Built-in utilities save annotated images, cropped detections, and JSON, including a Timelapse-compatible format for ecologists' existing workflows.
- **Framework support across modalities.** Vision detection, species classification, and bioacoustic classifiers all share the same package, so a multi-modal pipeline is a few imports rather than a few dependencies.

## Quick Start

Install the framework from PyPI:

```bash
pip install PytorchWildlife
```

Run detection and classification in a handful of lines. Model weights download automatically on first use:

```python
import numpy as np
from PytorchWildlife.models import detection as pw_detection
from PytorchWildlife.models import classification as pw_classification

# Detection MegaDetector V6, weights download automatically
# Detection with MegaDetector V6
detection_model = pw_detection.MegaDetectorV6()
detection_result = detection_model.single_image_detection("path/to/image.jpg")

# Classification
# Species classification
classification_model = pw_classification.AI4GAmazonRainforest()
classification_result = classification_model.single_image_classification("path/to/image.jpg")
```

**Try without installing:**
- [Hugging Face demo](https://huggingface.co/spaces/ai-for-good-lab/pytorch-wildlife) — upload images in your browser
- [Google Colab notebook](https://colab.research.google.com/drive/1rjqHrTMzEHkMualr4vB55dQWCsCKMNXi?usp=sharing) — free cloud GPU
New to the package? The [installation guide](installation.md) covers GPU setup, Docker, and Windows, and the [API overview](api.md) walks through the detection and classification interfaces with runnable examples.

**Try it without installing anything:**

## What's Inside
- [Hugging Face demo](https://huggingface.co/spaces/ai-for-good-lab/pytorch-wildlife): upload images in your browser
- [Google Colab notebook](https://colab.research.google.com/drive/1rjqHrTMzEHkMualr4vB55dQWCsCKMNXi?usp=sharing): free cloud GPU

PyTorch-Wildlife provides a modular set of building blocks:
## What's Inside

- **Detection models** — MegaDetector V5/V6 (multiple architectures), Deepfaune detector, HerdNet for aerial imagery
- **Classification models** — Amazon Rainforest, Snapshot Serengeti, Opossum, Deepfaune, DFNE (New England)
- **Bioacoustic models** — audio-based wildlife identification
- **Data utilities** — transforms, datasets, batch processing, video support
- **Demo notebooks** — Jupyter notebooks and Gradio web UI for hands-on exploration
PyTorch-Wildlife ships a modular set of building blocks:

See the [Model Zoo](model_zoo.md) for the full list with performance benchmarks.
- **Detection models.** MegaDetector V5 and V6 across several architectures, the Deepfaune detector, and HerdNet for aerial imagery. See the [Wildlife Model Zoo](model_zoo.md).
- **Classification models.** Region-specific species classifiers for the Amazon, the Serengeti, Europe, and more.
- **Bioacoustic models.** A ResNet-based audio classifier for sound-based monitoring.
- **Data and output utilities.** Transforms, datasets, batch dataloaders, video processing, and JSON exporters.
- **Demos.** Jupyter notebooks, runnable Python scripts, and a Gradio web UI. See [inference examples](inference-examples.md).

For the complete list with versions and load commands, head to the [Wildlife Model Zoo](model_zoo.md).

## Part of the Biodiversity Ecosystem
## Related Microsoft biodiversity AI projects

PyTorch-Wildlife is one project in a larger open-source ecosystem from the AI for Good Lab:
PyTorch-Wildlife is the framework layer. The modality-specific tools in the ecosystem each own their domain, and the framework provides support for running them:

| Repo | Purpose |
|---|---|
| [microsoft/Biodiversity](https://github.com/microsoft/Biodiversity) | The umbrella repository — documentation hub for the AI for Good Lab's biodiversity work |
| [microsoft/Pytorch-Wildlife](https://github.com/microsoft/Pytorch-Wildlife) | This repo — the unified deep learning framework |
| [microsoft/MegaDetector](https://github.com/microsoft/MegaDetector) | Animal detection in camera-trap imagery |
| [microsoft/SPARROW](https://github.com/microsoft/SPARROW) | Solar-Powered Acoustic and Remote Recording Observation Watch — AI-enabled edge device |
| [microsoft/MegaDetector-Acoustic](https://github.com/microsoft/MegaDetector-Acoustic) | Bioacoustic models for audio-based wildlife monitoring |
| [microsoft/MegaDetector-Classifier](https://github.com/microsoft/MegaDetector-Classifier) | Camera-trap species classification fine-tuning — adapt classifiers to your own datasets and geographic regions |
| [microsoft/MegaDetector-Overhead](https://github.com/microsoft/MegaDetector-Overhead) | Point-based detection for overhead and aerial imagery |
| [SPARROW Studio](https://github.com/microsoft/Biodiversity/tree/main/SPARROW-Studio) | Desktop application for running all models with a graphical interface |
- [microsoft/Biodiversity](https://microsoft.github.io/Biodiversity/): the umbrella hub documenting every AI for Good Lab biodiversity tool.
- [MegaDetector](https://microsoft.github.io/MegaDetector/): the camera-trap animal detection model, invoked through this framework.
- MegaDetector-Acoustic (documentation coming soon): bioacoustic models for audio-based wildlife monitoring.
- [SPARROW](https://microsoft.github.io/SPARROW/): the solar-powered edge device that runs these models in the field.

> [!TIP]
> If you have any questions, please [email us](mailto:zhongqimiao@microsoft.com) or join us on Discord: [![](https://img.shields.io/badge/any_text-Join_us!-blue?logo=discord&label=PyTorch-Wildlife)](https://discord.gg/TeEVxzaYtm)
> Questions? [Email us](mailto:zhongqimiao@microsoft.com) or join us on Discord: [![Join the PyTorch-Wildlife Discord](https://img.shields.io/badge/any_text-Join_us!-blue?logo=discord&label=PyTorch-Wildlife)](https://discord.gg/TeEVxzaYtm)