From a50ae3cd9a4589f64d07c0dfe6c06f6aa3ca0e46 Mon Sep 17 00:00:00 2001
From: rain-Brian
Date: Wed, 3 Jun 2026 13:22:53 -0700
Subject: [PATCH 1/2] docs(seo): PyTorch-Wildlife content depth + long-tail (framework/model-zoo lane)

Deepen homepage, installation, and model zoo and add API overview + inference examples pages. Standardize prose on 'conservation deep learning framework'; add keyword-first front-matter, model-zoo H1, cross-links to ecosystem siblings as modality owners, and a hub backlink. All examples sourced from real package API and demo scripts.
---
 docs/api.md                | 157 +++++++++++++++++++++++++++++++++++++
 docs/build_mkdocs.md       |   2 +-
 docs/index.md              |  81 ++++++++++---------
 docs/inference-examples.md | 120 ++++++++++++++++++++++++++++
 docs/installation.md       |  69 +++++++++-------
 docs/model_zoo.md          |  64 +++++++++------
 mkdocs.yml                 |   2 +
 7 files changed, 402 insertions(+), 93 deletions(-)
 create mode 100644 docs/api.md
 create mode 100644 docs/inference-examples.md

diff --git a/docs/api.md b/docs/api.md
new file mode 100644
index 0000000..1088028
--- /dev/null
+++ b/docs/api.md
@@ -0,0 +1,157 @@
+---
+title: "API Overview: PyTorch-Wildlife Framework"
+description: "How to use the PyTorch-Wildlife API: load detection and classification models, run single-image and batch inference, and export results, with runnable code."
+tags:
+  - PyTorch-Wildlife API
+  - wildlife AI framework
+  - conservation deep learning framework
+  - MegaDetector API
+  - batch inference
+---
+
+# API Overview
+
+PyTorch-Wildlife exposes a small, predictable surface. Once you know the shape of one model, you know them all: detectors and classifiers share the same single-image and batch entry points, and a common set of utilities turns their output into annotated images, crops, and JSON. This page is a guided tour of that API with code you can run, rather than a raw symbol dump. For the full catalog of loadable models, see the [Wildlife Model Zoo](model_zoo.md).
+
+## Package layout
+
+The framework groups everything under a few namespaces:
+
+```python
+from PytorchWildlife.models import detection as pw_detection
+from PytorchWildlife.models import classification as pw_classification
+from PytorchWildlife.models import bioacoustics as pw_bioacoustics
+from PytorchWildlife import utils as pw_utils
+```
+
+- `models.detection`: bounding-box and point detectors (MegaDetector, Deepfaune, HerdNet).
+- `models.classification`: species classifiers for crops or whole images.
+- `models.bioacoustics`: audio classifiers.
+- `utils`: output helpers for saving images, crops, JSON, and processing video.
+
+## Choosing a device
+
+Every model constructor takes a `device` argument. Use a CUDA GPU when one is available; otherwise the model runs on CPU:
+
+```python
+import torch
+
+DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
+```
+
+If `torch.cuda.is_available()` returns `False` on a machine with an NVIDIA GPU, the [installation guide](installation.md#gpu-setup) explains how to install a CUDA-enabled PyTorch build.
+
+## Detection
+
+Detectors share two methods. `single_image_detection` runs one image; `batch_image_detection` runs a folder or dataloader. Both accept a `conf_thres` confidence threshold (default `0.2`).
+
+```python
+from PytorchWildlife.models import detection as pw_detection
+
+detector = pw_detection.MegaDetectorV6(device=DEVICE, version="MDV6-yolov10-e")
+
+# One image
+result = detector.single_image_detection("path/to/image.jpg", conf_thres=0.2)
+
+# A whole folder, batched
+results = detector.batch_image_detection("path/to/folder", batch_size=16)
+```
+
+The returned dictionary carries the detections (boxes, confidences, and class IDs) alongside the image identifier. Class IDs map through `detector.CLASS_NAMES`.
+
+## Classification
+
+Classifiers mirror the detection interface with `single_image_classification` and `batch_image_classification`. They are most often run on the crops a detector produces, which is the standard two-stage pattern in camera-trap analysis.
+
+```python
+from PytorchWildlife.models import classification as pw_classification
+
+classifier = pw_classification.AI4GAmazonRainforest(device=DEVICE)
+prediction = classifier.single_image_classification("path/to/crop.jpg")
+# prediction["prediction"] holds the species label; prediction["confidence"] the score
+```
+
+## Detect, then classify
+
+Because both stages share a consistent API, chaining them is straightforward: detect animals, crop each box, and classify the crop.
+
+```python
+import numpy as np
+import supervision as sv
+from PIL import Image
+from PytorchWildlife.models import detection as pw_detection
+from PytorchWildlife.models import classification as pw_classification
+
+detector = pw_detection.MegaDetectorV6(device=DEVICE, version="MDV6-yolov10-e")
+classifier = pw_classification.AI4GOpossum(device=DEVICE)
+
+image = "path/to/image.jpg"
+det = detector.single_image_detection(image)
+
+frame = np.array(Image.open(image).convert("RGB"))
+
+for box in det["detections"].xyxy:
+    crop = sv.crop_image(image=frame, xyxy=box)
+    label = classifier.single_image_classification(crop)
+    print(label["prediction"], label["confidence"])
+```
+
+## Saving results
+
+The `utils` module turns raw detections into the artifacts conservation workflows expect. These are the functions the demo scripts use:
+
+```python
+from PytorchWildlife import utils as pw_utils
+
+# Annotated images with boxes drawn on
+pw_utils.save_detection_images(results, "annotated_output", overwrite=False)
+
+# Cropped detections, one image per animal
+pw_utils.save_crop_images(results, "crop_output", overwrite=False)
+
+# Plain JSON
+pw_utils.save_detection_json(results, "results.json",
+                             categories=detector.CLASS_NAMES)
+
+# Timelapse-compatible JSON for ecologists' existing tooling
+pw_utils.save_detection_timelapse_json(results, "results_timelapse.json",
+                                       categories=detector.CLASS_NAMES,
+                                       info={"detector": "MegaDetectorV6"})
+```
+
+For point-based detectors such as HerdNet, the dot-style variants `save_detection_images_dots` and `save_detection_json_as_dots` render and export results as points instead of boxes.
+
+## Video
+
+The `process_video` helper runs any per-frame callback across a video and writes an annotated copy, with a progress bar and selectable codec:
+
+```python
+from PytorchWildlife import utils as pw_utils
+
+pw_utils.process_video(
+    source_path="input.mp4",
+    target_path="output.mp4",
+    callback=my_frame_callback,  # takes (frame, index), returns an annotated frame
+    target_fps=1,
+)
+```
+
+A complete video pipeline that detects and classifies every frame lives in `demo/video_demo.py`. See [inference examples](inference-examples.md) for the full walkthrough.
+
+## Bioacoustics
+
+Audio classification uses the same package, exposed through the `bioacoustics` namespace. The `ResNetClassifier` supports both binary and multiclass setups:
+
+```python
+from PytorchWildlife.models import bioacoustics as pw_bioacoustics
+
+model = pw_bioacoustics.ResNetClassifier(num_classes=2)
+```
+
+The framework provides the runtime here; the trained audio models and end-to-end audio pipelines are documented at [MegaDetector-Acoustic](https://microsoft.github.io/MegaDetector-Acoustic/).
+
+## Where to go next
+
+- Browse every loadable model in the [Wildlife Model Zoo](model_zoo.md).
+- Follow runnable end-to-end scripts on the [inference examples](inference-examples.md) page.
+- Set up your environment with the [installation guide](installation.md).
diff --git a/docs/build_mkdocs.md b/docs/build_mkdocs.md
index 906578b..aa93343 100644
--- a/docs/build_mkdocs.md
+++ b/docs/build_mkdocs.md
@@ -57,7 +57,7 @@ The site is available at `http://127.0.0.1:8000/`.
 
 ## 4. Deploy to GitHub Pages
 
-Push any change to `docs/**`, `mkdocs.yml`, or `docs-requirements.txt` on the `main` branch — GitHub Actions deploys automatically.
+Push any change to `docs/**`, `mkdocs.yml`, or `docs-requirements.txt` on the `main` branch. GitHub Actions deploys automatically.
 
 To deploy manually:
 
diff --git a/docs/index.md b/docs/index.md
index 81acbd6..f867c7e 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,79 +1,86 @@
 ---
-description: "PyTorch-Wildlife: unified open-source AI framework from Microsoft AI for Good Lab for camera-trap detection, species classification, and wildlife monitoring."
+title: "PyTorch-Wildlife: Conservation Deep Learning Framework"
+description: "PyTorch-Wildlife is the open-source conservation deep learning framework and wildlife model zoo from the Microsoft AI for Good Lab. Runs MegaDetector fast."
 tags:
   - PyTorch-Wildlife
+  - wildlife AI framework
+  - conservation deep learning framework
+  - wildlife model zoo
+  - pytorchwildlife pip install
   - MegaDetector
-  - wildlife AI
-  - camera trap detection
   - species classification
-  - conservation AI
-  - Microsoft AI for Good
 ---
 
-![PyTorch-Wildlife — open-source AI framework for wildlife monitoring from the Microsoft AI for Good Lab](https://zenodo.org/records/15376499/files/Pytorch_Banner_transparentbk.png)
+![PyTorch-Wildlife, the open-source conservation deep learning framework from the Microsoft AI for Good Lab](https://zenodo.org/records/15376499/files/Pytorch_Banner_transparentbk.png)
 
-# PyTorch-Wildlife
+# PyTorch-Wildlife: A Wildlife AI Framework
 
 > [!TIP]
-> PyTorch-Wildlife is part of the [microsoft/Biodiversity](https://github.com/microsoft/Biodiversity) umbrella — the hub for all AI for Good Lab wildlife tools. MegaDetector lives at [microsoft/MegaDetector](https://github.com/microsoft/MegaDetector).
+> PyTorch-Wildlife is part of the [microsoft/Biodiversity](https://microsoft.github.io/Biodiversity/) umbrella, the hub for every AI for Good Lab wildlife tool. Looking for the camera-trap detection model on its own? See [MegaDetector](https://microsoft.github.io/MegaDetector/).
 
-**PyTorch-Wildlife is the unified open-source AI framework from the [Microsoft AI for Good Lab](https://www.microsoft.com/en-us/ai/ai-for-good) for wildlife monitoring.** It hosts detection models, species classifiers, and the tools needed to run them — from single-image inference to large-scale batch processing across camera-trap datasets.
+**PyTorch-Wildlife is the open-source conservation deep learning framework from the [Microsoft AI for Good Lab](https://www.microsoft.com/en-us/ai/ai-for-good).** One Python package gives you a tested wildlife model zoo, a consistent load-and-run API, and the data utilities that turn a folder of camera-trap images into structured detections. You write a few lines; the framework handles weight downloads, batching, and output formatting.
 
-Our mission is to create a global community where conservation scientists can collaborate — sharing datasets and deep learning architectures for wildlife conservation. PyTorch-Wildlife provides the shared foundation that every project in our ecosystem builds on.
+The goal is a shared foundation that conservation scientists can build on together: common model interfaces, reusable training and inference code, and a place to publish new architectures so the whole community benefits. Every modality-focused project in our ecosystem plugs into this framework rather than reinventing it.
 
+## Why a framework, not just a model
+
+A single detection model solves one problem. Real conservation pipelines need detection, classification, batch processing, video support, and exportable results that downstream tools can read. PyTorch-Wildlife packages all of that behind one import:
+
+- **A unified model zoo.** Detection and classification models load with one line and fetch their own weights. Swap `MegaDetectorV6` for `MegaDetectorV5` or a different classifier without rewriting your pipeline.
+- **A consistent inference API.** Every detector exposes `single_image_detection` and `batch_image_detection`; every classifier exposes `single_image_classification` and `batch_image_classification`. Learn it once.
+- **Conservation-ready outputs.** Built-in utilities save annotated images, cropped detections, and JSON, including a Timelapse-compatible format for ecologists' existing workflows.
+- **Framework support across modalities.** Vision detection, species classification, and bioacoustic classifiers all share the same package, so a multi-modal pipeline is a few imports rather than a few dependencies.
 
 ## Quick Start
 
+Install the framework from PyPI:
+
 ```bash
 pip install PytorchWildlife
 ```
 
+Run detection and classification in a handful of lines. Model weights download automatically on first use:
+
 ```python
-import numpy as np
 from PytorchWildlife.models import detection as pw_detection
 from PytorchWildlife.models import classification as pw_classification
 
-# Detection — MegaDetector V6, weights download automatically
+# Detection with MegaDetector V6
 detection_model = pw_detection.MegaDetectorV6()
 detection_result = detection_model.single_image_detection("path/to/image.jpg")
 
-# Classification
+# Species classification
 classification_model = pw_classification.AI4GAmazonRainforest()
 classification_result = classification_model.single_image_classification("path/to/image.jpg")
 ```
 
-**Try without installing:**
-- [Hugging Face demo](https://huggingface.co/spaces/ai-for-good-lab/pytorch-wildlife) — upload images in your browser
-- [Google Colab notebook](https://colab.research.google.com/drive/1rjqHrTMzEHkMualr4vB55dQWCsCKMNXi?usp=sharing) — free cloud GPU
+New to the package? The [installation guide](installation.md) covers GPU setup, Docker, and Windows, and the [API overview](api.md) walks through the detection and classification interfaces with runnable examples.
 
+**Try it without installing anything:**
 
-## What's Inside
+- [Hugging Face demo](https://huggingface.co/spaces/ai-for-good-lab/pytorch-wildlife): upload images in your browser
+- [Google Colab notebook](https://colab.research.google.com/drive/1rjqHrTMzEHkMualr4vB55dQWCsCKMNXi?usp=sharing): free cloud GPU
 
-PyTorch-Wildlife provides a modular set of building blocks:
+## What's Inside
 
-- **Detection models** — MegaDetector V5/V6 (multiple architectures), Deepfaune detector, HerdNet for aerial imagery
-- **Classification models** — Amazon Rainforest, Snapshot Serengeti, Opossum, Deepfaune, DFNE (New England)
-- **Bioacoustic models** — audio-based wildlife identification
-- **Data utilities** — transforms, datasets, batch processing, video support
-- **Demo notebooks** — Jupyter notebooks and Gradio web UI for hands-on exploration
+PyTorch-Wildlife ships a modular set of building blocks:
 
-See the [Model Zoo](model_zoo.md) for the full list with performance benchmarks.
+- **Detection models.** MegaDetector V5 and V6 across several architectures, the Deepfaune detector, and HerdNet for aerial imagery. See the [Wildlife Model Zoo](model_zoo.md).
+- **Classification models.** Region-specific species classifiers for the Amazon, the Serengeti, Europe, and more.
+- **Bioacoustic models.** A ResNet-based audio classifier for sound-based monitoring.
+- **Data and output utilities.** Transforms, datasets, batch dataloaders, video processing, and JSON exporters.
+- **Demos.** Jupyter notebooks, runnable Python scripts, and a Gradio web UI. See [inference examples](inference-examples.md).
 
+For the complete list with versions and load commands, head to the [Wildlife Model Zoo](model_zoo.md).
 
-## Part of the Biodiversity Ecosystem
+## Related Microsoft biodiversity AI projects
 
-PyTorch-Wildlife is one project in a larger open-source ecosystem from the AI for Good Lab:
+PyTorch-Wildlife is the framework layer. The modality-specific tools in the ecosystem each own their domain, and the framework provides support for running them:
 
-| Repo | Purpose |
-|---|---|
-| [microsoft/Biodiversity](https://github.com/microsoft/Biodiversity) | The umbrella repository — documentation hub for the AI for Good Lab's biodiversity work |
-| [microsoft/Pytorch-Wildlife](https://github.com/microsoft/Pytorch-Wildlife) | This repo — the unified deep learning framework |
-| [microsoft/MegaDetector](https://github.com/microsoft/MegaDetector) | Animal detection in camera-trap imagery |
-| [microsoft/SPARROW](https://github.com/microsoft/SPARROW) | Solar-Powered Acoustic and Remote Recording Observation Watch — AI-enabled edge device |
-| [microsoft/MegaDetector-Acoustic](https://github.com/microsoft/MegaDetector-Acoustic) | Bioacoustic models for audio-based wildlife monitoring |
-| [microsoft/MegaDetector-Classifier](https://github.com/microsoft/MegaDetector-Classifier) | Camera-trap species classification fine-tuning — adapt classifiers to your own datasets and geographic regions |
-| [microsoft/MegaDetector-Overhead](https://github.com/microsoft/MegaDetector-Overhead) | Point-based detection for overhead and aerial imagery |
-| [SPARROW Studio](https://github.com/microsoft/Biodiversity/tree/main/SPARROW-Studio) | Desktop application for running all models with a graphical interface |
+- [microsoft/Biodiversity](https://microsoft.github.io/Biodiversity/): the umbrella hub documenting every AI for Good Lab biodiversity tool.
+- [MegaDetector](https://microsoft.github.io/MegaDetector/): the camera-trap animal detection model, invoked through this framework.
+- [MegaDetector-Acoustic](https://microsoft.github.io/MegaDetector-Acoustic/): bioacoustic models for audio-based wildlife monitoring.
+- [SPARROW](https://microsoft.github.io/SPARROW/): the solar-powered edge device that runs these models in the field.
 
 > [!TIP]
-> If you have any questions, please [email us](mailto:zhongqimiao@microsoft.com) or join us on Discord: [![](https://img.shields.io/badge/any_text-Join_us!-blue?logo=discord&label=PyTorch-Wildlife)](https://discord.gg/TeEVxzaYtm)
+> Questions? [Email us](mailto:zhongqimiao@microsoft.com) or join us on Discord: [![Join the PyTorch-Wildlife Discord](https://img.shields.io/badge/any_text-Join_us!-blue?logo=discord&label=PyTorch-Wildlife)](https://discord.gg/TeEVxzaYtm)
diff --git a/docs/inference-examples.md b/docs/inference-examples.md
new file mode 100644
index 0000000..d1e8665
--- /dev/null
+++ b/docs/inference-examples.md
@@ -0,0 +1,120 @@
+---
+title: "Inference Examples: Run Wildlife Models at Scale"
+description: "Runnable PyTorch-Wildlife inference examples: single image, batch folder, video, and the Gradio web UI. Export annotated images, crops, and Timelapse JSON."
+tags:
+  - PyTorch-Wildlife inference
+  - batch image detection
+  - wildlife model zoo
+  - conservation deep learning framework
+  - MegaDetector batch
+---
+
+# Inference Examples
+
+PyTorch-Wildlife ships runnable demo scripts so you can go from a fresh install to real output without writing your own harness first. This page walks through the common ways to run inference: one image, a whole folder, a video, and an interactive web UI. Each example maps to a script under the repository's `demo/` directory.
+
+Before you start, install the framework with the [installation guide](installation.md) and skim the [API overview](api.md) so the method names below feel familiar.
+
+> [!TIP]
+> The demo scripts are written in cell style (`#%%`), so they run top to bottom as a plain `python demo/