This toolbox is a modular framework designed to facilitate the implementation and evaluation of active learning (AL) workflows in PyTorch. It includes implementations for the following publications:
Setting up the DAL-Toolbox is straightforward. Clone the repository and execute the following commands:
conda create -n dal-toolbox python=3.12
conda activate dal-toolbox
pip install -e .
Afterward, install additional packages as required for your task. The implementations in the publications directory typically require additional dependencies, which are listed in separate requirements.txt files.
The following snippet demonstrates a basic AL cycle on a two-dimensional toy dataset:
import torch
import lightning as L
from sklearn.datasets import make_moons
from torch.utils.data import TensorDataset
from dal_toolbox.active_learning import ActiveLearningDataModule
from dal_toolbox.active_learning.strategies import LeastConfidentSampling
from dal_toolbox.models.deterministic import DeterministicModel
from dal_toolbox.models.deterministic.simplenet import SimpleNet
# 1. Create the two moons dataset
X, y = make_moons(n_samples=200, noise=.1, random_state=42)
dataset = TensorDataset(torch.tensor(X).float(), torch.tensor(y).long())
# 2. Setup the AL Data Module with 2 initial randomly labeled samples
al_datamodule = ActiveLearningDataModule(dataset, train_batch_size=32)
al_datamodule.random_init(n_samples=2, class_balanced=True)
# 3. Initialize the Model and Strategy
strategy = LeastConfidentSampling()
model = SimpleNet(dropout_rate=0., num_classes=2)
model = DeterministicModel(
    model,
    optimizer=torch.optim.SGD(model.parameters(), lr=1e-1, momentum=.9)
)
# 4. Perform Active Learning Cycles
for cycle in range(4):
    # Query and update annotations (skip for the initial cycle)
    if cycle != 0:
        indices = strategy.query(model=model, al_datamodule=al_datamodule, acq_size=2)
        al_datamodule.update_annotations(indices)
    # Train the model
    model.reset_states()
    trainer = L.Trainer(max_epochs=50, enable_progress_bar=False)
    trainer.fit(model, al_datamodule)
Note: While this example uses PyTorch Lightning for convenience, it is not strictly required for most strategies. You can easily replace L.Trainer with a standard PyTorch training function.
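As a rough illustration of the plain-PyTorch alternative mentioned in the note, a training function could look like the sketch below. It is not part of the toolbox: the function name `train_plain_pytorch`, the stand-in network, and the assumption that each batch is an `(inputs, targets)` pair are all hypothetical, and the toolbox's own dataloaders may yield batches in a different layout.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_plain_pytorch(model, dataloader, optimizer, n_epochs=50):
    """Hypothetical stand-in for L.Trainer.fit: a plain supervised training loop.

    Assumes every batch is an (inputs, targets) pair; adapt the unpacking
    if your dataloader yields additional items such as sample indices.
    """
    criterion = nn.CrossEntropyLoss()
    model.train()
    epoch_loss = float("nan")
    for _ in range(n_epochs):
        running_loss, n_seen = 0.0, 0
        for inputs, targets in dataloader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
            # Accumulate the sample-weighted loss to report a per-epoch mean
            running_loss += loss.item() * inputs.size(0)
            n_seen += inputs.size(0)
        epoch_loss = running_loss / n_seen
    return epoch_loss


# Toy usage with a small stand-in network (not the toolbox's SimpleNet)
torch.manual_seed(0)
X = torch.randn(64, 2)
y = (X[:, 0] > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)
net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(net.parameters(), lr=1e-1, momentum=0.9)
final_loss = train_plain_pytorch(net, loader, optimizer, n_epochs=20)
print(f"mean training loss in the last epoch: {final_loss:.3f}")
```

In an AL cycle, such a function would be called once per cycle on the currently labeled samples, after resetting the model, in place of the `trainer.fit` call.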
Check out tbd and the ./publications directory for more sophisticated implementations.
If you find this toolbox useful for your research, please consider citing us.
@inproceedings{huseljic2026refine,
title = {Cleaning the {Pool}: {Progressive} {Filtering} of {Unlabeled} {Pools} in {Deep} {Active} {Learning}},
shorttitle = {Cleaning the {Pool}},
author = {Huseljic, Denis and Herde, Marek and Rauch, Lukas and Hahn, Paul and Sick, Bernhard},
booktitle = {CVPR},
year = {2026},
}