Forgather

A fast ML library for experimentation and training on consumer hardware

Forgather is a configuration-driven ML framework that uses template inheritance and code generation to eliminate configuration duplication and enable systematic experimentation. Instead of copying and modifying entire config files, you inherit from base templates and specify only what changes.

Key Benefits:

  • No Config Duplication - Inherit and override instead of copy-paste
  • Types as Hyperparameters - Change optimizers, models, datasets in config files
  • Full Reproducibility - Automatic snapshots of code and configs with each run
  • Pipeline Parallel Trainer - Pipeline-parallel trainer optimized for training on consumer GPUs
  • Extensible Trainers - Trainer implementations designed for easy modification and experimentation
  • Dynamic Models Library - Define and customize model architectures entirely through configuration files
  • Templates Library - Extensive template library for tokenizers, models, trainers, datasets, etc.

News

  • Dec 14, Multiple new features:
    • Forgather's models now work with vLLM, in both Tensor Parallel and Pipeline Parallel modes. See documentation.
    • Added support for fused-linear-causal-loss, which significantly reduces peak memory requirements when training models with large vocabularies (see example usage). We support the following implementations: Liger, CCE, and PyTorch compiled.
    • Added a Triton implementation of Forgather's Adafactor Optimizer. This reduces peak memory further and speeds up training.
    • Enabled support for loading models with device_map="auto", which allows our inference server to shard models across multiple GPUs.
  • Nov 17, Completed a major overhaul of the model conversion tool and added support for Mistral and Qwen3 models, as well as Llama models with RoPE scaling and tied word embeddings.
  • Nov 9, OpenAssistant Dataset - High-quality example demonstrating how to build custom datasets that dynamically generate examples on-the-fly. Features quality-weighted sampling from conversation trees, sequence packing, multi-language support, and deterministic generation. Includes complete Python examples and extensive documentation. There is also a demo finetune project.
  • Nov 4, Added support for packed sequences and Flex Attention; the Samantha tutorial is being updated to demonstrate them. Models now support KV cache.
  • Oct 21, H.P. Lovecraft Project - Learn how to create workspaces and projects, while training a model to summon the Elder Gods. You can perform full-finetuning (not LoRA) on a 7B model, with a context length of up to 16K on a single 24 GB GPU!
  • Oct 19, Samantha -- New tutorial on performing full finetuning of a 7B parameter model on a single 24 GB GPU, using the "Samantha" dataset -- she believes she is sentient!
  • Torch Titan integration -- Use Forgather to configure Torch Titan

Quick Start

1. Install Forgather:

# Requires Python >= 3.10
# Set up a Python virtual environment for the install.
# You can also use conda or whatever you are most comfortable with.
python3 -m venv /path/to/new/venv

# Activate the virtual environment
source /path/to/new/venv/bin/activate

git clone https://github.com/jdinalt/forgather.git
cd forgather
pip install -e .

# Verify install works with CLI
forgather ls -r

Note: We are using bleeding-edge PyTorch features, like flex-attention, which require PyTorch 2.9.0. If you are updating from a previous install, run 'pip install -e .' again to force upgrading to the latest libraries. If in doubt, nuke your venv and rebuild it.

Flex attention also depends on having a working C compiler and the Python development packages installed:

sudo apt-get install build-essential python3-dev

2. Try a tutorial project:

See: ./examples/tutorials/tiny_llama/project_index.ipynb

Or, from the command line...

# Optional
forgather -i                                    # Start interactive Forgather shell

forgather ls -r                                 # List all forgather projects
cd examples/tutorials/tiny_llama
forgather index                                 # Show project summary
forgather ls                                    # List available configs
forgather -t train_tiny_llama.yaml pp | less    # Show pre-processed configuration
forgather -t train_tiny_llama.yaml train        # Train model

3. Monitor and control:

forgather -t train_tiny_llama.yaml tb           # Start Tensorboard

forgather control list                          # List running training jobs
forgather control status JOB_ID                 # Get status of a training job
forgather control [stop|abort|save] JOB_ID      # Control training jobs

4. Test Model Inference:

# Start inference server
forgather inf server -c -m /path/to/model

# Perform text completion on prompt
forgather inf client --completion "Once upon a time"

That's it! You've just trained a small language model using Forgather's template system.

Key Features

Template Inheritance

Create new experiments by inheriting from existing configs and specifying only the differences:

-- extends 'base_experiment.yaml'
[optimizer]
    == super()
    lr: 1.0e-3  # Only change learning rate
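
For context, the parent template defines the block being overridden; == super() pulls in that content, and the child adds or overrides keys. A minimal sketch of what base_experiment.yaml might define (block contents are illustrative, composed from the syntax shown elsewhere in this README):

[optimizer]
optimizer: !partial:torch.optim.AdamW
    lr: 1.0e-4          # Default; overridden by the child template above
    weight_decay: 0.01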

Dynamic Type System

Use any Python class or function directly in configs:

optimizer: !partial:torch.optim.AdamW
    lr: 1.0e-3
    weight_decay: 0.01

[layer_factory]
# Experiment: Switch from PreLayerNorm to PostLayerNorm
# (the *_factory anchors referenced below are defined elsewhere in the config)
layer_factory: &layer_factory !partial:.post_ln_layer:PostLNLayer@layer_factory
    feedforward_factory: *feedforward_factory
    attention_factory: *attention_factory
    norm_factory: *layer_norm_factory
    dropout: !var "layer_dropout"
    residual_dropout: !var "residual_dropout"

Code Generation

Models are generated as standalone Python code with no framework dependencies; the generated source is written to the project's output_models/ directory, where it can be inspected directly.

Built-in Training Infrastructure

  • Trainer: Fast single-GPU training for small models
  • AccelTrainer: Multi-GPU training with Accelerate
  • PipelineTrainer: Pipeline parallelism
  • Custom Optimizers: AdamW, AdaFactor, GaLore, Apollo
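
Because types are hyperparameters (see Dynamic Type System above), switching between trainers or optimizers is a config-level change rather than a code change. A minimal sketch using a stock PyTorch optimizer; the template and block names here are illustrative:

-- extends 'base_experiment.yaml'
[optimizer]
# Swap the optimizer type; only this block changes.
optimizer: !partial:torch.optim.SGD
    lr: 1.0e-2
    momentum: 0.9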

Learning Forgather

1. Start with Tutorials (Recommended)

cd examples/tutorials/
  • tiny_llama/ - Train a small language model from scratch
  • project_composition/ - Template inheritance patterns
  • dynamic_lm/ - Dynamic model construction
  • projects_overview/ - Overview of Forgather projects

2. Explore Example Projects

cd forgather

# List all example projects and configurations
forgather ls -r

# cd to example project directory
cd examples/...

# Show project info
forgather index

3. Interactive Development

Run the interactive shell:

forgather -i

Core Concepts

Projects

Every Forgather experiment is a Project with this structure:

my_project/
├── meta.yaml              # Project metadata
├── templates/
│   ├── project.yaml       # Main template
│   └── configs/           # Experiment configs
├── output_models/         # Generated code & results
└── project_index.ipynb    # Interactive notebook

Template Language

Forgather uses Jinja2 + YAML with custom syntax:

  • -- extends 'template.yaml' - Template inheritance
  • [block_name] - Override sections
  • -- set ns.var = value - Set variables
  • !partial:module:Class - Partial function construction
  • !factory:module:Class - Factory construction
  • !var "variable_name" - Variable references
  • #---- inline.template.name ---- - Split document into multiple templates
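
A short sketch combining several of these constructs (the template, block, and variable names are illustrative):

-- extends 'project.yaml'
-- set ns.experiment_name = "lr_sweep"

[optimizer]
    == super()
    lr: !var "learning_rate"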

See Syntax Reference

Code Generation Pipeline

Templates → YAML → Node Graph → Python Code → Executable Objects

Each step can be inspected:

forgather -t config.yaml pp                          # Preprocess with Jinja2 to YAML
forgather -t config.yaml graph --format yaml         # Parsed node graph
forgather -t config.yaml targets                     # List constructable objects in graph
forgather -t config.yaml code [--target <target>]    # [optional] Equivalent Python code for target
forgather -t config.yaml construct [--target <target>] [--call] # Materialize and show constructed object

Project Structure

Framework Code

  • src/forgather/ - Core framework
    • project.py - Project management
    • config.py - Template processing
    • codegen.py - Python code generation
    • ml/ - Training infrastructure
  • templatelib/ - Reusable templates
    • base/ - Abstract base templates
    • examples/ - Common models, datasets, tokenizers
  • modelsrc/ - Modular model components library

Example Projects

  • examples/tutorials/ - Learning materials
  • examples/tiny_experiments/ - Example experiments
  • examples/standalone/ - Self-contained projects
  • examples/template_project/ - Starting point for new projects

Training and Monitoring

# Generate command to run "my_experiment.yaml" on GPUs 0 and 1
# Print command, but don't execute it.
forgather -t my_experiment.yaml train -d "0,1" --dry-run

# Start Tensorboard to monitor progress on all models in the project; bind to all network interfaces.
forgather tb --all -- --bind_all

# Show running training jobs -- which can be controlled via the CLI
forgather control list

Contributing

Forgather is actively developed and welcomes contributions:

  1. Bug Reports & Feature Requests: Open GitHub issues
  2. Code Contributions: Submit pull requests
  3. Documentation: Improve tutorials and examples
  4. Community: Share your experiments and templates
