GIMBench is a benchmarking framework for evaluating Guided Infilling Models (GIM).
This project provides tools and benchmarks to evaluate models' ability to perform guided infilling tasks: generating text that satisfies specific constraints and patterns.
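To make the task concrete, here is a toy sketch in plain Python of what a guided infilling check looks like: the model must fill a blank so that the completed text satisfies a constraint. This is not GIMBench's API; the template, blank marker, and pattern below are invented for illustration.

```python
import re

def satisfies(template: str, fill: str, pattern: str) -> bool:
    """Insert `fill` into the template's blank and test the constraint."""
    completed = template.replace("____", fill)
    return bool(re.fullmatch(pattern, completed))

# Toy constraint: the blank must be filled with a two-digit number.
template = "Order #____ shipped."
pattern = r"Order #\d{2} shipped\."

print(satisfies(template, "42", pattern))    # True
print(satisfies(template, "many", pattern))  # False
```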
Install GIMBench using pip:

```shell
pip install gimbench
```

For development:

```shell
make install-dev
```

GIMBench provides several benchmark types:
- CV Parsing: Evaluate models on structured information extraction from CVs
- Regex Matching: Test models' ability to generate text matching specific patterns
- Multiple Choice QA: Assess guided generation in question-answering contexts
- Perplexity: Measure language modeling quality with constraints
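As a rough illustration of the Multiple Choice QA setting, the sketch below scores answers that are constrained to a fixed label set. It is standalone Python with invented data; GIMBench's actual scoring code may differ.

```python
# Toy accuracy computation for multiple-choice QA under a choice
# constraint: predictions are forced into the label set {A, B, C, D}.
CHOICES = {"A", "B", "C", "D"}

predictions = ["A", "C", "B", "D"]   # hypothetical model outputs
gold = ["A", "B", "B", "D"]          # hypothetical reference answers

# The guided-generation constraint: every output is a valid choice.
assert all(p in CHOICES for p in predictions)

accuracy = sum(p == g for p, g in zip(predictions, gold)) / len(gold)
print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.75
```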
Run MMLU-Pro benchmark:

```shell
python -m gimbench.mcqa.mmlu_pro \
    --model_type vllm \
    --model_name meta-llama/Llama-3.1-8B-Instruct \
    --base_url http://localhost:8000/v1
```

Run GPQA Diamond benchmark:
```shell
python -m gimbench.mcqa.gpqa_diamond \
    --model_type openai \
    --model_name gpt-4 \
    --api_key YOUR_API_KEY
```

Run GIM-SFT perplexity evaluation:
```shell
python -m gimbench.ppl.gim_sft \
    --model_type vllm-offline \
    --model_name meta-llama/Llama-3.1-8B-Instruct
```

Run linting:
```shell
make lint
```

Fix linting issues automatically:

```shell
make lint-fix
```

Run pre-commit hooks:

```shell
make pre-commit
```