From bba2bdbae92cb18db55c9c76236c82b7a57dbf46 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 9 Feb 2026 23:46:32 +0000
Subject: [PATCH 1/4] Initial plan

From c29d42b77dd5253f78ab3216d7b82e9252eaaa95 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 9 Feb 2026 23:47:56 +0000
Subject: [PATCH 2/4] Add simple README with project overview and usage instructions

Co-authored-by: Ki-Seki <60967965+Ki-Seki@users.noreply.github.com>
---
 README.md | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/README.md b/README.md
index e69de29..d6ea6e8 100644
--- a/README.md
+++ b/README.md
@@ -0,0 +1,58 @@
+# GIMBench
+
+GIMBench is a benchmarking framework for evaluating Guided Infilling Models (GIM).
+
+## Overview
+
+This project provides tools and benchmarks to evaluate models' ability to perform guided infilling tasks - generating text that follows specific constraints and patterns.
+
+## Installation
+
+Install GIMBench using `uv`:
+
+```bash
+make install
+```
+
+For development, install with dev dependencies:
+
+```bash
+make install-dev
+```
+
+## Usage
+
+GIMBench provides several benchmark types:
+
+- **CV Parsing**: Evaluate models on structured information extraction from CVs
+- **Regex Matching**: Test models' ability to generate text matching specific patterns
+- **Multiple Choice QA**: Assess guided generation in question-answering contexts
+- **Perplexity**: Measure language modeling quality with constraints
+
+## Development
+
+Run linting:
+
+```bash
+make lint
+```
+
+Fix linting issues automatically:
+
+```bash
+make lint-fix
+```
+
+Run pre-commit hooks:
+
+```bash
+make pre-commit
+```
+
+## License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+## Copyright
+
+Copyright (c) 2025 SculptAI

From b72eeb90b33caf97e4752e839dad5316a8f5ae60 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 9 Feb 2026 23:53:47 +0000
Subject: [PATCH 3/4] Update README: use pip installation, add command examples, remove license sections

Co-authored-by: Ki-Seki <60967965+Ki-Seki@users.noreply.github.com>
---
 README.md | 46 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index d6ea6e8..7ecaf96 100644
--- a/README.md
+++ b/README.md
@@ -8,16 +8,18 @@ This project provides tools and benchmarks to evaluate models' ability to perfor
 
 ## Installation
 
-Install GIMBench using `uv`:
+Install GIMBench using pip:
 
 ```bash
-make install
+pip install gimbench
 ```
 
-For development, install with dev dependencies:
+For development, clone the repository and install with dev dependencies:
 
 ```bash
-make install-dev
+git clone https://github.com/SculptAI/GIMBench.git
+cd GIMBench
+pip install -e ".[dev]"
 ```
 
 ## Usage
@@ -29,6 +31,34 @@ GIMBench provides several benchmark types:
 - **Multiple Choice QA**: Assess guided generation in question-answering contexts
 - **Perplexity**: Measure language modeling quality with constraints
 
+### Example Commands
+
+Run MMLU-Pro benchmark:
+
+```bash
+python -m gimbench.mcqa.mmlu_pro \
+    --model_type vllm \
+    --model_name meta-llama/Llama-3.1-8B-Instruct \
+    --base_url http://localhost:8000/v1
+```
+
+Run GPQA Diamond benchmark:
+
+```bash
+python -m gimbench.mcqa.gpqa_diamond \
+    --model_type openai \
+    --model_name gpt-4 \
+    --api_key YOUR_API_KEY
+```
+
+Run GIM-SFT perplexity evaluation:
+
+```bash
+python -m gimbench.ppl.gim_sft \
+    --model_type vllm-offline \
+    --model_name meta-llama/Llama-3.1-8B-Instruct
+```
+
 ## Development
 
 Run linting:
@@ -48,11 +78,3 @@ Run pre-commit hooks:
 ```bash
 make pre-commit
 ```
-
-## License
-
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-
-## Copyright
-
-Copyright (c) 2025 SculptAI

From 41c7ee03f883197d2bdd2eeae16c970bd67fb490 Mon Sep 17 00:00:00 2001
From: Shichao Song <60967965+Ki-Seki@users.noreply.github.com>
Date: Tue, 10 Feb 2026 08:04:13 +0800
Subject: [PATCH 4/4] Update README.md

---
 README.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 7ecaf96..acd040e 100644
--- a/README.md
+++ b/README.md
@@ -14,12 +14,10 @@ Install GIMBench using pip:
 pip install gimbench
 ```
 
-For development, clone the repository and install with dev dependencies:
+For development:
 
 ```bash
-git clone https://github.com/SculptAI/GIMBench.git
-cd GIMBench
-pip install -e ".[dev]"
+make install-dev
 ```
 
 ## Usage