Benchmark: LiveCodeBench — contamination-free code generation #29

## Overview
Evaluate OpenSymbolicAI against **LiveCodeBench**, a continuously refreshed, contamination-free coding benchmark that sources problems from LeetCode, AtCoder, and Codeforces.

## Why this benchmark
- **Contamination-free**: problems are refreshed regularly and carry release dates, so evaluation can be limited to problems published after a model's training cutoff, which removes the memorization advantage
- Covers four scenarios: code generation, self-repair, code execution, and test output prediction
- Has a widely followed leaderboard, tracked by Artificial Analysis
- Would demonstrate that OpenSymbolicAI handles algorithmic problem-solving, not just tool orchestration
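
The contamination-free point can be read as a date filter: evaluate only on problems released after the model's training cutoff. The sketch below illustrates that idea under stated assumptions. The `Problem` record, its field names, and the `after_cutoff` helper are hypothetical and do not reflect the actual LiveCodeBench schema or any OpenSymbolicAI API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    """Hypothetical problem record; field names are assumptions."""
    problem_id: str
    source: str
    release_date: date


def after_cutoff(problems, cutoff):
    """Keep only problems released strictly after the training cutoff."""
    return [p for p in problems if p.release_date > cutoff]


# Made-up sample data, for illustration only.
problems = [
    Problem("a1", "LeetCode", date(2024, 1, 15)),
    Problem("b2", "AtCoder", date(2024, 9, 1)),
    Problem("c3", "Codeforces", date(2025, 2, 10)),
]

fresh = after_cutoff(problems, cutoff=date(2024, 6, 30))
print([p.problem_id for p in fresh])  # ['b2', 'c3']
```

Choosing the cutoff per model, rather than once for the whole run, keeps the comparison fair when models have different training dates.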

## References
- [LiveCodeBench Leaderboard](https://artificialanalysis.ai/evaluations/livecodebench)

## Tasks
- [ ] Review LiveCodeBench evaluation format and API
- [ ] Design primitives for code generation and testing
- [ ] Implement benchmark harness
- [ ] Run evaluation and collect results
- [ ] Document findings
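
As a starting point for the harness task, here is a minimal sketch of the core loop for the code generation scenario: run a candidate program against stdin/stdout test cases and aggregate a pass@1-style score. `TestCase`, `passes_all`, and `pass_at_1` are hypothetical names, the test-case shape is an assumption, and this is not LiveCodeBench's official runner. It also runs code without sandboxing, which a real harness would need to add.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class TestCase:
    """Hypothetical stdin/stdout test case; the shape is an assumption."""
    stdin: str
    expected_stdout: str


def passes_all(candidate_source, tests, timeout_s=5.0):
    """Run the candidate program on every test; True only if all pass.

    WARNING: executes untrusted code with no sandboxing (sketch only).
    """
    for test in tests:
        try:
            result = subprocess.run(
                [sys.executable, "-c", candidate_source],
                input=test.stdin,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False
        if result.returncode != 0:
            return False
        # Compare ignoring leading/trailing whitespace.
        if result.stdout.strip() != test.expected_stdout.strip():
            return False
    return True


def pass_at_1(results):
    """Fraction of problems whose single sampled solution passed."""
    return sum(results) / len(results) if results else 0.0


# Made-up problem: print the sum of two integers.
tests = [TestCase("1 2\n", "3\n"), TestCase("10 -4\n", "6\n")]
good = "a, b = map(int, input().split())\nprint(a + b)\n"
bad = "print(0)\n"

results = [passes_all(good, tests), passes_all(bad, tests)]
print(results, pass_at_1(results))  # [True, False] 0.5
```

Keeping execution (`passes_all`) separate from scoring (`pass_at_1`) means the scoring step can be reused if the other scenarios are added later.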
