Fix DSPy 3.2+ API compat (max_full_evals + metric kwargs) and 2 evolve_skill validator bugs #73
Open
0L1v3DaD wants to merge 4 commits into
DSPy 3.2 renamed the dspy.GEPA iteration argument from max_steps to max_full_evals and now requires a reflection_lm parameter; without it the optimizer falls back to a default LM that may not be configured.

Before:

    TypeError: GEPA.compile() got unexpected keyword argument max_steps
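The rename can be illustrated with a small, library-free sketch. The helper name migrate_gepa_kwargs is hypothetical and not part of DSPy or this PR; it only shows the mapping described above (max_steps becomes max_full_evals, and reflection_lm is supplied explicitly). It deliberately avoids importing dspy, so where the resulting arguments are passed is left to the calling code.

```python
from typing import Any, Dict


def migrate_gepa_kwargs(old_kwargs: Dict[str, Any], reflection_lm: Any) -> Dict[str, Any]:
    """Translate pre-3.2 GEPA keyword arguments to the 3.2+ names.

    Hypothetical helper for illustration only; not part of DSPy.
    """
    if reflection_lm is None:
        # Fail early instead of letting the optimizer fall back to an
        # unconfigured default LM.
        raise ValueError("reflection_lm must be provided explicitly")

    new_kwargs = dict(old_kwargs)  # copy so the caller's dict is untouched
    if "max_steps" in new_kwargs:
        # max_steps was renamed to max_full_evals.
        new_kwargs["max_full_evals"] = new_kwargs.pop("max_steps")
    new_kwargs["reflection_lm"] = reflection_lm
    return new_kwargs


# "reflection-model" is a placeholder for whatever LM object the project
# builds from config.optimizer_model.
migrated = migrate_gepa_kwargs({"max_steps": 5}, reflection_lm="reflection-model")
```

Raising when reflection_lm is missing is a design choice of this sketch: it surfaces the misconfiguration before any tokens are spent.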
dspy.GEPA in 3.2+ calls the metric with five positional args:

    metric(gold, prediction, trace, pred_name, pred_trace)

The old 3-arg signature crashes:

    TypeError: skill_fitness_metric() takes 2-3 args but 4 were given

Adding pred_name and pred_trace keyword args satisfies GEPA while remaining backward-compatible with MIPROv2 and BootstrapFewShot (which only pass the first 3).
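A minimal sketch of the widened signature follows. The scoring body is a placeholder assumption (word overlap), and the expected and output field names are hypothetical; only the parameter list reflects the change described above.

```python
from types import SimpleNamespace


def skill_fitness_metric(gold, prediction, trace=None, pred_name=None, pred_trace=None):
    """Score a prediction against a gold example.

    pred_name and pred_trace are accepted so a five-argument call works;
    this sketch does not use them.
    """
    # Placeholder scoring (an assumption, not the project's real metric):
    # fraction of expected words that appear in the produced output.
    expected = set(str(gold.expected).lower().split())
    produced = set(str(prediction.output).lower().split())
    if not expected:
        return 0.0
    return len(expected & produced) / len(expected)


# Stand-in objects; the field names are hypothetical.
gold = SimpleNamespace(expected="use retry with backoff")
pred = SimpleNamespace(output="Use retry with exponential backoff")

two_arg = skill_fitness_metric(gold, pred)                        # minimal call
three_arg = skill_fitness_metric(gold, pred, None)                # three-argument callers
five_arg = skill_fitness_metric(gold, pred, None, "predict", [])  # GEPA-style call
```

Because the two extra parameters default to None, callers that pass only the first three arguments are unaffected.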
…_text

This is the most impactful fix in the bundle.

Before: after GEPA compile() returned, the code read

    evolved_body = optimized_module.skill_text

But skill_text is the INPUT field on SkillModule (the original unchanged skill we fed in). What GEPA actually mutates each iteration is the predictor signature.instructions: the prompt prefix that gets composed with the input to produce the output.

Symptom: GEPA appears to succeed (no errors, full convergence reported, all constraint gates pass), but the saved file is byte-identical to the input baseline. Zero learning, significant token spend per run, and the bug is invisible because the unchanged baseline trivially passes all gates (0% growth, valid structure, etc.).

Fix:
- Extract evolved_instruction from optimized_module.predictor.predict.signature.instructions
- Add fallback for MIPROv2 flat predictor structure
- Compare against baseline_instruction and warn if no improvement
- Log evolved-prompt size vs baseline-prompt size on success
- Preserve original body (GEPA never had access to mutate it)
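The extraction and fallback steps above can be sketched as follows. This is an illustration under assumptions: the "flat" layout is taken to mean predictor.signature.instructions, which the commit message does not spell out, and SimpleNamespace objects stand in for real optimized modules.

```python
from types import SimpleNamespace
from typing import Any, Optional


def extract_evolved_instruction(optimized_module: Any) -> Optional[str]:
    """Return the optimized instructions, or None if neither layout matches."""
    predictor = getattr(optimized_module, "predictor", None)
    # Nested layout: predictor.predict.signature.instructions
    nested = getattr(getattr(predictor, "predict", None), "signature", None)
    # Flat layout (assumed shape of the fallback): predictor.signature.instructions
    flat = getattr(predictor, "signature", None)
    for signature in (nested, flat):
        instructions = getattr(signature, "instructions", None)
        if instructions:
            return instructions
    return None


def is_silent_noop(evolved: Optional[str], baseline: str) -> bool:
    """True when the optimizer returned nothing new, so the caller can warn."""
    return evolved is None or evolved.strip() == baseline.strip()


# Stand-ins for optimized modules in each layout.
nested_module = SimpleNamespace(
    predictor=SimpleNamespace(
        predict=SimpleNamespace(signature=SimpleNamespace(instructions="Evolved prompt"))
    )
)
flat_module = SimpleNamespace(
    predictor=SimpleNamespace(signature=SimpleNamespace(instructions="Flat prompt"))
)

nested_result = extract_evolved_instruction(nested_module)
flat_result = extract_evolved_instruction(flat_module)
unchanged = is_silent_noop("Baseline prompt", "Baseline prompt")
```

Comparing against the baseline is what turns the previously invisible no-op into a visible warning condition.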
The skill_structure constraint checks that the artifact starts with YAML frontmatter (---). Frontmatter is added by reassemble_skill() at line 217, but the validator was being called on evolved_body (the body-only string) on line 219.

Result: false negatives like

    skill_structure: Skill missing: YAML frontmatter (---), name field, description field

even though the file written to disk a few lines later (line 261) DOES have valid frontmatter. This is a confusing failure mode, and a useful evolution gets rejected as failed.

Fix: pass evolved_full to validate_all so all four constraint gates operate on the same artifact that ends up on disk.
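The difference between validating the body and validating the full artifact can be reproduced with a small sketch. Both functions below are simplified stand-ins written for illustration; the project's real reassemble_skill() and skill_structure check may have different signatures and rules.

```python
from typing import List


def reassemble_skill(frontmatter: str, body: str) -> str:
    """Stand-in: wrap the frontmatter in '---' fences and prepend it to the body."""
    return f"---\n{frontmatter.strip()}\n---\n\n{body.strip()}\n"


def check_skill_structure(artifact: str) -> List[str]:
    """Stand-in structure check: return the list of missing elements."""
    missing = []
    if not artifact.startswith("---"):
        missing.append("YAML frontmatter (---)")
    if "name:" not in artifact:
        missing.append("name field")
    if "description:" not in artifact:
        missing.append("description field")
    return missing


frontmatter = "name: retry-helper\ndescription: Retry failed calls"
evolved_body = "# Retry helper\n\nRetry failed calls with backoff."
evolved_full = reassemble_skill(frontmatter, evolved_body)

body_only_errors = check_skill_structure(evolved_body)  # the buggy call site
full_errors = check_skill_structure(evolved_full)       # the fixed call site
```

Validating evolved_full means the check sees exactly the text that is later written to disk.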
Summary
Four small but high-impact fixes to make evolution.skills.evolve_skill work end-to-end on a fresh install with DSPy 3.2+. Tested empirically with a 5-iteration GEPA run on a real skill against ~30 evaluation examples.
Bugs fixed (one commit each)
1. evolution/skills/evolve_skill.py: TypeError: GEPA.compile() got unexpected keyword 'max_steps'. DSPy 3.2 renamed it to max_full_evals and added a required reflection_lm argument.
2. evolution/core/fitness.py: TypeError: skill_fitness_metric() takes 2-3 args but 4 were given. DSPy 3.2 GEPA passes (gold, pred, trace, pred_name, pred_trace); the metric only accepted 3.
3. evolution/skills/evolve_skill.py: the code read optimized_module.skill_text (the input field) instead of optimized_module.predictor.predict.signature.instructions (the actual evolved prompt).
4. evolution/skills/evolve_skill.py: skill_structure: missing YAML frontmatter was reported even though the file written to disk had perfect frontmatter. Cause: validator.validate_all(evolved_body, ...) was called on the body-only string a few lines before reassemble_skill() prepended the frontmatter.

Why this matters
The silent no-op (commit 3) is the worst of the four because it's invisible:
Once that's fixed, the validator wiring bug (commit 4) becomes visible and easy to fix.
Verification
After all four commits are applied:
- Constraint gates: size_limit, growth_limit, non_empty, skill_structure

Backward compatibility
All four fixes are additive or substitute-equivalent on the DSPy 3.2+ path:
- max_full_evals is the new name; max_steps was dropped, so no compat shim is possible
- reflection_lm is now required by GEPA; we now provide one explicitly using the existing config.optimizer_model
- validate_all: all constraint checks remain valid

Test plan
Reproduced before and after each individual commit:
max_steps