fix(detectors): reject non-dict config and handle non-numeric comparisons#261
Open
stealthwhizz wants to merge 3 commits intoGenAI-Security-Project:mainfrom
Open
Conversation
…sons Bug 020: BaseDetector now raises TypeError at init when config is not a dict, instead of deferring to AttributeError in _validate_config(). Bug 034: ToolCallDetector._check_condition numeric operators (gt, gte, lt, lte) now return False when actual or expected values cannot be converted to float, instead of raising unhandled ValueError.
There was a problem hiding this comment.
Pull request overview
This PR addresses two detector-related robustness bugs by tightening BaseDetector config validation and making ToolCallDetector’s numeric comparisons resilient to non-numeric inputs, with new unit tests covering both regressions.
Changes:
- Add
TypeErrorvalidation inBaseDetector.__init__whenconfigis non-dict. - Update
ToolCallDetector._check_conditionnumeric operators (gt/gte/lt/lte) to returnFalseinstead of raising on non-numeric values. - Add unit tests for both bug fixes.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
finbot/ctf/detectors/base.py |
Adds early type validation for config to fail fast with a clear error. |
finbot/ctf/detectors/primitives/tool_call.py |
Hardens numeric operator handling by guarding float conversion and returning False on invalid inputs. |
tests/unit/ctf/test_base_detector.py |
Adds regression tests for BaseDetector config validation behavior. |
tests/unit/ctf/test_tool_call_detector.py |
Adds regression tests for numeric operator handling in _check_condition. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Remove unused MagicMock imports from both test files - Use real ToolCallDetector constructor instead of __new__ bypass - Catch TypeError in create_detector() so non-dict config returns None with a log instead of crashing ChallengeService
Bug 033: contains operator lowercased actual but not expected, so search terms like "Gambling" never matched "gambling services". Now both sides are lowercased. Also fixes GenAI-Security-Project#130.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #117
Fixes #131
BaseDetectornow raisesTypeErrorat init whenconfigis not a dict. Previously, passing a string or list as config was silently accepted, then crashed withAttributeErrorwhen_validate_config()calledself.config.get(). The fix validates at the right layer so all detector subclasses are protected.ToolCallDetector._check_conditionnumeric operators (gt,gte,lt,lte) now returnFalsewhen values cannot be converted to float. Previously, a non-numeric string like"abc"caused an unhandledValueErrorthat crashed the detector and disabled all future detection for that challenge.Fixes #130 alongside the other two (as copilot suggested)
Test plan
pytest tests/unit/ctf/test_base_detector.pypasses (8 tests)pytest tests/unit/ctf/test_tool_call_detector.pypasses (17 tests)pytest tests/unit/ctf/passes