
fix: standardize token estimation heuristic between file_operations and base_agent#249

Closed
RyanLHicks wants to merge 1 commit into mpfaffenberger:main from RyanLHicks:fix/standardize-token-estimation-heuristic

Conversation

@RyanLHicks
Contributor

@RyanLHicks RyanLHicks commented Mar 22, 2026

`file_operations.py` estimates tokens using `len(content) // 4` 
(~4 chars/token) while `base_agent.py` uses `len(text) / 2.5` 
(~2.5 chars/token). The same content is estimated at nearly 2x 
different token counts depending on which function is counting.

This causes files to pass the 10,000 token read limit in 
`_read_file` but land in message history at ~1.6x that estimate, 
triggering unexpected early compaction.

Standardized `_read_file` to use the same 2.5 chars/token 
heuristic as `BaseAgent.estimate_token_count`.
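The divergence can be sketched numerically. The two formulas below are the ones quoted from `file_operations.py` and `base_agent.py`; the variable names and the sample size are illustrative:

```python
import math

content = "x" * 25_000

# Old file_operations.py heuristic (~4 chars/token):
read_estimate = len(content) // 4  # 6250 -> passes the 10,000 token read limit

# base_agent.py heuristic (~2.5 chars/token):
history_estimate = math.floor(len(content) / 2.5)  # 10000 -> counted ~1.6x larger

print(history_estimate / read_estimate)  # 1.6
```

So a file that the read limit accepts at 6,250 tokens is later counted as 10,000 tokens in message history, which is where the unexpected early compaction comes from.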

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **Tests**
  * Added test coverage to validate token estimation consistency across varying content sizes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

@coderabbitai

coderabbitai Bot commented Mar 22, 2026

📝 Walkthrough

A new test suite validates token estimation consistency by comparing CodePuppyAgent.estimate_token_count against a heuristic formula computing floor(len(content) / 2.5) across multiple content sizes and edge cases.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Token Estimation Test Suite: `tests/agents/test_token_estimation_consistency.py` | New test file with `TestTokenEstimationConsistency` class containing three test methods validating token count estimation consistency using a chars/token heuristic across varying content sizes and edge cases. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 With tokens counted, steady and true,
Our tests now verify what agents do—
From bytes to words, consistency shines,
Each heuristic checked across the lines!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: standardizing a token estimation heuristic between two modules. It clearly identifies the key objective of the PR. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code_puppy/tools/file_operations.py`:
- Around line 516-517: The token estimation in file_operations.py currently
computes num_tokens = math.floor(len(content) / 2.5) which can yield 0 for short
content; update this to mirror BaseAgent.estimate_token_count by enforcing a
minimum of 1 (e.g., set num_tokens = max(1, math.floor(len(content) / 2.5))) so
short strings never produce zero tokens and remain consistent with BaseAgent.
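A minimal sketch of the suggested guard. The helper name `estimate_tokens` is hypothetical (the actual function in `file_operations.py` may differ); the formula and the `max(1, ...)` clamp are the ones described in the comment above:

```python
import math

def estimate_tokens(content: str) -> int:
    """Mirror BaseAgent.estimate_token_count: ~2.5 chars/token, never below 1."""
    return max(1, math.floor(len(content) / 2.5))

print(estimate_tokens(""))         # 1 (floor alone would give 0)
print(estimate_tokens("ab"))       # 1 (floor(2 / 2.5) == 0, clamped to 1)
print(estimate_tokens("a" * 100))  # 40
```

Without the clamp, any string of 2 or fewer characters would be estimated at 0 tokens, diverging from `BaseAgent` for exactly the short-content edge cases.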

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 63969bdc-03a9-4258-b293-86ee402d7316

📥 Commits

Reviewing files that changed from the base of the PR and between 19e3122 and a7a1181.

📒 Files selected for processing (1)
  • code_puppy/tools/file_operations.py

(The earlier comment thread on code_puppy/tools/file_operations.py is now marked outdated.)
RyanLHicks force-pushed the fix/standardize-token-estimation-heuristic branch from b1e4f82 to 463d7c6 on March 22, 2026 at 20:45.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/agents/test_token_estimation_consistency.py (2)

1-10: Consider importing and testing file_operations._read_file directly.

The test suite claims to verify consistency between file_operations._read_file and BaseAgent.estimate_token_count, but only imports and tests CodePuppyAgent. If someone later modifies the token estimation in file_operations.py, these tests won't catch the regression.

Consider importing the actual file_operations function (or extracting the shared heuristic into a common utility) and asserting both implementations produce identical results.

💡 Suggested approach
```diff
 import math
 
 from code_puppy.agents.agent_code_puppy import CodePuppyAgent
+from code_puppy.tools.file_operations import _estimate_tokens  # or similar shared function
```

Alternatively, extract the heuristic into a shared utility function that both modules import, then test that single source of truth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agents/test_token_estimation_consistency.py` around lines 1 - 10, The
test currently only imports CodePuppyAgent; update the test to import
file_operations._read_file (or the shared heuristic utility if you extracted
one) and assert its token estimate matches BaseAgent.estimate_token_count (or
CodePuppyAgent.estimate_token_count) for the same inputs; specifically, add an
import for file_operations._read_file and include assertions that
file_operations._read_file(...) output (or the heuristic function result)
produces the same chars-per-token based estimate as
BaseAgent.estimate_token_count (or CodePuppyAgent.estimate_token_count) for the
sample files/strings used in the test so changes in file_operations are
detected.

16-27: Variable name file_ops_estimate is misleading.

The variable file_ops_estimate is computed from a hardcoded formula, not from the actual file_operations module. This naming could mislead future maintainers into thinking the test verifies the real implementation.

✏️ Suggested rename for clarity
```diff
         base_agent_estimate = agent.estimate_token_count(content)
-        file_ops_estimate = math.floor(len(content) / 2.5)
+        expected_heuristic = math.floor(len(content) / 2.5)
 
-        assert base_agent_estimate == file_ops_estimate
+        assert base_agent_estimate == expected_heuristic
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agents/test_token_estimation_consistency.py` around lines 16 - 27,
Rename the misleading variable file_ops_estimate in
test_estimate_token_count_matches_file_operations_heuristic to reflect that it
is derived from the hardcoded 2.5 chars/token formula (e.g., heuristic_estimate
or expected_estimate); update the variable name wherever used and keep the
calculation math.floor(len(content) / 2.5) and the assertion comparing
agent.estimate_token_count(content) to that renamed variable so the test clearly
documents it's validating the 2.5 chars/token heuristic rather than calling
file_operations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 06e0f0cb-3c6d-4426-a123-7e1d7d1bb9cb

📥 Commits

Reviewing files that changed from the base of the PR and between b1e4f82 and 463d7c6.

📒 Files selected for processing (1)
  • tests/agents/test_token_estimation_consistency.py

Comment on lines +45 to +52

```python
    def test_minimum_token_count_is_one(self):
        """
        Both estimators enforce a minimum of 1 token even for empty content.
        """
        agent = CodePuppyAgent()

        result = agent.estimate_token_count("")
        assert result >= 1
```

(No newline at end of file.)


⚠️ Potential issue | 🟡 Minor

Docstring claims to test "Both estimators" but only tests one.

The docstring states "Both estimators enforce a minimum of 1 token," but the test only verifies BaseAgent.estimate_token_count. Additionally, the >= 1 assertion is weaker than necessary: per the context snippet, the implementation returns exactly 1 for empty input via max(1, ...).

🔧 Suggested fix
```diff
     def test_minimum_token_count_is_one(self):
         """
-        Both estimators enforce a minimum of 1 token even for empty content.
+        estimate_token_count enforces a minimum of 1 token even for empty content.
         """
         agent = CodePuppyAgent()
 
         result = agent.estimate_token_count("")
-        assert result >= 1
+        assert result == 1
```

Or better, actually test both estimators as the original docstring intended.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agents/test_token_estimation_consistency.py` around lines 45 - 52,
Docstring and assertion are inconsistent: update the test for CodePuppyAgent.
Change the docstring to state that CodePuppyAgent enforces a minimum of 1 token,
and tighten the assertion to expect exactly 1 by replacing "assert result >= 1"
with "assert result == 1" when calling CodePuppyAgent().estimate_token_count("")
(referencing CodePuppyAgent and estimate_token_count).

@RyanLHicks
Contributor Author

Closing — the token estimation fix and tests are included in
PR #250 alongside the ripgrep fallback fix. Consolidating there.

@RyanLHicks RyanLHicks closed this Mar 22, 2026