How to implement that is written in the doc. I can't find code reference. #3448

miyannishar · 2025-11-07T05:23:42Z

How do I implement these evaluation types for the tool trajectory?

llalitkumarrr · 2025-11-07T07:22:35Z

Maintainer

Could you please elaborate on your question a bit more so that we can understand your query clearly and assist you better?

miyannishar · 2025-11-07T07:27:27Z

Author

I am evaluating my agentic system with pytest, and I want to test whether the agent calls its tools in a particular sequence. The docs mention that an agent's tool calls can be evaluated using exact match, in-order match, any-order match, precision, and so on, but for implementation there is only one criterion, tool_trajectory_avg_score, and it only checks for an exact match. My question is how to evaluate tool calls based on any-order match instead of exact match only.

I am using LiteLLM, so using LLMAsAJudge is not an option for me.

llalitkumarrr · 2025-11-07T08:47:03Z

Maintainer

Hello @miyannishar,

Here on this page, you can find all the Evaluation Criteria available with ADK. If the tool_trajectory_avg_score metric doesn’t meet your requirement, you could explore rubric_based_tool_use_quality_v1. It is important to note that this evaluation method uses an LLM as a judge.

At present, these two are the recommended criteria for evaluating tool trajectories.
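
If neither criterion fits and an LLM judge is not an option, one possible fallback is to compare the tool calls yourself inside a pytest test. The following is only a minimal sketch in plain Python, not part of ADK's API: the `ToolCall` record and the helpers `any_order_match` and `in_order_match` are hypothetical names, and the sketch assumes you have already collected the agent's tool calls as name/arguments pairs.

```python
import json
from collections import Counter
from typing import Any, NamedTuple


class ToolCall(NamedTuple):
    """Hypothetical record of one tool call: tool name plus its arguments."""
    name: str
    args: dict[str, Any]


def _key(call: ToolCall) -> tuple[str, str]:
    # Serialize args with sorted keys so that equal dicts give equal, hashable keys.
    return call.name, json.dumps(call.args, sort_keys=True)


def any_order_match(actual: list[ToolCall], expected: list[ToolCall]) -> bool:
    """True if every expected call occurs in `actual`, regardless of order.

    Extra calls in `actual` are tolerated; a call expected twice must occur
    at least twice.
    """
    have = Counter(_key(c) for c in actual)
    need = Counter(_key(c) for c in expected)
    return all(have[k] >= n for k, n in need.items())


def in_order_match(actual: list[ToolCall], expected: list[ToolCall]) -> bool:
    """True if `expected` is a subsequence of `actual` (extra calls tolerated)."""
    remaining = iter(_key(c) for c in actual)
    # `k in remaining` consumes the iterator up to the first hit, so each
    # expected call must be found after the previous one.
    return all(k in remaining for k in (_key(c) for c in expected))
```

In a test you would then write something like `assert any_order_match(actual_calls, expected_calls)`, where `actual_calls` is whatever list of calls you extracted from the agent run. Whether extra calls should be tolerated is a design choice; tighten the comparison to `have == need` if they should not be.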

hangfei · 2025-11-07T23:47:24Z

Maintainer

@adk-bot

adk-bot · 2025-11-07T23:48:12Z

Maintainer

Response from ADK Answering Agent (experimental, answer may be inaccurate)

TLDR: To implement tool trajectory evaluations, you need to create a JSON configuration file (e.g., test_config.json) and specify the tool_trajectory_avg_score criterion. You can then run the evaluation using the adk eval command-line tool, the adk web UI, or programmatically with pytest.

Hello! You've asked a great question about how to implement the evaluation types mentioned in the ADK documentation.

You are correct that the documentation describes what to evaluate. The how is handled by creating a JSON configuration file to specify which evaluation criteria you want to use. For the "Exact match" on tool trajectory that you referenced, the criterion is named tool_trajectory_avg_score.

Here is an example of what your test_config.json file would look like to enable this check [2]:

{
  "criteria": {
    "tool_trajectory_avg_score": 1.0
  }
}

By setting tool_trajectory_avg_score to 1.0, you are asserting that the tool calls in your test case must exactly match the expected tool calls in both name and arguments to pass.
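
To make the threshold semantics concrete, here is a rough sketch of how an averaged exact-match score could behave. It is an illustration under assumptions, not the code in trajectory_evaluator.py: the function names `trajectory_avg_score` and `passes` are hypothetical, each tool call is modeled as a `(name, args)` tuple, and each turn is assumed to score 1.0 on an exact match and 0.0 otherwise.

```python
from typing import Any

# Hypothetical representation: one tool call is (tool_name, arguments).
ToolCall = tuple[str, dict[str, Any]]


def trajectory_avg_score(
    actual_turns: list[list[ToolCall]],
    expected_turns: list[list[ToolCall]],
) -> float:
    """Average of per-turn exact-match scores (1.0 for a match, 0.0 otherwise)."""
    if len(actual_turns) != len(expected_turns):
        raise ValueError("actual and expected must have the same number of turns")
    if not expected_turns:
        return 0.0
    # List equality checks names, arguments, order, and number of calls.
    scores = [1.0 if a == e else 0.0 for a, e in zip(actual_turns, expected_turns)]
    return sum(scores) / len(scores)


def passes(score: float, threshold: float = 1.0) -> bool:
    """With a threshold of 1.0, every turn must match exactly."""
    return score >= threshold
```

Under these assumptions, a run in which one of two turns deviates scores 0.5 and fails a threshold of 1.0, while lowering the threshold to 0.5 would let it pass.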

Once you have your configuration file, you can run the evaluation in one of three ways [1]:

Command-Line: Use the adk eval command.
Web UI: Launch the ADK web interface with adk web.
Programmatically: Run the evaluation using pytest.

For a code reference, the logic for the tool trajectory evaluation is implemented in the trajectory_evaluator.py file. This file contains the code that compares the predicted tool calls against the golden tool calls [3].

You can also see a practical example in the BigQuery agent integration test, which uses adk eval with a test_config.json file to run evaluations [4].

I hope this helps you get started with implementing your evaluations!

[1] https://google.github.io/adk-docs/evaluate/
[2] https://google.github.io/adk-docs/evaluate/criteria/
[3] https://github.com/google/adk-python/blob/main/src/google/adk/evaluation/trajectory_evaluator.py
[4] https://github.com/google/adk-python/blob/main/tests/integration/fixture/bigquery_agent/README.md
