Merged
35 commits
080cb84
Move ATIF models out of core
AnuradhaKaruppiah Apr 2, 2026
c1e5c6a
Move EvalOutput to nvidia-nat-eval
AnuradhaKaruppiah Apr 2, 2026
285ea9e
Update eval deps
AnuradhaKaruppiah Apr 2, 2026
8ab3ccb
Add hint guards
AnuradhaKaruppiah Apr 2, 2026
67d7396
Update notebooks
AnuradhaKaruppiah Apr 2, 2026
71a1b45
Update tests
AnuradhaKaruppiah Apr 2, 2026
6377f03
Update uv lock
AnuradhaKaruppiah Apr 2, 2026
82eecd6
Update docs
AnuradhaKaruppiah Apr 2, 2026
c5577e8
Create a shared contracts package
AnuradhaKaruppiah Apr 2, 2026
8205c72
Update uv lock files
AnuradhaKaruppiah Apr 2, 2026
63d0cac
Drop test sections from contract packages
AnuradhaKaruppiah Apr 2, 2026
c8252c1
uv lock update
AnuradhaKaruppiah Apr 2, 2026
8a90e77
Update stale imports
AnuradhaKaruppiah Apr 2, 2026
cdace7e
Fix path checks
AnuradhaKaruppiah Apr 2, 2026
607b94f
Drop shared_contracts package
AnuradhaKaruppiah Apr 2, 2026
b46b8ec
Update uv locks
AnuradhaKaruppiah Apr 2, 2026
0bdc818
Limit BaseEvaluator availability to the full package
AnuradhaKaruppiah Apr 2, 2026
cb4bbd5
Remove pydantic as a direct dep
AnuradhaKaruppiah Apr 2, 2026
879ad7e
Add an eval-full extra
AnuradhaKaruppiah Apr 2, 2026
283515b
update uv locks and docs
AnuradhaKaruppiah Apr 2, 2026
f11e211
Make dependencies in nvidia-nat-core dynamic to allow inclusion of atif
AnuradhaKaruppiah Apr 2, 2026
1715109
Fix F401 that came from importing BaseEvaluator only for side-effect …
AnuradhaKaruppiah Apr 2, 2026
38099ec
Add eval-base and eval-full extras
AnuradhaKaruppiah Apr 2, 2026
66e8b99
Update stale references
AnuradhaKaruppiah Apr 2, 2026
6e4b52b
Update docs/source/get-started/installation.md
AnuradhaKaruppiah Apr 2, 2026
4175aff
Update docs/source/improve-workflows/evaluate.md
AnuradhaKaruppiah Apr 2, 2026
fe4f5d0
Update docs in response to review comments
AnuradhaKaruppiah Apr 2, 2026
7fd7ade
Updated reference to uv
AnuradhaKaruppiah Apr 2, 2026
637429d
Fix vale warnings
AnuradhaKaruppiah Apr 3, 2026
b73421d
pre-commit run fixes
AnuradhaKaruppiah Apr 3, 2026
f3bdc49
Temporarily remove cursor URLs as they are resulting in CI failures w…
AnuradhaKaruppiah Apr 3, 2026
342484c
Merge remote-tracking branch 'upstream/develop' into ak-eval-no-core
AnuradhaKaruppiah Apr 3, 2026
5589f7b
Revert "Temporarily remove cursor URLs as they are resulting in CI fa…
AnuradhaKaruppiah Apr 3, 2026
703a6fd
Add eval to nat-weave's test deps
AnuradhaKaruppiah Apr 3, 2026
f6905d3
skip nat cli version checks if nat is not installed by a package
AnuradhaKaruppiah Apr 3, 2026
18 changes: 11 additions & 7 deletions ci/scripts/github/build_wheel.sh
@@ -113,13 +113,17 @@ for whl in "${MOVED_WHEELS[@]}"; do
exit ${IMPORT_TEST_RESULT}
fi

-REPORTED_VERSION=$(nat --version 2>&1)
-NAT_CMD_EXIT_CODE=$?
-
-if [[ ${NAT_CMD_EXIT_CODE} -ne 0 ]]; then
-    rapids-logger "Error 'nat --version' command failed exit code ${NAT_CMD_EXIT_CODE} from wheel ${whl} with Python ${pyver}"
-    echo "${REPORTED_VERSION}"
-    exit ${NAT_CMD_EXIT_CODE}
+if command -v nat >/dev/null 2>&1; then
+    REPORTED_VERSION=$(nat --version 2>&1)
+    NAT_CMD_EXIT_CODE=$?
+
+    if [[ ${NAT_CMD_EXIT_CODE} -ne 0 ]]; then
+        rapids-logger "Error 'nat --version' command failed exit code ${NAT_CMD_EXIT_CODE} from wheel ${whl} with Python ${pyver}"
+        echo "${REPORTED_VERSION}"
+        exit ${NAT_CMD_EXIT_CODE}
+    fi
+else
+    rapids-logger "Skipping nat CLI test; 'nat' command not installed by wheel ${whl}"
fi
else
rapids-logger "Skipping nat CLI test for nvidia_nat_app (framework-agnostic package); verifying nat_app import"
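
The guard in this hunk treats a missing `nat` executable as a skip rather than a failure, while still failing when the executable exists but `--version` exits non-zero. A minimal Python sketch of the same pattern follows, for illustration only. `cli_version_or_none` is a hypothetical helper and is not part of the repository or the CI script.

```python
import shutil
import subprocess
from typing import Optional


def cli_version_or_none(command: str = "nat") -> Optional[str]:
    """Return the CLI's reported version, or None when the command is not on PATH.

    Hypothetical helper: it mirrors the shell `command -v` guard, so a missing
    CLI is reported as a skip (None) instead of an error.
    """
    executable = shutil.which(command)
    if executable is None:
        # The package under test did not install this command; nothing to check.
        return None

    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # The command exists but is broken: surface the failure with its output.
        raise RuntimeError(
            f"'{command} --version' failed with exit code {result.returncode}: "
            f"{result.stdout}{result.stderr}"
        )
    return result.stdout.strip()
```

Returning `None` for the skip case keeps "not installed" distinct from "installed but failing", which is the distinction the shell change introduces.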
5 changes: 4 additions & 1 deletion docs/source/extend/custom-components/custom-evaluator.md
@@ -154,6 +154,9 @@ You can also author a custom evaluator that only implements ATIF-native scoring
When using `AtifBaseEvaluator`, implement `evaluate_atif_item` and reuse the built-in concurrent `evaluate_atif_fn`.
This is useful when your scoring logic consumes canonical ATIF trajectories directly.

+This example uses evaluator registration (`@register_evaluator`) and therefore requires full runtime dependencies (`nvidia-nat-eval[full]`).
+Base `nvidia-nat-eval` is sufficient for standalone ATIF harness usage without workflow or plugin registration.
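
A rough way to check which of these two modes an environment supports is to probe for the modules the example imports. The sketch below is an assumption-laden illustration: `module_available` and `eval_install_mode` are hypothetical helpers, and treating `nat.cli.register_workflow` as a marker for registration support is a guess based on the imports in this diff, not something the documentation states.

```python
import importlib.util

# Module paths taken from the imports shown in this diff.
ATIF_MODULE = "nat.plugins.eval.evaluator.atif_base_evaluator"
# Assumption: this module being importable indicates registration support.
REGISTRATION_MODULE = "nat.cli.register_workflow"


def module_available(dotted_name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages; a missing parent raises
        # ModuleNotFoundError, which is a subclass of ImportError.
        return False


def eval_install_mode() -> str:
    """Classify the environment (hypothetical labels, for illustration only)."""
    if not module_available(ATIF_MODULE):
        return "eval package not found"
    if module_available(REGISTRATION_MODULE):
        return "registration-capable"
    return "standalone ATIF only"
```

Calling `eval_install_mode()` before importing `register_evaluator` gives a clearer message than an `ImportError` raised partway through module import.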

The following example registers a minimal ATIF-only cosine-similarity evaluator:
`examples/evaluation_and_profiling/simple_web_query_eval/src/nat_simple_web_query_eval/atif_only_evaluator_register.py`:
```python
@@ -165,8 +168,8 @@ from pydantic import Field
from nat.builder.builder import EvalBuilder
from nat.builder.evaluator import EvaluatorInfo
from nat.cli.register_workflow import register_evaluator
-from nat.data_models.evaluator import EvalOutputItem
from nat.data_models.evaluator import EvaluatorBaseConfig
+from nat.plugins.eval.data_models.evaluator_io import EvalOutputItem
from nat.plugins.eval.evaluator.atif_base_evaluator import AtifBaseEvaluator
from nat.plugins.eval.evaluator.atif_evaluator import AtifEvalSample

3 changes: 2 additions & 1 deletion docs/source/get-started/installation.md
@@ -38,7 +38,8 @@ To install these first-party plugin libraries, you can use the full distribution
- `nvidia-nat[agno]` or `nvidia-nat-agno` - [Agno](https://agno.com/)
- `nvidia-nat[crewai]` or `nvidia-nat-crewai` - [CrewAI](https://www.crewai.com/) Conflicts with `nvidia-nat[openpipe-art]`.
- `nvidia-nat[data-flywheel]` or `nvidia-nat-data-flywheel` - [NeMo DataFlywheel](https://github.com/NVIDIA-AI-Blueprints/data-flywheel)
-- `nvidia-nat[eval]` or `nvidia-nat-eval` - Evaluation orchestration package
+- `nvidia-nat[eval]` or `nvidia-nat-eval[full]` - Full evaluation runtime dependencies for config-driven `nat eval` workflows
+- `nvidia-nat-eval` - Evaluation package for ATIF-native and standalone custom evaluator workflows
- `nvidia-nat[langchain]` or `nvidia-nat-langchain` - [LangChain](https://www.langchain.com/), [LangGraph](https://www.langchain.com/langgraph)
- `nvidia-nat[llama-index]` or `nvidia-nat-llama-index` - [LlamaIndex](https://www.llamaindex.ai/)
- `nvidia-nat[mcp]` or `nvidia-nat-mcp` - [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
13 changes: 12 additions & 1 deletion docs/source/improve-workflows/evaluate.md
@@ -23,7 +23,12 @@ NeMo Agent Toolkit provides a set of evaluators to run and evaluate workflows. I

## Prerequisites

-In addition to the base `nvidia-nat` package, you need to install the evaluation package to use `nat eval`. Install the evaluation extra package with one of the following commands, depending on whether you installed the NeMo Agent Toolkit from source or from a package.
+Choose the installation mode that matches your evaluation workflow:
+
+- Standalone ATIF evaluation (`EvaluationHarness` plus ATIF-native custom evaluators): install base `nvidia-nat-eval`.
+- Full `nat eval` runtime (workflow execution, dataset readers such as `csv`/`parquet`/`xls`, and config-driven evaluators): install `nvidia-nat[eval]`.
+
+For source installs:

::::{tab-set}
:sync-group: install-tool
@@ -50,6 +55,12 @@ uv pip install "nvidia-nat-eval"
::::


+For package installs, use the NeMo Agent Toolkit `metapackage` to run `nat eval`:
+
+```bash
+uv pip install "nvidia-nat[eval]"
+```
+
If you plan to run profiling via `nat eval` (for example, when `eval.general.profiler` is enabled), install the profiler package as well:

::::{tab-set}
6 changes: 5 additions & 1 deletion docs/source/reference/cli.md
@@ -555,7 +555,11 @@ nat serve --config_file=path/to/config --host 0.0.0.0 --port 8000
The Swagger API docs will be available at: [http://localhost:8000/docs](http://localhost:8000/docs)

## Evaluation
-The `nat eval` command is provided by the `nvidia-nat-eval` package. Install evaluation support with `pip install "nvidia-nat[eval]"` or `pip install nvidia-nat-eval`.
+The `nat eval` command is provided by the `nvidia-nat-eval` package.
+
+For full config-driven `nat eval` runtime paths, install `uv pip install "nvidia-nat[eval]"`.
+
+For ATIF-native standalone custom-evaluator paths, install `uv pip install nvidia-nat-eval`.

The `nat eval` command provides access to a set of evaluators designed to assess the accuracy of NeMo Agent Toolkit workflows as
well as to instrument their performance characteristics. Please reference
5 changes: 3 additions & 2 deletions docs/source/resources/migration-guide.md
@@ -58,7 +58,7 @@ This is a breaking change:

To migrate:
- Install both packages when using these evaluators:
-  - `pip install nvidia-nat-eval nvidia-nat-langchain`
+  - `pip install "nvidia-nat-eval" nvidia-nat-langchain`
- Install the RAGAS evaluator package when using `_type: ragas`:
- `pip install nvidia-nat-ragas`
- Install the profiler package when using performance evaluators or profiling workflows:
@@ -90,7 +90,8 @@ CLI command ownership is now aligned to package domains:

To migrate:
- Install command-specific packages as needed:
-  - `pip install nvidia-nat-eval`
+  - `pip install "nvidia-nat[eval]"`
+  - `pip install "nvidia-nat-eval[full]"`
- `pip install nvidia-nat-profiler`
- `pip install nvidia-nat-security`

@@ -94,7 +94,7 @@ The `--override` flag accepts a dot-notation path into the YAML config hierarchy
## Structured Evaluation Experiments

:::{note}
-The `nat eval` command is provided by the evaluation package. If the command is not available, install the eval extra first:
+The `nat eval` command is provided by the evaluation package. For full config-driven eval paths, install the full eval extra:

```bash
uv pip install -e '.[eval]'
20 changes: 17 additions & 3 deletions examples/A2A/currency_agent_a2a/uv.lock

Some generated files are not rendered by default.