docs and sim: improve setup and validation workflows by dcol91863 · Pull Request #559 · NVIDIA/Isaac-GR00T

dcol91863 · 2026-02-25T11:27:08Z

Summary

add a minimal CPU inference entry point and docs
add validation tooling smoke coverage and align the validation CLI with its documented flags
clarify custom embodiment workflows for LeRobot v3 to v2.1 conversion, dataset validation, and common metadata errors
pin the RoboCasa GR1 setup script to a stable robosuite release instead of tracking master

Validation

pytest -q tests/test_scripts_smoke.py
python scripts/validate_training_config.py --suggest --dataset demo_data/cube_to_bowl_5
python scripts/validate_training_config.py --check-memory --gpus 8 --batch-size 640
python scripts/validate_training_config.py --generate-command --dataset demo_data/cube_to_bowl_5 --embodiment unitree_g1
python scripts/embodiment_config_reference.py --show unitree_g1
bash -n gr00t/eval/sim/robocasa-gr1-tabletop-tasks/setup_RoboCasaGR1TabletopTasks.sh

Adds a standalone, well-documented example demonstrating core GR00T policy usage without requiring GPU hardware. Perfect for users getting started with GR00T or integrating into custom projects.

Key features:
- Load pre-trained GR00T policy from HuggingFace Hub
- Prepare observations (images + proprioception)
- Get action predictions with error handling
- CPU-only execution for accessibility
- Comprehensive docstrings and step-by-step comments

Includes:
- inference_minimal.py: Annotated example code (158 lines)
- README.md: Complete usage guide (315 lines)
  * Quick start with installation steps
  * Detailed explanation of each phase
  * How to extend for real robots
  * Common issues and troubleshooting
  * Performance optimization tips
- requirements.txt: Core dependencies

This addresses common community questions:
1. 'How do I use GR00T policy for inference?'
2. 'Can I test without a GPU?'
3. 'How do I integrate GR00T into my project?'

Follows established patterns in examples/robocasa, examples/LIBERO, examples/SimplerEnv, ensuring consistency with existing codebase.

Signed-off-by: David <35550068+dcol91863@users.noreply.github.com>

- Add validate_dataset.py: Comprehensive validation tool for GR00T LeRobot format datasets
  * Validates directory structure (meta, videos, data)
  * Checks metadata files (modality.json, episodes.jsonl, tasks.jsonl, info.json)
  * Verifies modality configuration integrity
  * Validates parquet file structure and content
  * Checks video file presence and naming conventions
  * Calculates dataset statistics (episodes, frames, size)
  * Provides detailed error and warning reports
- Add inspect_dataset.py: Detailed inspection and analysis tool
  * Provides comprehensive dataset structure overview
  * Analyzes metadata (episodes, tasks, info)
  * Inspects modality configuration in detail
  * Calculates data statistics (frames, size, file counts)
  * Infers embodiment type hints based on action/state dimensions
  * Supports JSON report export for documentation
- Add README_DATASET_TOOLS.md: Complete documentation
  * Usage instructions for both tools
  * Feature descriptions with examples
  * Common workflows and troubleshooting guide
  * Data format reference

These tools help users validate and understand their datasets before training, ensuring compliance with the GR00T LeRobot format specification.
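The directory and metadata checks listed above can be illustrated with a short sketch. This is not the code of validate_dataset.py: the function name `find_layout_problems` is hypothetical, and the only facts taken from the commit message are the three directory names and the four metadata file names. It also assumes the metadata files live directly under `meta/`.

```python
from pathlib import Path

# Names taken from the commit message; the real tool may check more than this.
REQUIRED_DIRS = ("meta", "videos", "data")
REQUIRED_META_FILES = ("modality.json", "episodes.jsonl", "tasks.jsonl", "info.json")


def find_layout_problems(dataset_root):
    """Return a list of problems; an empty list means the basic layout is present.

    Hypothetical helper for illustration only. It checks existence, not content.
    """
    root = Path(dataset_root)
    problems = []
    for name in REQUIRED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}/")
    for name in REQUIRED_META_FILES:
        # Assumption: metadata files sit directly under meta/.
        if not (root / "meta" / name).is_file():
            problems.append(f"missing metadata file: meta/{name}")
    return problems


if __name__ == "__main__":
    for problem in find_layout_problems("demo_data/cube_to_bowl_5"):
        print(problem)
```

A check like this only confirms that the expected files exist. Parquet contents, video naming, and modality integrity are separate checks that the commit message attributes to the real tool.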

- Add embodiment_config_reference.py: Comprehensive tool for understanding robot embodiments
  * List all available embodiments (pre-trained and post-training)
  * Show detailed configuration for specific embodiments
  * Display state/action dimensions and modality keys
  * View action configurations (RELATIVE/ABSOLUTE, EEF/NON_EEF)
  * Generate configuration templates for new robots
  * Validate custom configuration files
  * Summary table with dimensions for all embodiments

Features:
  * --list: List all embodiments with configuration status
  * --all: Show summary table with state/action dims and video counts
  * --show <embodiment>: Display detailed configuration
  * --template <name>: Generate template for custom robot
  * --validate <file>: Validate configuration file syntax

This tool helps users:
- Understand what embodiments are available and their specs
- Debug configuration issues
- Create custom embodiment configurations
- Compare different robots' configurations

- Update README_DATASET_TOOLS.md with complete documentation
  * Usage examples for all commands
  * Sample outputs showing table format and details
  * Common workflows for different use cases
  * Integration with other dataset tools

- Add validate_training_config.py: Pre-training validation and optimization tool
  * Validate dataset structure and metadata completeness
  * Check embodiment configuration compatibility
  * Validate hyperparameter ranges and values
  * Estimate GPU memory requirements (model + optimizer + batch)
  * Suggest optimal hyperparameters based on dataset size
  * Calculate per-GPU batch sizes for distributed training
  * Provide GPU type recommendations

Features:
  * Dataset validation: directory structure, metadata files, episode count
  * Embodiment validation: tag existence, modality config availability
  * Hyperparameter validation: batch size, learning rate, warmup, weight decay
  * Memory analysis: Model (6GB) + Optimizer (12GB AdamW) + per-sample costs
  * Smart suggestions for batch size, learning rate, and max steps
  * Per-GPU batch size calculation for multi-GPU training

Helps users:
- Catch configuration errors before training
- Optimize batch size and learning rate for their dataset
- Estimate GPU memory needs
- Validate dataset before expensive training runs
- Get rapid feedback on configuration parameters

- Update README_DATASET_TOOLS.md with complete documentation
  * Full usage examples with different scenarios
  * Sample output showing all validation checks
  * GPU memory breakdown explanations
  * Common use cases and workflows
  * Integration with finetuning pipeline
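The memory analysis described above is simple arithmetic, sketched here under stated assumptions. The 6 GB model and 12 GB AdamW figures come from the commit message. The `per_sample_gb` parameter, both function names, and the requirement that the global batch divide evenly across GPUs are hypothetical and may not match what validate_training_config.py does.

```python
MODEL_GB = 6.0       # model weights, figure quoted in the commit message
OPTIMIZER_GB = 12.0  # AdamW optimizer state, figure quoted in the commit message


def per_gpu_batch_size(global_batch_size, num_gpus):
    """Split a global batch evenly across GPUs (even split is an assumption)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if global_batch_size % num_gpus != 0:
        raise ValueError("global batch size must be divisible by num_gpus")
    return global_batch_size // num_gpus


def estimate_gpu_memory_gb(global_batch_size, num_gpus, per_sample_gb):
    """Fixed costs plus a per-sample cost times the per-GPU batch.

    per_sample_gb is a hypothetical input; the commit message does not state it.
    """
    batch = per_gpu_batch_size(global_batch_size, num_gpus)
    return MODEL_GB + OPTIMIZER_GB + batch * per_sample_gb


if __name__ == "__main__":
    # Mirrors the flags in the Validation section: --gpus 8 --batch-size 640
    print(per_gpu_batch_size(640, 8))  # 80
```

With the flags from the Validation section (8 GPUs, global batch 640), each GPU processes 80 samples per step, so the per-sample term is what pushes the estimate past the 18 GB of fixed costs.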

Signed-off-by: David <35550068+dcol91863@users.noreply.github.com>

dcol91863 · 2026-03-01T13:08:46Z

Added a follow-up fix for #408 on this branch.

What changed:

clarified that GR00T custom-embodiment training expects LeRobot v2.1-style metadata/files
documented the v3-to-v2.1 conversion step for hosted LeRobot datasets
added explicit troubleshooting for missing meta/episodes.jsonl and KeyError: 'chunk_index'
updated the SO100 example to validate the converted dataset before training
added the missing jsonlines dependency required by scripts/lerobot_conversion/convert_v3_to_v2.py

This is intended to make the custom-embodiment path actionable for users starting from current Hugging Face LeRobot datasets.
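A quick pre-flight check along these lines can tell a user whether a downloaded dataset still needs conversion before training. This is a sketch, not part of the PR: `describe_metadata_layout` is a hypothetical helper, and reading a `codebase_version` key from `meta/info.json` is an assumption about LeRobot metadata. The only signal taken from the comment above is the presence of `meta/episodes.jsonl`.

```python
import json
from pathlib import Path


def describe_metadata_layout(dataset_root):
    """Report whether a dataset has the v2.1-style episode metadata file.

    Hypothetical helper. A missing meta/episodes.jsonl is treated as the sign
    that scripts/lerobot_conversion/convert_v3_to_v2.py should be run first.
    """
    meta = Path(dataset_root) / "meta"
    version = None
    info_path = meta / "info.json"
    if info_path.is_file():
        with info_path.open() as f:
            # Assumption: info.json records the format version under this key.
            version = json.load(f).get("codebase_version")
    has_episodes_jsonl = (meta / "episodes.jsonl").is_file()
    return {
        "codebase_version": version,
        "has_episodes_jsonl": has_episodes_jsonl,
        "needs_conversion": not has_episodes_jsonl,
    }


if __name__ == "__main__":
    print(describe_metadata_layout("demo_data/cube_to_bowl_5"))
```

The sketch deliberately does not invoke the conversion script, because its command-line arguments are not shown in this thread.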

Signed-off-by: David <35550068+dcol91863@users.noreply.github.com>

dcol91863 · 2026-03-01T13:11:58Z

Added a small follow-up for #551 on this branch.

The RoboCasa GR1 setup script was installing robosuite@master, which is brittle when upstream changes land. It now defaults to v1.5.1 and allows overrides via ROBOSUITE_REF for users who need a different tag or commit.
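The default-with-override behaviour can be sketched as follows. The setup script itself is a shell script, so this Python version only illustrates the logic: the function name is hypothetical, and the repository URL is an assumption about where robosuite is installed from. Only the `ROBOSUITE_REF` variable and the `v1.5.1` default come from the comment above.

```python
import os

DEFAULT_ROBOSUITE_REF = "v1.5.1"  # default stated in the comment above


def robosuite_pip_spec(env=None):
    """Build a pip install target pinned to ROBOSUITE_REF or the default tag.

    Hypothetical helper. An unset or empty ROBOSUITE_REF falls back to the
    default, similar to the shell expansion ${ROBOSUITE_REF:-v1.5.1}.
    """
    env = os.environ if env is None else env
    ref = env.get("ROBOSUITE_REF") or DEFAULT_ROBOSUITE_REF
    # Assumption: this URL may differ from the one used by the setup script.
    return f"git+https://github.com/ARISE-Initiative/robosuite.git@{ref}"


if __name__ == "__main__":
    print(robosuite_pip_spec())
```

Pinning to a tag by default keeps fresh installs reproducible, while the environment variable leaves room to test a newer tag or a specific commit.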

dcol91863 added 6 commits February 25, 2026 11:22

docs: add minimal CPU inference quick-start to README

ba134ea

Signed-off-by: David <35550068+dcol91863@users.noreply.github.com>

scripts: align validation tooling docs and CLI

c20cbf8

Signed-off-by: David <35550068+dcol91863@users.noreply.github.com>

dcol91863 changed the title ~~Add minimal GR00T inference example (no GPU required)~~ scripts: align validation tooling docs and CLI Mar 1, 2026

docs: clarify LeRobot v3 conversion for custom embodiments

e4ca33c

Signed-off-by: David <35550068+dcol91863@users.noreply.github.com>

dcol91863 changed the title ~~scripts: align validation tooling docs and CLI~~ docs: clarify LeRobot conversion and validation workflows Mar 1, 2026

sim: pin robosuite setup to a stable release

3ed8da2

Signed-off-by: David <35550068+dcol91863@users.noreply.github.com>

dcol91863 changed the title ~~docs: clarify LeRobot conversion and validation workflows~~ docs and sim: improve setup and validation workflows Mar 1, 2026

j3soon mentioned this pull request Mar 2, 2026

Pin robosuite to v1.5.1 #564

Open

4 tasks

dcol91863 wants to merge 8 commits into NVIDIA:main from dcol91863:feature/minimal-inference-example
