Add log manager to track experiement logs by pan-x-c · Pull Request #507 · agentscope-ai/Trinity-RFT

pan-x-c · 2026-02-11T11:09:37Z

Description

Use the `trinity log` command to track logs during training.

Checklist

Please check the following items before code is ready to be reviewed.

- Code has passed all tests
- Docstrings have been added/updated in Google Style
- Documentation has been updated
- Code is ready for review

gemini-code-assist · 2026-02-11T11:09:53Z

Summary of Changes

Hello @pan-x-c, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the trinity CLI by adding a powerful new log command for real-time monitoring of experiment logs. This new utility allows users to easily track training progress and debug issues by providing flexible options for filtering, searching, and displaying log data. The changes also include a minor refactoring in the configuration handling to streamline directory path generation.

Highlights

- New CLI Command: Introduced a new `trinity log` command for real-time monitoring of experiment logs, offering options for directory specification, configuration loading, keyword filtering, log level control, displaying last N lines, search patterns, and color output.
- Log Management System: Added a comprehensive `LogManager` and `LogFileTracker` system to efficiently handle multiple log files, including features like file rotation detection, log level parsing, and continuous monitoring.
- Configuration Refactoring: Refactored the `Config` class to include a dedicated method `get_checkpoint_job_dir`, centralizing the logic for determining checkpoint job directory paths and improving code reusability.
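To make "file rotation detection" concrete, here is a minimal sketch of one common way to follow a log file across rotations. It is not the `LogFileTracker` from this PR; the class name `TailTracker` and its `poll` method are hypothetical, and the inode comparison assumes a POSIX-style filesystem.

```python
import os
from typing import List, Optional


class TailTracker:
    """Hypothetical tracker that returns complete lines appended since the last poll."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offset = 0  # byte offset of the next unread line
        self.inode: Optional[int] = None

    def poll(self) -> List[str]:
        try:
            stat = os.stat(self.path)
        except FileNotFoundError:
            return []  # the file may reappear after a rotation
        # A different inode or a file shorter than our offset suggests
        # rotation or truncation, so start again from the beginning.
        if self.inode is not None and (
            stat.st_ino != self.inode or stat.st_size < self.offset
        ):
            self.offset = 0
        self.inode = stat.st_ino
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        # Consume only up to the last newline so that a half-written
        # line is left for the next poll.
        end = chunk.rfind(b"\n") + 1
        self.offset += end
        return chunk[:end].decode("utf-8", errors="replace").splitlines()
```

Calling `poll()` in a loop with a short sleep would give a `tail -f`-like view; a truncation that leaves the file at least as long as the stored offset would go unnoticed by this simple check.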

Changelog

trinity/cli/launcher.py
- Added a new log command to the typer application, enabling real-time log monitoring with various configurable options.
trinity/common/config.py
- Introduced a new method get_checkpoint_job_dir to the Config class for consistent retrieval of checkpoint job directory paths.
trinity/common/config_validator.py
- Updated the validate method to utilize the newly added get_checkpoint_job_dir method for setting the checkpoint job directory.
trinity/manager/log_manager.py
- Added a new module containing LogManager and LogFileTracker classes, providing core functionality for real-time log file tracking, parsing, and display.
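As an illustration of the log level parsing and keyword filtering mentioned above, the following sketch filters lines by minimum level. The line format (a standard level name appearing as a whole word) and the names `parse_level` and `filter_lines` are assumptions for this example, not the module's actual API.

```python
import logging
import re
from typing import Iterable, Iterator, Optional

# Assumed format: each log record contains a standard level name as a word.
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def parse_level(line: str) -> Optional[int]:
    """Return the numeric logging level found in the line, or None."""
    match = LEVEL_RE.search(line)
    return getattr(logging, match.group(1)) if match else None


def filter_lines(
    lines: Iterable[str],
    min_level: int = logging.INFO,
    keyword: Optional[str] = None,
) -> Iterator[str]:
    """Yield lines at or above min_level that contain the keyword, if given."""
    last_level = min_level  # lines seen before any level marker are kept
    for line in lines:
        level = parse_level(line)
        if level is None:
            # Continuation lines (e.g. tracebacks) inherit the previous level.
            level = last_level
        else:
            last_level = level
        if level < min_level:
            continue
        if keyword is not None and keyword not in line:
            continue
        yield line
```

Letting continuation lines inherit the previous level keeps a traceback attached to the `ERROR` record that produced it.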

Activity

No human activity has been recorded on this pull request yet.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a new trinity log command to monitor experiment logs in real-time. The core logic is encapsulated in a new LogManager. The changes also include a small refactoring in the Config class to centralize the construction of the checkpoint job directory path.

My review focuses on improving robustness and performance. I've identified a potential high-memory usage issue in the log searching feature, a case of silent exception handling that could hide bugs, and an opportunity to simplify some redundant code in the new CLI command. Overall, this is a great addition for improving observability during training.
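The high-memory concern in log searching can be illustrated with a generic pattern: stream the file and keep only a bounded window of matches rather than loading every line. This is a sketch of that general technique, not the change made in this PR; the function name `last_matching_lines` is hypothetical.

```python
import re
from collections import deque
from typing import Deque, Iterable, List, Optional


def last_matching_lines(
    lines: Iterable[str], n: int, pattern: Optional[str] = None
) -> List[str]:
    """Return the last n lines matching pattern while holding at most n in memory."""
    regex = re.compile(pattern) if pattern else None
    window: Deque[str] = deque(maxlen=n)  # older matches are dropped automatically
    for line in lines:
        if regex is None or regex.search(line):
            window.append(line.rstrip("\n"))
    return list(window)
```

Passing an open file object as `lines` reads it one line at a time, so memory use depends on `n` rather than on the size of the log.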

trinity/manager/log_manager.py

trinity/cli/launcher.py

trinity/manager/log_manager.py

pan-x-c · 2026-02-12T04:22:58Z

/unittest-diff

github-actions · 2026-02-12T04:59:36Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
80	79	0	1	0	0	34m 17s

Skipped

Tests	Status
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	skipped ⏭️

Tests

Test Name	Status	Duration
tests/cli/launcher_test.py::TestLauncherMain::test_debug_mode	✅	47.7s
tests/cli/launcher_test.py::TestLauncherMain::test_log_mode	✅	424ms
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_command	✅	6.5s
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_in_dlc	✅	1.6s
tests/cli/launcher_test.py::TestLauncherMain::test_main_studio_command	✅	911ms
tests/cli/launcher_test.py::TestLauncherMain::test_multi_stage_run	✅	15.0s
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	32.4s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	342ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	32ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	385ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	298ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	5.4s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	317ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	310ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	376ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	16ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes	✅	1ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_column_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_block_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	56.8s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	39.6s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	38.4s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	27.7s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	32.1s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	27.6s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	25.5s
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation	✅	26.7s
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status	✅	27.1s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	28.7s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	27.2s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	28.4s
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	⏭️	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	244ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	229ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	31.8s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	28.5s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	2m 39s
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api	✅	41.0s
tests/manager/log_manager_test.py::TestLogManager::test_file_rotation	✅	2ms
tests/manager/log_manager_test.py::TestLogManager::test_init_and_tracking	✅	1ms
tests/manager/log_manager_test.py::TestLogManager::test_keyword_filter_and_search_pattern	✅	1ms
tests/manager/synchronizer_test.py::TestSynchronizerExit_0::test_synchronizer	✅	2m 27s
tests/manager/synchronizer_test.py::TestSynchronizerExit_1::test_synchronizer	✅	2m 28s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_0::test_synchronizer	✅	2m 3s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_1::test_synchronizer	✅	1m 40s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_2::test_synchronizer	✅	2m 1s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_3::test_synchronizer	✅	2m 37s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_4::test_synchronizer	✅	2m 19s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_5::test_synchronizer	✅	2m 39s
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_0::test_synchronizer	✅	1m 8s
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_1::test_synchronizer	✅	1m 2s
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_2::test_synchronizer	✅	1m 3s
tests/manager/synchronizer_test.py::TestPullLatestWeights::test_no_new_version_logs_warning	✅	4ms
tests/manager/synchronizer_test.py::TestPullLatestWeights::test_pull_latest_weights_0	✅	2ms
tests/manager/synchronizer_test.py::TestPullLatestWeights::test_pull_latest_weights_1	✅	3ms
tests/manager/synchronizer_test.py::TestPullLatestWeights::test_pull_latest_weights_2	✅	2ms
tests/manager/synchronizer_test.py::TestPullLatestWeights::test_pull_latest_weights_3	✅	2ms

Github Test Reporter by CTRF 💚

pan-x-c · 2026-02-12T05:00:43Z

/unittest-all

pan-x-c · 2026-02-12T06:36:32Z

/unittest-trainer

pan-x-c · 2026-02-12T06:37:13Z

/unittest-module-trainer

pan-x-c · 2026-02-12T07:10:54Z

/unittest-module-trainer

github-actions · 2026-02-12T08:08:30Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
27	24	0	3	0	0	46m 57s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	3m 57s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	5m 1s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 44s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 15s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 1s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 6s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 12s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	⏭️	1ms
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	33.9s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	32.1s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	29.4s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 37s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 36s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 24s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 53s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	5m 45s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	1m 53s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer	✅	1m 48s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	✅	2m 39s
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	✅	1m 8s
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 7s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 11s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	45.2s
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer	⏭️	1ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class	⏭️	1ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner	✅	1m 12s
tests/trainer/trainer_test.py::ColocateModeTest::test_trainer	✅	1m 55s

pan-x-c · 2026-02-12T08:10:28Z

/unittest-module-explorer

github-actions · 2026-02-12T08:26:12Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
49	49	0	0	0	0	13m 20s

Tests

Test Name	Status	Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	1m 48s
tests/explorer/explorer_test.py::TestExplorerEvalDetailedStats::test_explorer	✅	1m 12s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	1m 6s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	3m
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	54.6s
tests/explorer/proxy_test.py::RecorderTest::test_recorder	✅	62ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	5.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	4.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	13.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	29.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	4.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	4.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	4.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	4.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	5.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	4.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	12.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	14.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	8.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	8.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	25.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	7.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	13.6s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	9.7s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	28ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	17ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	131ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	3ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	11ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	7ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	23.4s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	23.6s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0	✅	785ms
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1	✅	15ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	137ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	8.0s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai	✅	26.6s
tests/explorer/workflow_test.py::TestConcurrentWorkflowRunner::test_concurrent_workflow_runner	✅	44.6s

pan-x-c · 2026-02-12T08:31:19Z

/unittest-module-buffer

pan-x-c · 2026-02-12T08:31:38Z

/unittest-module-algorithm

github-actions · 2026-02-12T08:35:22Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
48	48	0	0	0	0	1m 42s

Tests

Test Name	Status	Duration
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	10.6s
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation	✅	6.0s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	✅	2.4s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	✅	4.3s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	✅	4.8s
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	434ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	1.5s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	557ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	481ms
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter	✅	841ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	997ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	766ms
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	229ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	6.3s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	2.1s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control	✅	4.1s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	3.3s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	3.1s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	3.8s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration	✅	752ms
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	6ms
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_default_sample_strategy	✅	1.8s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_staleness_control_sample_strategy	✅	1.5s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_default_sample_strategy	✅	1.6s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_staleness_control_sample_strategy	✅	1.7s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_sql_staleness_control_sample_strategy	✅	4.3s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_default_sample_strategy	✅	2.0s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_staleness_control_sample_strategy	✅	1.7s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_default_sample_strategy	✅	1.7s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_staleness_control_sample_strategy	✅	1.6s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_sql_staleness_control_sample_strategy	✅	3.4s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_0	✅	5.5s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_1	✅	2.0s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write	✅	2.7s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0	✅	72ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1	✅	58ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2	✅	90ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3	✅	90ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4	✅	92ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5	✅	93ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6	✅	107ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple	✅	46ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file	✅	288ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql	✅	2.8s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file	✅	41ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql	✅	2.6s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file	✅	41ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql	✅	3.2s

github-actions · 2026-02-12T08:37:53Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
27	27	0	0	0	0	3.1s

Tests

Test Name	Status	Duration
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_std_grpo	✅	6ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_step_wise_grpo_advantage	✅	3ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_duplicate_grpo	✅	4ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_advantage	✅	3ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_correct_bias	✅	2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_reward_std	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_advantage	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_with_std_threshold	✅	2ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_abs_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_fallback	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_loss	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_same_policy	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_with_old_logprob	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_dummy_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k1_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k2_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k3_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_kl_loss_aggregation_modes	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_low_var_kl_fn	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss	✅	2ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_gspo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss	✅	3ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss_with_sequence_masking	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sapo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss	✅	1ms

pan-x-c · 2026-02-12T09:38:20Z

/unittest-module-common

github-actions · 2026-02-12T09:52:24Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
55	54	0	1	0	0	11m 36s

Skipped

Tests	Status
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	skipped ⏭️

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	22.7s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	77ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	32ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	161ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	76ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	1.2s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	76ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	76ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1.7s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	14ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes	✅	1ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_column_violation	✅	1ms
tests/common/sudoku_test.py::test_judge_detects_block_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution	✅	1ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation	✅	1ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	57.6s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	39.8s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	46.6s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	32.5s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	26.2s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	26.7s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	26.9s
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation	✅	25.1s
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status	✅	27.2s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	29.9s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	26.2s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	28.1s
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	⏭️	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	239ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	254ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	32.1s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	31.1s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	2m 49s
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api	✅	40.0s

pan-x-c added 3 commits February 11, 2026 14:50

add log monitor

2d463ab

add log monitor

4641738

Merge branch 'main' into feature/log_manager

3092f73

gemini-code-assist bot reviewed Feb 11, 2026

trinity/manager/log_manager.py (resolved)

trinity/cli/launcher.py (resolved)

trinity/manager/log_manager.py (outdated, resolved)

pan-x-c added 8 commits February 11, 2026 19:15

fix comments

45e35da

update command

ff323d5

add logger doc

310fe56

fix pre-commit

add5a1e

add tests

9234538

add log manager tests

beacf28

pre-release 0.5.1

ef21f45

fix typo

d47ee15

Merge branch 'main' into feature/log_manager

19031f9

fix tests

25c3a68

pan-x-c changed the title ~~Add log manager to tracker experiement logs~~ Add log manager to track experiement logs Feb 12, 2026

skip unstable tests

514dd7c

Merge branch 'main' into feature/log_manager

aa94f2f

reduce min token len

ff1e579

yanxi-chen approved these changes Feb 12, 2026

yanxi-chen merged commit c3d356c into agentscope-ai:main Feb 12, 2026
2 checks passed

2 participants