@alessiodevoto
Collaborator

Main proposed changes:

  • added a time-based trigger to TriggerPhraseLP, so that the phrase can be triggered after N seconds
  • added MaxTimeLP, which exponentially increases the probability of the EOS token as generation approaches max_time. Once max_time is reached, generation stops (optionally after the current sentence is completed)
  • the two can be used together to urge the model to answer and then stop generating
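
A rough illustration of the time-based EOS boost described above, not the code merged in this PR. The class name `MaxTimeSketch`, its parameters (`max_boost`, `sharpness`, `clock`) and the exact ramp shape are all assumptions made for the example. It works on plain Python lists instead of tensors, and the optional "complete the sentence" behaviour is not modeled.

```python
import math
import time
from typing import Callable, List


class MaxTimeSketch:
    """Hypothetical stand-in for a time-limited logits processor."""

    def __init__(
        self,
        eos_token_id: int,
        max_time: float,
        max_boost: float = 20.0,   # assumed cap on the EOS logit boost
        sharpness: float = 5.0,    # assumed steepness of the exponential ramp
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.eos_token_id = eos_token_id
        self.max_time = max_time
        self.max_boost = max_boost
        self.sharpness = sharpness
        self.clock = clock
        self.start = clock()  # generation is assumed to start at construction

    def eos_boost(self) -> float:
        """Return the amount to add to the EOS logit at the current time."""
        elapsed = self.clock() - self.start
        if elapsed >= self.max_time:
            return math.inf  # time is up: force EOS
        frac = elapsed / self.max_time
        # Exponential ramp: 0 at frac == 0, max_boost as frac approaches 1.
        ramp = (math.exp(self.sharpness * frac) - 1.0) / (math.exp(self.sharpness) - 1.0)
        return self.max_boost * ramp

    def __call__(self, logits: List[float]) -> List[float]:
        """Return a copy of `logits` with the EOS entry boosted or forced."""
        boost = self.eos_boost()
        if math.isinf(boost):
            # Mask every token except EOS so generation must stop.
            forced = [-math.inf] * len(logits)
            forced[self.eos_token_id] = 0.0
            return forced
        boosted = list(logits)
        boosted[self.eos_token_id] += boost
        return boosted
```

Injecting the clock keeps the sketch testable without sleeping. A time-based trigger for the phrase could reuse the same elapsed-time check, firing once when the elapsed time first exceeds N seconds.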

Tested on all notebooks (and trtllm test files).

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
@aerdem4 aerdem4 self-requested a review July 8, 2025 11:25
@aerdem4 aerdem4 merged commit 59b766e into NVIDIA:main Jul 8, 2025
1 check passed