Chat template from chat_template.jinja for all possible paths #4055

dkalinowski wants to merge 3 commits into main
Conversation
694d391 to 1a7437e
Pull request overview
This PR updates the LLM/VLM servable initialization flow to allow overriding the tokenizer chat template from a chat_template.jinja file located in the model path, making that override available across multiple pipeline initializers.
Changes:
- Add logic to detect and read `chat_template.jinja` from the model path and call `tokenizer.set_chat_template(...)`.
- Add the `<fstream>` include where needed to support reading the template file.
- Apply the same override behavior across the legacy LM, continuous batching LM, and legacy VLM initializers.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/llm/visual_language_model/legacy/servable_initializer.cpp | Reads chat_template.jinja (if present) after creating the VLM pipeline/tokenizer and applies it to the tokenizer. |
| src/llm/language_model/legacy/servable_initializer.cpp | Reads chat_template.jinja (if present) after creating the legacy LLM pipeline/tokenizer and applies it to the tokenizer. |
| src/llm/language_model/continuous_batching/servable_initializer.cpp | Reads chat_template.jinja (if present) after creating the CB pipeline/tokenizer and applies it to the tokenizer. |
```cpp
if (chatTemplateFile.is_open()) {
    std::string chatTemplateContent((std::istreambuf_iterator<char>(chatTemplateFile)),
                                    std::istreambuf_iterator<char>());
    if (!chatTemplateContent.empty()) {
```
This block uses std::istreambuf_iterator (and std::filesystem::path), but the file doesn’t include the corresponding standard headers (<iterator> / <filesystem>). Please add the missing includes to avoid relying on transitive headers (can break on different libstdc++/libc++ versions).
```cpp
// Override chat template from chat_template.jinja file if present in model directory
std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
if (std::filesystem::exists(chatTemplateJinjaPath)) {
    std::ifstream chatTemplateFile(chatTemplateJinjaPath);
    if (chatTemplateFile.is_open()) {
        std::string chatTemplateContent((std::istreambuf_iterator<char>(chatTemplateFile)),
                                        std::istreambuf_iterator<char>());
        if (!chatTemplateContent.empty()) {
            properties->tokenizer.set_chat_template(chatTemplateContent);
            SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Loaded custom chat template from: {}", chatTemplateJinjaPath.string());
        }
    } else {
        SPDLOG_LOGGER_WARN(llm_calculator_logger, "Failed to open chat template file: {}", chatTemplateJinjaPath.string());
    }
}
```
The same “read chat_template.jinja and set tokenizer chat template” logic is duplicated across multiple servable initializers in this PR. To prevent drift (e.g., different logging/edge-case handling), consider extracting it into a shared helper (e.g., on GenAiServableInitializer).
```cpp
properties->tokenizer = properties->pipeline->get_tokenizer();

// Override chat template from chat_template.jinja file if present in model directory
std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
```
chatTemplateJinjaPath is derived as parsedModelsPath / "chat_template.jinja". Since parseModelsPath() accepts .gguf files as a valid models_path, this won’t locate a template stored next to a GGUF file (it will incorrectly search under <file>.gguf/chat_template.jinja). Consider basing the search directory on parsedModelsPath if it’s a directory, otherwise use parent_path().
Suggested change:

```diff
-std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
+std::filesystem::path modelsPathFs(parsedModelsPath);
+std::filesystem::path chatTemplateDir = modelsPathFs;
+if (!chatTemplateDir.empty() && !std::filesystem::is_directory(chatTemplateDir)) {
+    chatTemplateDir = chatTemplateDir.parent_path();
+}
+std::filesystem::path chatTemplateJinjaPath = chatTemplateDir / "chat_template.jinja";
```
```cpp
if (chatTemplateFile.is_open()) {
    std::string chatTemplateContent((std::istreambuf_iterator<char>(chatTemplateFile)),
                                    std::istreambuf_iterator<char>());
    if (!chatTemplateContent.empty()) {
```
This code uses std::istreambuf_iterator, but there is no explicit #include <iterator> in this translation unit. Please add it to ensure the file compiles without relying on indirect includes.
```cpp
std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
if (std::filesystem::exists(chatTemplateJinjaPath)) {
    std::ifstream chatTemplateFile(chatTemplateJinjaPath);
    if (chatTemplateFile.is_open()) {
        std::string chatTemplateContent((std::istreambuf_iterator<char>(chatTemplateFile)),
                                        std::istreambuf_iterator<char>());
        if (!chatTemplateContent.empty()) {
```
This block uses std::istreambuf_iterator / std::filesystem::path, but the file doesn’t include the corresponding standard headers (<iterator> / <filesystem>). Please add the missing includes to avoid relying on transitive headers.
```cpp
properties->tokenizer = properties->pipeline->get_tokenizer();

// Override chat template from chat_template.jinja file if present in model directory
std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
```
chatTemplateJinjaPath is built by appending chat_template.jinja to parsedModelsPath. However parseModelsPath() allows models_path to be either a directory or a .gguf file; when it’s a file, this will look for <file>.gguf/chat_template.jinja and never find the template next to the model. Consider resolving the “template directory” as parsedModelsPath when it’s a directory, otherwise parent_path() (and use that for the lookup).
Suggested change:

```diff
-std::filesystem::path chatTemplateJinjaPath = std::filesystem::path(parsedModelsPath) / "chat_template.jinja";
+std::filesystem::path modelsPathFs(parsedModelsPath);
+std::filesystem::path chatTemplateDir =
+    std::filesystem::is_directory(modelsPathFs) ? modelsPathFs : modelsPathFs.parent_path();
+std::filesystem::path chatTemplateJinjaPath = chatTemplateDir / "chat_template.jinja";
```