Running vibevoice-cli on Windows always fails with ModelLoader: read failed for tensor lm.blk.0.attn_norm.weight immediately after the audio is loaded. The build itself succeeds without errors; only inference fails.
The issue reproduces with both vibevoice-asr-q4_k.gguf and vibevoice-asr-q8_0.gguf from the official mudler/vibevoice.cpp-models HF repository.
Reproduce
- Clone the repository and check out v0.1
- Build on Windows with MSYS2 / MinGW-w64 UCRT64:
cmake -B build -G "Ninja" -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release -DGGML_USE_MMAP=OFF
cmake --build build -j
- Copy required DLLs next to the executable:
copy C:\msys64\ucrt64\bin\libwinpthread-1.dll build\bin\
copy C:\msys64\ucrt64\bin\libgcc_s_seh-1.dll build\bin\
copy C:\msys64\ucrt64\bin\libstdc++-6.dll build\bin\
copy C:\msys64\ucrt64\bin\libgomp-1.dll build\bin\
- Download models:
hf download mudler/vibevoice.cpp-models --local-dir models
- Run inference:
.\build\bin\vibevoice-cli.exe asr --model ..\models\vibevoice-asr-q4_k.gguf --tokenizer ..\models\tokenizer.gguf --audio "audio.wav" --max-new-tokens 8192
Expected
The model loads successfully and transcribes the audio.
Actual
asr: loaded 142636 samples (5.94s)
[vv I] backend: CPU
[vv E] ModelLoader: read failed for tensor lm.blk.0.attn_norm.weight
asr: failed to load model
Environment
| Item | Value |
|---|---|
| OS | Windows 11 |
| Compiler | GCC 16.1.0 (MSYS2 UCRT64, MinGW-w64) |
| Branches tested | master (commit ad856bd) and v0.1 tag (commit 8ffffa1) |
| ggml version | 0.10.0 (commit 8be60f83) |
| Model | vibevoice-asr-q4_k.gguf (10,392,063,296 bytes) |
| Model source | hf download mudler/vibevoice.cpp-models |
Additional context
- The error occurs at the very first tensor (lm.blk.0.attn_norm.weight), which suggests a GGUF format mismatch between the uploaded HF models and the current model_loader.cpp parsing logic, or a Windows-specific file reading issue.
- Disabling mmap (-DGGML_USE_MMAP=OFF) does not resolve the issue.
- Building from both the latest master and the v0.1 tag (which was tagged at the same time the models were uploaded to HF) produces the same error.
- The build completes without errors. The CLI starts correctly and loads the audio file successfully; only model loading fails.
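One Windows-specific candidate that would fit the symptom, offered as a guess and not verified against model_loader.cpp: the model file is 10,392,063,296 bytes, and on 64-bit Windows `long` is 32 bits, so `fseek`/`ftell` cannot address offsets past 2 GiB. Whether the loader actually seeks with `long` offsets is an assumption. The sketch below shows a 64-bit-safe seek-and-read; `fits_in_long`, `seek64`, and `read_at` are hypothetical helper names, not functions from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <limits>
#if !defined(_WIN32)
#include <sys/types.h>  // off_t for fseeko
#endif

// True when `offset` can be passed to std::fseek, whose offset parameter is `long`.
// On 64-bit Windows `long` is 32 bits, so this returns false for offsets >= 2 GiB.
inline bool fits_in_long(std::int64_t offset) {
    return offset >= std::numeric_limits<long>::min() &&
           offset <= std::numeric_limits<long>::max();
}

// Hypothetical helper: absolute seek with a 64-bit offset on every platform.
inline bool seek64(std::FILE* f, std::int64_t offset) {
    assert(f != nullptr);
#if defined(_WIN32)
    return _fseeki64(f, offset, SEEK_SET) == 0;  // Windows CRT, available in MinGW-w64
#else
    return fseeko(f, static_cast<off_t>(offset), SEEK_SET) == 0;  // POSIX
#endif
}

// Hypothetical helper: read exactly `n` bytes starting at `offset`.
// Returns false on a failed seek or a short read.
inline bool read_at(std::FILE* f, std::int64_t offset, void* dst, std::size_t n) {
    return seek64(f, offset) && std::fread(dst, 1, n, f) == n;
}
```

If the loader does use plain `fseek`, `fits_in_long` would return false on Windows for any tensor offset past 2 GiB in this file, while the same code would work on Linux and macOS, where `long` is 64 bits. Separately, the file would need to be opened with mode "rb" on Windows, since text mode translates line endings and treats the byte 0x1A as end of file.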
Has this been tested on Windows by the maintainers? Is there a known workaround or a specific build configuration required for Windows?