ModelLoader: read failed for tensor lm.blk.0.attn_norm.weight with ASR q4_k.gguf

Running `vibevoice-cli` on Windows always fails with `ModelLoader: read failed for tensor lm.blk.0.attn_norm.weight` immediately after the audio is loaded. The build itself succeeds without errors; only inference fails.
The issue reproduces with both `vibevoice-asr-q4_k.gguf` and `vibevoice-asr-q8_0.gguf` from the official `mudler/vibevoice.cpp-models` HF repository.

**Reproduce**

1. Clone the repository and check out `v0.1`
2. Build on Windows with MSYS2 / MinGW-w64 UCRT64:
   ```
   cmake -B build -G "Ninja" -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release -DGGML_USE_MMAP=OFF
   cmake --build build -j
   ```
3. Copy required DLLs next to the executable:
   ```
   copy C:\msys64\ucrt64\bin\libwinpthread-1.dll build\bin\
   copy C:\msys64\ucrt64\bin\libgcc_s_seh-1.dll build\bin\
   copy C:\msys64\ucrt64\bin\libstdc++-6.dll build\bin\
   copy C:\msys64\ucrt64\bin\libgomp-1.dll build\bin\
   ```
4. Download models:
   ```
   hf download mudler/vibevoice.cpp-models --local-dir models
   ```
5. Run inference:
   ```
   .\build\bin\vibevoice-cli.exe asr --model ..\models\vibevoice-asr-q4_k.gguf --tokenizer ..\models\tokenizer.gguf --audio "audio.wav" --max-new-tokens 8192
   ```

**Expected**

The model loads successfully and transcribes the audio.

**Actual**

```
asr: loaded 142636 samples (5.94s)
[vv I] backend: CPU
[vv E] ModelLoader: read failed for tensor lm.blk.0.attn_norm.weight
asr: failed to load model
```



**Environment**

| | |
|---|---|
| OS | Windows 11 |
| Compiler | GCC 16.1.0 (MSYS2 UCRT64, MinGW-w64) |
| Branches tested | `master` (commit `ad856bd`) and `v0.1` tag (commit `8ffffa1`) |
| ggml version | 0.10.0 (commit `8be60f83`) |
| Model | `vibevoice-asr-q4_k.gguf` (10,392,063,296 bytes) |
| Model source | `hf download mudler/vibevoice.cpp-models` |



**Additional context**

- The error occurs at the very first tensor (`lm.blk.0.attn_norm.weight`), which suggests either a GGUF format mismatch between the uploaded HF models and the current `model_loader.cpp` parsing logic, or a Windows-specific file reading issue.
- Disabling mmap (`-DGGML_USE_MMAP=OFF`) does not resolve the issue.
- Building from both the latest `master` and the `v0.1` tag (which was tagged at the same time the models were uploaded to HF) produces the same error.
- The build completes without errors. The CLI starts correctly and loads the audio file successfully; only model loading fails.
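
To help tell the two hypotheses above apart, here is a minimal diagnostic sketch in Python. It is not part of vibevoice.cpp. It reads the fixed-size GGUF header (magic, version, tensor count, metadata count, assuming GGUF v2 or later) and reports whether the file is larger than a signed 32-bit offset can address. On Windows `long` is 32 bits, so `fseek`/`ftell` cannot address offsets beyond 2 GiB. Whether `model_loader.cpp` actually uses those calls is an assumption I have not verified. The function name `inspect_gguf` is my own.

```python
import os
import struct

# Largest offset representable in a signed 32-bit `long` (the Windows case).
INT32_MAX = 2**31 - 1


def inspect_gguf(path):
    """Read the fixed GGUF header and flag files that need 64-bit offsets.

    Assumes GGUF v2 or later, where both counts are little-endian uint64.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(24)  # 4 (magic) + 4 (version) + 8 + 8 (counts)
    if len(head) < 24:
        raise ValueError("file too short to contain a GGUF header")

    magic, version, tensor_count, kv_count = struct.unpack("<4sIQQ", head)
    if magic != b"GGUF":
        raise ValueError(f"bad magic: {magic!r}")

    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
        "file_size": size,
        # True means any reader using 32-bit offsets cannot reach the
        # end of the file.
        "needs_64bit_offsets": size > INT32_MAX,
    }
```

Calling `inspect_gguf(r"models\vibevoice-asr-q4_k.gguf")` on the downloaded file should return a sane version and tensor count if the header is intact. If it does, and `needs_64bit_offsets` is true (it will be for the 10,392,063,296-byte file listed under Environment), a 32-bit offset problem in the Windows build becomes the more plausible explanation, though this check alone does not prove it.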

Has this been tested on Windows by the maintainers? Is there a known workaround or a specific build configuration required for Windows?
