Quantization model support? #25

@chrisspen

Thanks for writing this. Seems to work well.

However, is there a way to use quantized models with this? I see they aren't included in your MODELS dictionary.

If I hot-patch it like:

from whispercpp import Whisper, MODELS
MODELS['ggml-large-q5_0.bin'] = 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-q5_0.bin'
w = Whisper('large-q5_0')

that makes it download the model, but when it tries to load the model, it fails with:

Downloading ggml-large-q5_0.bin...
whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-large-q5_0.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1008
whisper_model_load: type          = 5
whisper_model_load: mem required  = 3342.00 MB (+   71.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 2950.97 MB
whisper_model_load: unknown tensor '<non-printable binary data>' in model file
whisper_init_no_state: failed to load model
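
One way to check whether the downloaded file or the loader is at fault might be to read the file header directly and decode the value that the log prints as `f16`. The following is a minimal sketch, not the library's API. It assumes the file starts with a little-endian 32-bit magic (`0x67676d6c`) followed by eleven 32-bit integers in the same order as the log lines above, and that newer ggml files store the type as `qnt_version * 1000 + ftype`. The function name `read_ggml_header` and the constants are hypothetical names chosen for illustration.

```python
import struct

# Assumption: ggml model files begin with this little-endian uint32 magic.
GGML_MAGIC = 0x67676D6C
# Assumption: newer files encode the type as qnt_version * 1000 + ftype.
QNT_VERSION_FACTOR = 1000

# Assumption: header fields appear in the same order as the loader's log lines.
HPARAM_NAMES = (
    "n_vocab", "n_audio_ctx", "n_audio_state", "n_audio_head",
    "n_audio_layer", "n_text_ctx", "n_text_state", "n_text_head",
    "n_text_layer", "n_mels", "ftype",
)


def read_ggml_header(data: bytes) -> dict:
    """Parse the magic and the eleven int32 hyperparameters from raw bytes."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic != GGML_MAGIC:
        raise ValueError(f"unexpected magic: {magic:#x}")
    values = struct.unpack_from("<11i", data, 4)
    header = dict(zip(HPARAM_NAMES, values))
    # Split the stored type into quantization version and base file type.
    header["qnt_version"], header["base_ftype"] = divmod(
        header["ftype"], QNT_VERSION_FACTOR
    )
    return header


if __name__ == "__main__":
    import sys

    # Usage: python read_header.py /path/to/model.bin
    with open(sys.argv[1], "rb") as f:
        # 4 bytes of magic plus 11 * 4 bytes of hyperparameters.
        print(read_ggml_header(f.read(48)))
```

Under these assumptions, the `f16 = 1008` from the log would split into quantization version 1 and base type 8. If the header parses cleanly like this, the download is probably intact, and the failure is more likely that the whisper.cpp build bundled with the bindings does not understand this encoding.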
