Quantization model support? #25

Thanks for writing this. Seems to work well.

However, is there a way to use quantized models with this? I see they aren't included in your MODELS dictionary.

If I hot-patch it like:

    from whispercpp import Whisper, MODELS
    MODELS['ggml-large-q5_0.bin'] = 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-q5_0.bin'
    w = Whisper('large-q5_0')

that makes it download the model, but when it tries to load the model, it fails with:

    Downloading ggml-large-q5_0.bin...
    whisper_init_from_file_no_state: loading model from '/home/chris/.ggml-models/ggml-large-q5_0.bin'
    whisper_model_load: loading model
    whisper_model_load: n_vocab       = 51865
    whisper_model_load: n_audio_ctx   = 1500
    whisper_model_load: n_audio_state = 1280
    whisper_model_load: n_audio_head  = 20
    whisper_model_load: n_audio_layer = 32
    whisper_model_load: n_text_ctx    = 448
    whisper_model_load: n_text_state  = 1280
    whisper_model_load: n_text_head   = 20
    whisper_model_load: n_text_layer  = 32
    whisper_model_load: n_mels        = 80
    whisper_model_load: f16           = 1008
    whisper_model_load: type          = 5
    whisper_model_load: mem required  = 3342.00 MB (+   71.00 MB per decoder)
    whisper_model_load: adding 1608 extra tokens
    whisper_model_load: model ctx     = 2950.97 MB
    whisper_model_load: unknown tensor '[unprintable binary data]' in model file
    whisper_init_no_state: failed to load model
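
In case it helps with debugging, here is a minimal sketch for inspecting the header of the downloaded file without going through the bindings. It assumes the file starts with a 4-byte `ggml` magic followed by eleven little-endian 32-bit integers in the same order the log prints them, and that the last field packs a quantization version as `ftype // 1000`, as I understand upstream ggml does. `read_ggml_header` and `split_ftype` are hypothetical helper names, not part of whispercpp. If those assumptions hold, `f16 = 1008` would decode to quantization version 1 and base type 8, which may mean the whisper.cpp build bundled here predates quantized models.

```python
import struct

GGML_MAGIC = 0x67676D6C      # ASCII "ggml"; assumed file magic
QNT_VERSION_FACTOR = 1000    # assumption: mirrors ggml's quantization version factor

# Assumed header layout: same order as the fields printed in the log above.
HEADER_FIELDS = (
    "n_vocab", "n_audio_ctx", "n_audio_state", "n_audio_head",
    "n_audio_layer", "n_text_ctx", "n_text_state", "n_text_head",
    "n_text_layer", "n_mels", "ftype",
)


def read_ggml_header(path):
    """Return the header fields of a ggml whisper model file as a dict."""
    size = 4 * (1 + len(HEADER_FIELDS))
    with open(path, "rb") as f:
        raw = f.read(size)
    if len(raw) < size:
        raise ValueError("file too short to contain a ggml header")
    # One unsigned magic word, then eleven signed 32-bit integers.
    magic, *values = struct.unpack("<I11i", raw)
    if magic != GGML_MAGIC:
        raise ValueError(f"unexpected magic 0x{magic:08x}")
    return dict(zip(HEADER_FIELDS, values))


def split_ftype(ftype):
    """Split a packed ftype into (quantization version, base ftype)."""
    return divmod(ftype, QNT_VERSION_FACTOR)
```

Usage would look like `split_ftype(read_ggml_header('/home/chris/.ggml-models/ggml-large-q5_0.bin')['ftype'])`. A non-zero first element would indicate the file is quantized, so a loader that only understands the older f32/f16 layout would misread the tensor data that follows.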

