aiha-lab

Popular repositories

  1. MX-QLLM (Public)

    LLM Inference with Microscaling Format

    Python, 34 stars, 5 forks

  2. Attention-Head-Pruning (Public)

    Layer-wise Pruning of Transformer Heads for Efficient Language Modeling

    Python, 22 stars, 1 fork

  3. TSLD (Public)

    [NeurIPS 2023] Token-Scaled Logit Distillation for Ternary Weight Generative Language Models

    Python, 18 stars, 1 fork

  4. TernGEMM (Public)

    TernGEMM: General Matrix Multiply Library with Ternary Weights for Fast DNN Inference

    C++, 14 stars, 1 fork

  5. InfiniPot-V (Public)

    [NeurIPS 25] InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding

    Python, 14 stars

  6. AI-thermometer (Public)

    Python, 10 stars, 2 forks

Repositories

Showing 10 of 30 repositories
