Mocchibird/README.md

Hi, I'm Hyun-Min Chang 🦉

MSc EE/IT at ETH Zürich · AI Research Intern at Huawei Research Center Switzerland

I work on low-level ML systems: kernel development, benchmarking, and hardware-aware performance optimization for specialized accelerators.

Focus

  • Custom kernel development for Ascend NPUs
  • Benchmarking and performance analysis for ML workloads
  • Quantization, fused operators, and efficient inference
  • Embedded and resource-constrained ML systems

Selected work

Upstream contributions

Public contributions to huawei-csl/pto-kernels:

  • PR #62 — Fast Hadamard fused with dynamic quantization to int4
  • PR #49 — Fast Hadamard fused with fp16 → int8 dynamic quantization
  • PR #26 — PTO-ISA matmul with L2 cache locality optimization
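For readers unfamiliar with the idea behind the two Hadamard PRs, here is a minimal CPU-side sketch in plain Python of a fast Walsh-Hadamard transform followed by dynamic symmetric int8 quantization. It is only an illustration of the general technique, not the code from the PRs: the function names `fwht` and `hadamard_quantize_int8`, the per-vector scale, and the orthonormal `1/sqrt(n)` scaling are all assumptions made for this example, and the real kernels target Ascend NPUs rather than Python lists.

```python
import math


def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (hypothetical helper).

    Runs in O(n log n) using in-place butterflies on a copy of the input.
    The input length must be a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Each pass combines pairs of elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def hadamard_quantize_int8(values):
    """Rotate with an orthonormal Hadamard transform, then quantize to int8.

    Assumption for this sketch: one scale per vector, computed at run time
    from the largest magnitude ("dynamic" quantization), symmetric range
    [-127, 127]. Returns (quantized_ints, scale).
    """
    n = len(values)
    rotated = [v / math.sqrt(n) for v in fwht(values)]
    max_abs = max(abs(v) for v in rotated)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in rotated]
    return quantized, scale


if __name__ == "__main__":
    q, s = hadamard_quantize_int8([0.5, -1.25, 3.0, 0.75])
    # Dequantize to inspect the approximation of the rotated vector.
    print(q, [x * s for x in q])
```

A fused kernel would perform the rotation and the quantization in a single pass over the data, so that the intermediate rotated values never have to be written back to memory; the sketch keeps the two steps separate purely for readability.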

Highlighted repositories

  • pto-kernels
    Active development fork for Ascend NPU kernel work, experiments, benchmarking, and upstream contribution preparation.

  • pto-kernels-plots
    Benchmark plots and performance analysis for kernel development and PR evaluation.

  • health-metrics
    Self-hostable Streamlit app for tracking personal health metrics with local SQLite storage and authenticated editing.

  • MLonMCU
    Embedded ML / microcontroller-related coursework and project work.

Interests

ML systems · performance engineering · kernel optimization · compilers · hardware-aware ML

Contact

Pinned

  1. pto-kernels (public)

    Forked from huawei-csl/pto-kernels

    Custom kernel collections using https://gitcode.com/cann/pto-isa

    C++

  2. pto-kernels-plots (public)

    Benchmark plots and performance analysis for upstream pto-kernels development

  3. health-metrics (public)

    Self-hostable Streamlit app for tracking personal health metrics with local SQLite storage and authenticated editing

    Python

  4. ETH-PBL/MLonMCU (public)

    [ETHZ Course] Official exercise material of Machine Learning on MCUs

    C · 12 stars · 13 forks