I'm currently working on performance optimizations for Mixture-of-Experts model inference on AWS Trainium 2/3. I'm particularly interested in enabling agents to write high-performance kernels and raising the attainable performance ceiling of accelerators.
I did my undergrad in EECS at Berkeley and I'm now doing a Master's in ECE at Cornell.
connect: linkedin
