ML Compiler Engineer

HuggingFace · Remote (Global) · $180k - $320k
Full-time · Senior

About this role

Optimize model inference for HuggingFace's inference API. Work on model compilation, quantization, and hardware-specific optimization. Make models run faster and cheaper for millions of users.

Requirements

Experience with ML compilers (TVM, XLA, TensorRT) or model optimization. Strong C++ and Python skills. Understanding of hardware architectures (GPU, TPU).