Machine Learning Engineer (Toronto)

Red Hat
Location 📍 Toronto, Canada
Posted 📅 June 06, 2026
Work Type ⏰ Full-time

Position Overview

Job Summary

At Red Hat, we believe the future of AI is open, and we are on a mission to bring the power of open‑source LLMs and vLLM to every enterprise. The Red Hat AI Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI deployments. As leading developers and maintainers of the vLLM project, and inventors of state‑of‑the‑art techniques for model quantisation and sparsification, our team provides a secure platform for enterprises to build, optimise and scale LLM deployments.

What you will do

  • Contribute to the design, development and testing of various inference optimisation algorithms in the LLM‑compressor, Speculators and vLLM projects.
  • Design, implement and optimise model compression pipelines using techniques such as quantisation and pruning.
  • Develop and maintain speculative decoding frameworks to improve inference speed while maintaining model accuracy.
  • Collaborate closely with research sc...

Job Details

Employment Type: Full-time
Category: Engineering
Work Arrangement: On-site
Location: Toronto, Canada