parttimejobs.work

⏰ Full-time

Software Development Manager, LLM Inference Model Enablement, Neuron SDK

Amazon
Location 📍 Cupertino, United States
Posted 📅 June 15, 2026
Work Type ⏰ Full-time

Position Overview

Description
AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon Elastic Compute Cloud (EC2) to new products that continue to set AWS’s services and features apart in the industry.

We develop AWS Neuron, the complete software stack for Trainium, Amazon's custom cloud-scale machine learning accelerators. Come optimize LLMs such as Llama and GPT-OSS to run really fast on Trainium.

As the SDM for the LLM Inference Model Enablement team, you will lead a team of expert AI/ML engineers to onboard and optimize state-of-the-art open-source and customer LLMs, both dense and MoE, for inference with Neuron on Trainium and Inferentia accelerators. You will also drive improvements in the speed and experience of model enablement, while advancing inference usability and quality through inference features, infrastructure optimization, tools, and automation.

The ideal candidate will have a strong backgroun...

Job Details

Employment Type ⏰ Full-time
Category 📊 other-general
Work Arrangement 🏠 On-site
Location 📍 Cupertino, United States