⏰ full-time

ML, AWS Neuron, Model Inference

🏢 Confidential

Location
📍 Cupertino, United States

Posted
📅 April 12, 2026

Work Type
⏰ full-time

Position Overview

The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit (SDK) used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The SDK includes an ML compiler, runtime, and application framework that integrate with popular ML frameworks such as PyTorch and JAX to deliver high-performance ML inference and training.

The Inference Enablement and Acceleration team is at the forefront of running a wide range of models and supporting novel architectures while maximizing their performance on AWS's custom ML accelerators. Working across the stack, from PyTorch down to the hardware-software boundary, our engineers build systematic infrastructure, innovate new me...


Job Details

⏰

Employment Type

full-time

📊