
parttimejobs.work

Flexible Work, Better Balance


Software Engineer II - AI/ML, AWS Neuron, LLM Inference, Model Inference

Confidential
Location 📍 Cupertino, United States
Posted 📅 April 12, 2026
Work Type ⏰ full-time

Position Overview

The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon’s custom machine learning accelerators, Inferentia and Trainium.

The Neuron SDK includes an ML compiler, runtime, and application framework that integrate seamlessly with popular ML frameworks like PyTorch and JAX, enabling unparalleled ML inference and training performance.

The Inference Enablement and Acceleration team is at the forefront of running a wide range of models and supporting novel architectures while maximizing their performance on AWS's custom ML accelerators. Working across the stack, from PyTorch down to the hardware-software boundary, our engineers build systematic infrastructure, innovate new me...


Job Details

Employment Type ⏰ full-time
Category 📊 Computer Occupations
Work Arrangement 🏠 On-site
Location 📍 Cupertino, United States