Position Overview
Requirements:
- 4 years of experience as a fullstack or backend engineer
- Strong proficiency in Python and JavaScript/TypeScript
- Experience with FastAPI / Django / Node.js and React / Next.js
- Solid understanding of distributed systems and async architectures
- Hands‑on experience deploying LLMs such as GPT‑4/4.1, Claude, LLaMA, Mistral, or Mixtral
- Experience serving models using vLLM, Triton, TGI, or similar frameworks
- Strong understanding of transformer models and inference trade‑offs
- Experience with embeddings, vector search, and RAG architectures
- Experience with AWS, GCP, or Azure (GPU workloads preferred)
- Strong Docker and Kubernetes experience
- Familiarity with CI/CD pipelines for ML systems
- Experience with observability tools (Prometheus, Grafana, OpenTelemetry)
- Experience with multimodal AI (audio, video, image models)
- Experience optimizi...