Position Overview
Senior Engineer – Enterprise AI Services
Within Platform Engineering & Enterprise AI Services, you will design, build, and operate observability capabilities for AI and LLM workloads that power TR's AI‑driven products. You will own the end‑to‑end observability stack for AI, enabling product teams, data scientists, and AI engineers to understand model behavior in production, detect issues early, and make data‑driven improvements to model quality, latency, and cost.
Responsibilities
- Serve as the Kubernetes expert for AI services, defining and operating deployment standards for scalability, resilience, security, and performance.
- Own the AI observability platform, implementing tools such as Braintrust and Langfuse to support tracing, evaluation, analytics, and monitoring of LLM/ML workloads.
- Define and standardize telemetry across AI products, including traces, metrics, logs, evaluations, and feedback, while ensuring governance, privacy...