Position Overview
Introduction
The Platform Engineering SME – Kubernetes & GPU Infrastructure serves as the authoritative subject matter expert for Kubernetes cluster configurations and GPU-enabled platform design. This role validates agent and platform designs directly against real, in-use cluster configurations, ensuring that they align with actual operational, storage, and GPU topology constraints.
Required Skills & Qualifications
- 7 years of hands-on expertise with Kubernetes cluster operations in enterprise environments.
- Strong understanding of DaemonSets, node affinity, taints/tolerations, and scheduling mechanics.
- Experience validating designs against live cluster configurations.
- Strong knowledge of Prometheus architectures, including PVC-backed persistent storage models.
- Hands-on experience with GPU-enabled Kubernetes clusters.
- Prior work experience at client or...