Position Overview
Site Reliability Engineer (80-100%)
As part of the modernization of our IT platform, we are looking for a platform‑oriented Site Reliability Engineer (SRE) to help us mature our use of containers and the surrounding ecosystem.
Responsibilities
- Ensure the reliability, scalability, and performance of the OpenShift platform and its ecosystem (Kong, Kafka, Vault).
- Manage the platform lifecycle: provisioning, updates, patching, and capacity planning.
- Implement and maintain observability solutions (metrics, logs, and traces) using Prometheus, Grafana, and associated tools.
- Establish governance for observability and operability of applications hosted on the platform.
- Handle incident response and perform root‑cause analysis for platform‑related issues.
- Collaborate with security teams to apply best practices and ensure compliance in containerized environments.
- Contribute to the continuous ...