Position Overview
Role Summary
Ensure 24/7 monitoring, safe deployments, backup management, and incident response while improving performance and reliability to meet monthly availability targets.
Key Responsibilities
Cloud Operations (GCC AWS)
- Manage AWS infrastructure operations, configuration, and service reliability.
Monitoring, Alerting & Performance
- Implement and operate 24/7 monitoring, alerting, and performance monitoring; improve observability.
CI/CD & Deployment
- Own pipeline management and deployment processes; enable safe releases and rollback readiness.
Backup, DR & Ops Readiness
- Own backup management, operational documentation updates, and readiness practices.
Incident Response
- Participate in incident response; ensure SLA response times are met and status updates are provided.
Security Support