Position Overview
Responsibilities
- Lead end-to-end operations and administration of Azure cloud infrastructure in a high-availability environment.
- Manage, mentor, and develop a team of L1/L2/L3 engineers, including task allocation, performance management, and career development.
- Act as the primary escalation point for critical (P1/P2) incidents, driving root cause analysis and longβterm remediation.
- Ensure effective team response to incidents, service requests, and alerts within defined SLAs.
- Architect, implement, and optimize Azure solutions for performance, scalability, and cost efficiency.
- Establish and enforce operational standards, governance policies, and best practices across the team.
- Drive automation initiatives using Infrastructure as Code (IaC) tools such as Terraform and Azure DevOps.
- Oversee monitoring, alerting, and observability frameworks to ensure proactive issue detection.
- Lead patch management,...