- You are in charge of proactively building and implementing services that help IT and support teams work more effectively. This can range from adjustments to monitoring and alerting to code changes in production.
- A site reliability engineer is expected to resolve escalated support cases.
- The role also includes building in-house tools from scratch to address weaknesses in incident management or software delivery.
- Based on post-incident reviews, site reliability engineers optimize the Software Development Life Cycle (SDLC) to improve service reliability.
- Troubleshoot, debug, and upgrade existing software
- Gather and evaluate user feedback
- Recommend and execute improvements
- Create technical documentation for reference and reporting
- 3+ years of networking experience
- Proven Kubernetes and DevOps experience
- Must have completed at least two end-to-end projects in a technical Site Reliability Engineer, DevOps Engineer, or Deployment Engineer role
- Understanding of support processes, KPIs, and SLA management
- Ability to handle escalations and drive customer communications on a daily basis
- Ability to communicate courteously and effectively with customers, third-party vendors, and partners
- Ability to manage multiple projects at the same time
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field
- Cloud certification (e.g., AWS or Microsoft Azure)