Job Description
- Own the cloud operations of the analytics platform and drive a high-quality customer experience by ensuring SLA compliance and consistent performance.
- Manage DevOps teams.
- Mentor L2 teams in troubleshooting production issues.
- Enable the Service Desk to perform 24x7 L1 monitoring.
- Track incidents, perform root cause analysis (RCA) for each one, and focus on prevention.
- Ensure SOPs are defined and followed consistently.
- Develop CI/CD pipelines and change management processes.
- Handle ongoing change management as the platform scales.
- Communicate effectively with customer-facing teams to address their concerns about infrastructure availability, performance, and security.
- Collaborate with the Engineering team:
  - Build tools and systems for observability
  - Plan releases and patch deployments
  - Scale applications and infrastructure to ensure the best customer experience
- Infrastructure Management:
  - Application performance monitoring and optimization
  - Application and cloud security management
  - Automation: IaC, disaster recovery, BCP, etc.
- Cost Optimization:
  - Monitor the costs of all resources and trace them to their respective accounts
  - Identify cost optimization opportunities and keep costs optimal without compromising performance
- Own the security of data, infrastructure, and applications. Ensure best practices are followed for HIPAA, ISO 27001, and SOC 2 compliance.
- Systematically develop documentation for each data solution and keep it up to date so that it reflects current business rules and definitions.
- Build strong partnerships with colleagues at all levels.