02 Nov
Cloud Site Reliability Engineer
Job summary:
Description: Cloud Site Reliability Engineer (SRE) for Internal Cloud. Candidates must have 4+ years of experience working with Unix/Linux server platforms. Must be extremely proficient in Shell scripting/Java/Python/Ansible development. Must have experience with the whole lifecycle of cloud services, from inception and design through deployment, operation and support.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Troubleshoot issues across the entire stack: hardware, software, application and network.
- Perform deep dives into both systemic and latent reliability issues; partner with engineering and operations teams across the organization to produce and roll out fixes.
- Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization.
- Identify and drive opportunities to improve automation for the cloud services.
- Scope and create automation for deployment, management and visibility of our services.
- Experience level: Experienced
- Minimum 5 years of experience
- Education: Bachelor's degree