Sr. Site Reliability Engineer
job summary:
- Have 5+ years of experience as an SRE
- Know your way around Linux, including configuration, package management, optimization and troubleshooting
- Have skills in shell scripting, Python, SQL, Java, Node.js, and AWS
- Have skills in networking, storage management, and cloud architecture
- Know your way around configuration management systems: Terraform and Ansible
- Think about systems: behaviors, edge cases, failure modes, and implementation methodologies
- Collaborate and communicate effectively in an asynchronous environment
- Want to document all the things (so you can share what you learn with your future self and the team)
- Have an aversion to living with broken things and a desire to just fix them
- Enjoy delivering and iterating quickly
- Have experience with Docker, Ansible, PostgreSQL, nginx, and Caucho Resin (or Tomcat)
- Manage, configure and troubleshoot operating system issues

Benefits:
- Health, Dental and Vision Insurance
- Paid Holidays
- Paid Time Off
- Retirement Plan

location: Fresno, California
job type: Permanent
salary: $125,000 - $135,000 per year
work hours: 8am to 4pm
education: Bachelor's

responsibilities:
- Coding infrastructure automation with Ansible, Terraform, and Aplos' CI/CD platform
- Developing a fully automated multi-environment observability stack based on the existing SaaS system, to improve monitoring and identify new metrics
- Helping release managers deploy new versions of our code
- Developing a relationship with engineering, product, support and quality groups to help define their SLAs, and managing the infrastructure to support their success
- Documenting and improving engineering best practices, runbooks, and general documentation for availability, reliability and scalability, as well as disaster recovery

qualifications:
- Experience level: Experienced
- Minimum 4 years of experience
- Education: Bachelor's