16 Apr
SRE Software Engineer
Mountain View, California 94035, USA

Vacancy expired!

Position is bonus eligible.

Prestigious Enterprise Company is currently seeking a Site Reliability Software Engineer to implement the tools and processes necessary to achieve the required SLOs for our platform.

Responsibilities:
  • Define and implement CI/CD pipelines.
  • Automate delivery of platform services using infrastructure as code. Build self-service playbooks for the platform that can be consumed by globally distributed teams.
  • Define and implement the incident response management process and deploy the necessary tools.
  • Resolve support and escalation issues.
  • Conduct post-incident reviews.
  • Collaborate with application and business stakeholders to ensure a high-quality product is developed and deployed to production. Work diligently with other engineering teams to ratify the release processes necessary to meet business goals.
  • Drive the continuous improvement process.

Qualifications:
  • Expert knowledge of one of the major public cloud platforms (Azure, AWS, Google Cloud Platform)
  • Hands-on programming experience in Python or other object-oriented programming languages.
  • Expert knowledge of infrastructure and application monitoring tools: Prometheus, Grafana, Datadog, etc.
  • Experience implementing IaC concepts using Terraform, Chef, Puppet.
  • Experience with Elasticsearch, Kibana
  • Experience administering Databases
  • Expert in Linux administration.
  • Expert knowledge of Docker, Helm.
  • Experience implementing CI/CD for cloud native applications.
  • Experience deploying applications that use a service mesh.
  • Experience administering Kubernetes clusters.
  • Experience defining and implementing incident response management processes.
  • Bachelor’s degree
  • 8+ years’ experience in software engineering

Preferred Skills:
  • Master’s degree
  • Understanding of GitOps principles.
  • Experience implementing secure and compliant Kubernetes platforms.
  • Experience deploying and managing stateful distributed services in Kubernetes.
  • Experience with security scanning tools.
  • Experience with intrusion detection systems.
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Working knowledge of Databricks, Team Foundation Server, TeamCity, Octopus Deploy, and Datadog
