20 Feb
Site Reliability Engineer
Ann Arbor, MI, USA

Vacancy expired!

QUALIFICATIONS

  • Bachelor's degree in computer science or equivalent experience
  • 5+ years of public-facing production application support experience in a high-uptime, high-transaction-volume environment
  • 5+ years of UNIX administration experience, including diagnosis of performance issues, package management, load estimation, kernel tuning, networking configuration, etc.
  • 4+ years of software engineering experience (Java, C, C++, Python, Go)
  • MUST HAVE strong scripting skills
  • MUST HAVE worked with automation tools such as Terraform and Puppet
  • Understanding of, or experience with, CI/CD environments and tools such as Jenkins
  • Understanding of networking principles, esp. TCP/IP
  • Excellent troubleshooting and analytical skills
  • Ability to work independently on large, complex projects with minimal guidance
  • MUST HAVE experience deploying and operating container technologies with Kubernetes in production
  • MUST HAVE experience creating and managing Kubernetes cluster deployment, configuration and operations using Helm/YAML
  • Experience with Project Contour as a Kubernetes ingress controller will be considered a big plus
  • MUST HAVE experience using Infrastructure-as-code (IaC) to automate various aspects of site operations
  • Experience deploying and running production workloads in a cloud environment will be a plus
  • Experience using Splunk to identify and troubleshoot issues is required (Splunk query experience is critical)

GENERAL RESPONSIBILITIES

(50%) eCommerce Administration
  • Engineer extensive scripting and automation to install and operate applications with minimal manual intervention
  • Evaluate, test, deploy and maintain upgrades to both custom-developed and third-party software
  • Maintain SDLC systems such as test environments, source control and automated build/test/deploy systems
  • Provide ongoing developer support, frequently while embedded in development teams to facilitate collaboration
  • Create & maintain application architecture and troubleshooting documentation
  • Manage configuration and operations of Kubernetes clusters using Infrastructure-as-code approaches
(30%) Web Production Support
  • Provide 24x7 production support as part of a team rotation, resolving or escalating issues as appropriate
  • Maintain production services to highly demanding SLAs
  • Take ownership of production issues, working closely with infrastructure and development teams on issue resolution
  • Support releases on a regularly scheduled basis, as well as emergency releases as needed
  • Deploy application and data changes to all environments as needed
(20%) Planning, Design and Implementation
  • Design and implement new environments, services and application architecture modifications
  • Research, evaluate and implement operational improvements, application packages and architectural modifications
  • Participate in change control, release planning, and other operational planning
  • Remain current on industry-leading solutions in both private and public cloud hosting (VMware, Xen, KVM, Amazon Web Services (AWS), Azure, Google App Engine, Kubernetes, etc.)
  • Remain current on modern open-source persistence technologies (Hazelcast, BDB, Project Voldemort, Memcached, etc.)
  • Remain current on modern containerization technologies (Docker, vSphere Integrated Containers, Kubernetes)
