20 Apr
Senior Staff HPC Infrastructure Engineer
Redwood City, California 94061, USA

Vacancy expired!

Job Description

Guardant’s HPC team builds and operates the computational technology backbone of the company.

This includes scalable data storage that holds PBs of genomics data, high-performance compute clusters running a custom bioinformatics pipeline in production and R&D environments, and the software infrastructure that hosts an ecosystem of services for internal data processing and external data integration. To facilitate Guardant Health’s fast growth in the next few years, the HPC team is looking for a strong technical engineer who can help maintain and grow the HPC infrastructure during its aggressive expansion, while working with corporate IT, SQA and DevOps/SRE teams.

This role can be performed remotely part of the time, but at a minimum it requires a very hands-on, on-premises presence when on rotation.

In this role, you will primarily:

  • Help manage multiple HPC clusters and cluster file systems.
  • Help research, develop and implement the next-generation HPC solution.
  • Troubleshoot the production system stack down to the source code level, e.g. shell scripts, Python and others.
  • Maintain, monitor, and support the infrastructure environment and/or facilities.
  • Use and maintain enhanced production monitoring and additional capabilities.
  • Support improvements for increased system reliability and performance.
  • Support, in a senior role, multiple systems or applications of medium to high complexity (complexity defined by size, technology used, and system feeds and interfaces) with multiple concurrent users, ensuring control, integrity, and accessibility.
  • Work with offsite consultants to maintain the infrastructure.
  • Work with vendors to troubleshoot, upgrade and repair systems as needed.
  • Participate in a 24/7 on-call rotation.

About You:

You enjoy an agile, very fast-paced and highly technical environment. You are a self-driven, accomplished technologist who continually strives to improve your skills, your value to the company and the computational infrastructure. You are dedicated to engineering excellence yet pragmatic and flexible. You have the ability to maintain the day-to-day support SLA while running various key projects that move the business forward.

  • 6+ years of Linux/Unix administration, with knowledge of Unix network protocols, TCP/IP network fundamentals, core infrastructure technologies and virtualization
  • 6+ years of experience with large-scale data storage and compute cluster (HPC) infrastructure
  • 4+ years working in and with on-premises and cloud-based (AWS, Google, IBM and Azure) data centers
  • 3+ years of building software release and ops processes and automation toolsets
  • 5+ years providing documentation of system administration

The Following Skill Sets Are Preferred:

  • Experience administering IBM’s General Parallel File System
  • Experience administering Grid Engine scheduler
  • Experience with using Bright Cluster Manager
  • Experience with cloud bursting technologies
  • Experience with wide area file systems
  • Experience with Docker and container technologies
  • Experience with Kubernetes, preferably with a current Certified Kubernetes Administrator (CKA) certification
  • Experience operating infrastructure compliant with HIPAA and SOX standards

Education

B.S. in Computer Science or related field

Qualifications

  • HPC
  • Kubernetes
  • IBM’s General Parallel File System

