04 Nov
Senior Data Engineer
California, San Francisco, 98101 San Francisco USA

Our client is in search of highly collaborative Data Engineers who want to solve problems that drive business value. You should be someone who has a strong sense of ownership and enjoys hands-on technical work. You will work in cross-functional Agile project teams alongside data scientists, machine learning engineers, other data engineers, project managers, and industry experts. You will work hand-in-hand with clients, from data owners, users, and fellow engineers to C-level executives.

As a Senior Data Engineer, you will:

  • Own the technical platform for advanced analytics engagements, spanning data science and data engineering work
  • Design and build data pipelines for machine learning that are robust, modular, scalable, deployable, reproducible, and versioned
  • Create and manage data environments and ensure information security standards are maintained at all times
  • Own and be accountable for the delivery of technical work streams while also mentoring and guiding more junior colleagues
  • Understand clients' data landscapes and assess data quality
  • Map data fields to hypotheses and curate, wrangle and prepare data for use in advanced analytics models
  • Have the opportunity to contribute to R&D projects and internal asset development

Qualifications:

  • Degree in computer science, engineering, mathematics, or equivalent work experience
  • Ability to write clean, maintainable, scalable, and robust code in an object-oriented language, e.g., Python, Scala, Java, in a professional setting
  • Proven experience building data pipelines in production for advanced analytics use cases
  • Experience working across structured, semi-structured, and unstructured data
  • Practical knowledge of software engineering concepts and best practices, including DevOps, DataOps, and MLOps, would be considered a plus
  • Familiarity with distributed computing frameworks (e.g. Spark, Dask), cloud platforms (e.g. AWS, Azure, Google Cloud Platform), containerization, and analytics libraries (e.g. pandas, NumPy)
Using the right tech for the right task is vital. Our client often leverages the following technologies: Python, PySpark, the PyData stack, SQL, Airflow, and Databricks, as well as an in-house open-source data pipelining framework; container technologies such as Docker and Kubernetes; and the cloud platforms AWS, Google Cloud Platform, and Azure, in addition to the latest tools to benefit you in your career.
