25 Jan
Data Engineer
Princeton, NJ 08540, USA

Vacancy expired!

JOB RESPONSIBILITIES:
  • Develop complex SQL scripts for data analysis and extraction; develop and maintain programs as required for the ETL process.
  • Design and implement distributed data processing pipelines using Spark, Hive, Sqoop, Python, and other tools and languages prevalent in the Hadoop ecosystem. Demonstrate ability to design and implement end-to-end solutions.
  • Build utilities, user-defined functions, and frameworks to better enable data flow patterns.
  • Research, evaluate, and adopt new technologies, tools, and frameworks centered on Hadoop and the broader Big Data ecosystem.
  • Define and build data acquisition and consumption strategies.
  • Build and incorporate automated unit tests, and participate in integration testing efforts.
  • Work with teams to resolve operational and performance issues.
  • Work with architecture/engineering leads and other teams to ensure quality solutions are implemented, and engineering best practices are defined and adhered to.
  • Assist in the development and training of the IT department.
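The responsibilities above call for building data-flow utilities and incorporating automated unit tests. As a purely illustrative sketch of that kind of work (the function, field names, and sample record are hypothetical, not from the posting):

```python
# Illustrative only: a small transform utility paired with an automated
# unit test, in the spirit of the responsibilities above.

def normalize_record(record):
    """Trim whitespace, lowercase the email, and coerce amount to float."""
    return {
        "name": record["name"].strip(),
        "email": record["email"].strip().lower(),
        "amount": float(record["amount"]),
    }

def test_normalize_record():
    raw = {"name": "  Ada Lovelace ", "email": "ADA@Example.COM", "amount": "42.50"}
    clean = normalize_record(raw)
    assert clean == {"name": "Ada Lovelace",
                     "email": "ada@example.com",
                     "amount": 42.5}

test_normalize_record()
```

Small pure functions like this slot naturally into a CI pipeline's unit-test stage before integration testing begins.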

REQUIRED EXPERIENCE:
  • Bachelor’s Degree in Computer Science/Information Technology, Computer/Electrical Engineering, or related discipline.
  • Hands-on experience with big data tools such as Hadoop, Spark, Kafka, Hive, and Sqoop.
  • Expertise in at least one programming language such as Python, Java, or Scala; Python preferred.
  • Experience with shell scripting.
  • Expertise in SQL, including nested queries, stored procedures, and data modeling.
  • MySQL/Oracle and/or NoSQL experience, with the ability to develop, tune, and debug complex SQL/NoSQL applications.
  • Experience with data stores such as HBase, Cassandra, MongoDB, Neo4j, or GraphQL.
  • Hands-on experience with data pipelines and the ELK stack (Elasticsearch, Logstash, Kibana).
  • Understanding of data pipeline deployment, either in the cloud or on-premises.
  • Good understanding of data streaming tools like Kafka or RabbitMQ.
  • Strong written and verbal communication skills.
  • Ability to work both independently and as part of a team.
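The SQL requirement above specifically mentions nested queries. As a minimal illustrative sketch (the `orders` table, its columns, and the sample data are hypothetical), run here against an in-memory SQLite database:

```python
# Illustrative only: a nested (sub)query of the kind the SQL requirement
# describes, executed against an in-memory SQLite database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO orders (customer, total) VALUES
        ('alice', 120.0), ('bob', 30.0), ('alice', 80.0), ('carol', 250.0);
""")

# Nested query: customers whose total spend exceeds the average order total.
rows = conn.execute("""
    SELECT customer, SUM(total) AS spend
    FROM orders
    GROUP BY customer
    HAVING SUM(total) > (SELECT AVG(total) FROM orders)
    ORDER BY spend DESC
""").fetchall()

print(rows)  # -> [('carol', 250.0), ('alice', 200.0)]
conn.close()
```

The same subquery pattern carries over to MySQL, Oracle, and Spark SQL with little change, which is why it shows up in interviews for roles like this one.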

PREFERRED EXPERIENCE:
  • Solid experience with Spark, including the different Spark APIs, Spark SQL, and Spark Streaming;

    OR
  • Hands-on experience with the Spark Python, Java, or Scala APIs, and with configuring Spark jobs;

    OR
  • Solid experience with Hive, including Hue, joins, partitions, and buckets.
  • Familiarity with cloud technologies such as AWS S3, Redshift, EMR, and RDS, or similar, preferred.
  • Hands-on experience creating dashboards with Tableau, Spark, or Power BI preferred.
