25 Jan
Data Engineer
Vacancy expired!
- Develop complex SQL scripts for data analysis and extraction; develop and maintain programs as required for the ETL process.
- Design and implement distributed data processing pipelines using Spark, Hive, Sqoop, Python, and other tools and languages prevalent in the Hadoop ecosystem. Demonstrate ability to design and implement end-to-end solutions.
- Build utilities, user-defined functions, and frameworks that enable common data-flow patterns.
- Research, evaluate, and adopt new technologies, tools, and frameworks centered on Hadoop and the broader Big Data ecosystem.
- Define and build data acquisition and consumption strategies.
- Build and incorporate automated unit tests, and participate in integration testing efforts.
- Work with teams to resolve operational and performance issues.
- Work with architecture/engineering leads and other teams to ensure quality solutions are implemented, and engineering best practices are defined and adhered to.
- Assist in the development and training of the IT department.
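The first responsibility above, SQL-driven extraction feeding an ETL step, can be sketched in miniature. This is a hypothetical illustration only: sqlite3 stands in for the actual source database, and the `orders`/`daily_totals` table and column names are invented.

```python
import sqlite3

def run_etl(conn: sqlite3.Connection) -> list:
    """Extract daily order totals, convert cents to dollars, load a summary table."""
    cur = conn.cursor()
    # Extract + transform in SQL: aggregate order amounts per day.
    cur.execute("""
        CREATE TABLE daily_totals AS
        SELECT order_date, SUM(amount_cents) / 100.0 AS total_dollars
        FROM orders
        GROUP BY order_date
        ORDER BY order_date
    """)
    # "Load" here is simply reading back the summary table.
    return cur.execute("SELECT * FROM daily_totals").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2024-01-01", 500), ("2024-01-01", 1500), ("2024-01-02", 250)],
)
print(run_etl(conn))  # [('2024-01-01', 20.0), ('2024-01-02', 2.5)]
```

In a production pipeline the same pattern would typically run against MySQL/Oracle or a Hadoop-backed store, orchestrated by the ETL framework rather than called directly.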
- Bachelor’s Degree in Computer Science/Information Technology, Computer/Electrical Engineering, or related discipline.
- Hands-on experience with big data tools such as Hadoop, Spark, Kafka, Hive, and Sqoop.
- Expert in at least one programming language such as Python, Java, or Scala; Python preferred.
- Experience with shell scripting.
- Expert-level SQL, including nested queries, stored procedures, and data modeling.
- MySQL/Oracle and/or NoSQL experience, with the ability to develop, tune, and debug complex SQL/NoSQL applications.
- Experience with a range of data stores such as HBase, Cassandra, MongoDB, or Neo4j, and with query layers such as GraphQL.
- Hands-on experience with data pipelines and the ELK stack (Elasticsearch, Logstash, Kibana).
- Understanding of data pipeline deployment, either in the cloud or on-premises.
- Good understanding of data streaming tools like Kafka or RabbitMQ.
- Strong written and verbal communication skills.
- Ability to work both independently and as part of a team.
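The "nested queries" skill listed above is classically shown with a correlated subquery: finding employees paid above their own department's average. A minimal sketch, again with sqlite3 standing in for the database and invented table names and data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("ann", "eng", 120), ("bob", "eng", 90),
    ("eve", "ops", 80), ("dan", "ops", 60),
])
# Correlated subquery: the inner SELECT re-runs per outer row,
# comparing each salary to that employee's own department average.
above_avg = conn.execute("""
    SELECT name FROM employees AS e
    WHERE salary > (
        SELECT AVG(salary) FROM employees WHERE dept = e.dept
    )
    ORDER BY name
""").fetchall()
print(above_avg)  # [('ann',), ('eve',)]
```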
- Solid experience with Spark, including its core APIs, Spark SQL, and Spark Streaming; OR
- Hands-on experience with the Spark Python, Java, or Scala APIs, including configuring Spark jobs; OR
- Solid experience with Hive, including Hue, joins, partitions, and buckets.
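The Spark APIs named in the alternatives above center on chained map/filter/reduce transformations. As a rough local sketch of that style, here is a word count in plain Python (no Spark or cluster needed; the data is invented):

```python
from collections import Counter
from functools import reduce

lines = ["spark makes pipelines", "pipelines move data", "data data data"]
# flatMap: split each line into words.
words = (w for line in lines for w in line.split())
# map: turn each word into a single-entry count.
pairs = map(lambda w: Counter({w: 1}), words)
# reduceByKey-like step: merge all the partial counts.
counts = reduce(lambda a, b: a + b, pairs, Counter())
print(counts.most_common(1))  # [('data', 4)]
```

In actual Spark the same shape appears as `rdd.flatMap(...).map(...).reduceByKey(...)`, with the work distributed across executors instead of running in one process.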
- Familiarity with cloud technologies such as AWS S3, Redshift, EMR, RDS, or similar preferred.
- Hands-on experience creating dashboards using Tableau, Spark, or Power BI preferred.