24 Sep
Lead Big Data Java Developer
Irving, Texas 75014, USA

Vacancy expired!

Job Description:

  • 10+ years of overall experience in IT
  • Must have 6+ years of Java development experience
  • Must have project lead experience
  • 4+ years of ETL pipeline development experience with Apache Flink using Java
  • 4+ years of experience working with event streaming platforms such as Kafka or Pulsar
  • Experience in cloud environments, especially Google Cloud Platform
  • Scripting experience in Bash and Python
  • At least 2 years of hands-on experience with Hadoop applications on Google Cloud Platform (GCP)
  • Perform data migration and conversion activities on different applications and platforms.
  • Design, develop, and test data ingestion pipelines; perform end-to-end automation of the ETL process for the various datasets being ingested into the big data platform.
  • Perform data profiling and discovery, assess the suitability and coverage of data, and identify the data types, formats, and data quality issues that exist within a given data source.
  • Strong experience building data ingestion pipelines (simulating Extract, Transform, Load workloads) in data warehouse and database architectures
  • Hands-on development experience with open-source big data components such as Hadoop, Scala, Hive, Pig, Spark, HBase, HDFS, YARN, Sqoop, NiFi, Storm, Impala, HAWQ, Oozie, Mahout, Flume, Kafka, ZooKeeper, etc., preferably with Cloudera/Hortonworks
  • Develop transformation logic, interfaces and reports as needed to meet project requirements.
  • Participate in discussions on technical architecture, data modeling, and ETL standards; collaborate with Product Managers, Architects, and Senior Developers to establish the physical application framework (e.g., libraries, modules, execution environments)
  • Develop integrated, automated test suites to validate end-to-end data pipeline flow, data transformation rules, and data integrity.
  • Develop tools to measure data quality and visualize anomaly patterns in source and processed data.
  • Integrate automated processes into continuous integration workflows.
  • Contribute to data quality assurance standards and procedures.
