29 Jul
Big Data Engineer - updated salary
New York City, NY 10165, USA

Vacancy expired!

The Big Data Engineer is responsible for data architecture, data management, data governance, and big data solutions. The healthcare data engineer/analyst will be a key technical member of this team, responsible for big data engineering, data wrangling, data analysis, and user support, focused primarily on the Cloudera Hadoop platform and in future extending to the cloud. The data engineer must have strong hands-on technical skills, including conventional ETL and SQL, as well as programming in data science languages such as Python and R using big data techniques.

This is a direct-hire role and is not open to third-party candidates. Candidates must be able to commute to the NYC office at least 3 days a week starting in Aug/Sept 2021. Send resumes (Word format) with contact details.

Job Responsibilities:
  • Requirements analysis, planning and forecasting for Hadoop data engineering/ingestion projects
  • Operational support for data ingestion and engineering, including job monitoring; issue resolution; user support
  • Coordinate with infrastructure and offsite/offshore teams
  • Design and implement optimized Hadoop and big data solutions for data ingestion, data processing, data wrangling, and data delivery
  • Share subject matter expertise on Hadoop-related concepts and use
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for efficient extraction, transformation, and loading of data from a wide variety of data sources
  • Assist users with technical issues related to their use of Hadoop.
  • Build data tools that assist analytics and data science team members in building and optimizing our product into an innovative industry leader
  • Create high-quality technical and user documentation

Minimum Qualifications
  • Bachelor’s Degree or higher in Computer Science or a related field
  • Minimum 4 years of total IT experience, including 2+ years with similar responsibilities and at least 1 year of Hadoop experience
  • Strong ETL and data engineering/ingestion experience with ingesting diverse data from various sources including relational databases and files (text, csv)
  • Strong knowledge of SQL and databases, particularly Oracle and Microsoft SQL Server
  • Good knowledge of Unix/Linux including scripting
  • Knowledge of Java and Groovy
  • Programming experience, ideally in Python, R, Spark, Kafka, and a willingness to learn new programming languages to meet goals and objectives
  • Knowledge of or ability to quickly learn Hadoop concepts including Hive, Impala, Parquet, Sentry, Sqoop, Flume, Oozie, Spark, Solr, Kudu
  • Creative and innovative approach to problem-solving
  • Excellent communication and interpersonal skills at all levels of the organization
  • A willingness to explore new alternatives or options to solve data mining issues, and utilize a combination of industry best practices, data innovations and your experience to get the job done
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and find opportunities for improvement

Additional Qualifications
  • Experience with healthcare data, particularly with providers and academic medical centers is a strong plus
  • Knowledge of or familiarity with natural language processing (NLP) and machine learning
  • Experience with agile development and tools
  • Basic knowledge of statistical techniques
  • Exposure/experience with Cloud data management and services (AWS, Google Cloud Platform)


