Apache Spark
Beaverton, OR, USA


Currently, we are looking for talented candidates for one of our listed clients. If interested, please reply with your updated resume, or feel free to reach out to me for more details.

Title: Apache Spark

Location: Beaverton, OR

Duration: Full Time

Job Description: Key skills required for a Big Data Engineer are listed below (the last two items are not mandatory):
  • AWS, EMR, Spark, Python, SQL (a minimal sketch follows this list)
  • Hands-on experience in these skills is expected.
  • Experience with Airflow, Oozie, or any workflow scheduling tool is good to have.
  • Streaming experience is expected for some specific roles, but not all.
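To make the core skill set concrete, here is a minimal sketch of the kind of PySpark batch job typically submitted to an EMR cluster, combining Spark, Python, and SQL. The S3 bucket, paths, and column names are hypothetical assumptions, not details from this posting.

  # Minimal PySpark batch job, as might be submitted to EMR with spark-submit.
  # The S3 bucket, paths, and column names below are hypothetical.
  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("daily-orders-rollup").getOrCreate()

  # Read raw JSON events from S3 (EMR clusters can read s3:// paths natively).
  orders = spark.read.json("s3://example-bucket/raw/orders/")

  # Register the DataFrame as a view and aggregate with plain SQL.
  orders.createOrReplaceTempView("orders")
  daily = spark.sql("""
      SELECT order_date, COUNT(*) AS order_count, SUM(amount) AS revenue
      FROM orders
      GROUP BY order_date
  """)

  # Write the rollup back to S3 as Parquet, partitioned by date.
  daily.write.mode("overwrite").partitionBy("order_date").parquet(
      "s3://example-bucket/curated/daily_orders/")

  spark.stop()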

Role responsibilities:
  • Design and implement data products and features in collaboration with product owners, data analysts, and business partners using Agile / Scrum methodology
  • Contribute to overall architecture, frameworks, and patterns for processing and storing large data volumes
  • Translate product backlog items into engineering designs and logical units of work
  • Profile and analyze data for the purpose of designing scalable solutions
  • Define and apply appropriate data acquisition and consumption strategies for given technical scenarios
  • Design and implement distributed data processing pipelines using tools and languages prevalent in the big data ecosystem
  • Build utilities, user-defined functions, libraries, and frameworks to better enable data flow patterns
  • Implement complex automated routines using workflow orchestration tools (see the DAG sketch after this list)
  • Work with architecture, engineering leads, and other teams to ensure quality solutions are implemented, and engineering best practices are defined and adhered to
  • Anticipate, identify, and solve issues concerning data management to improve data quality
  • Build and incorporate automated unit tests and participate in integration testing efforts
  • Utilize and advance continuous integration and deployment frameworks
  • Troubleshoot data issues and perform root cause analysis
  • Work across teams to resolve operational & performance issues
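As a sketch of the workflow orchestration responsibility above, a simple Apache Airflow DAG that runs an ingest step and then a transform step might look like the following. The DAG id, schedule, and task bodies are hypothetical; the schedule argument assumes Airflow 2.4+ (older versions use schedule_interval).

  # Hypothetical Airflow DAG: ingest raw files, then run a Spark transform.
  # DAG id, schedule, and task bodies are illustrative only.
  from datetime import datetime

  from airflow import DAG
  from airflow.operators.python import PythonOperator

  def ingest():
      print("pull raw files into the landing zone")

  def transform():
      print("spark-submit the transform job")

  with DAG(
      dag_id="daily_orders_pipeline",
      start_date=datetime(2024, 1, 1),
      schedule="@daily",   # Airflow 2.4+; older versions use schedule_interval
      catchup=False,
  ):
      ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
      transform_task = PythonOperator(task_id="transform", python_callable=transform)

      ingest_task >> transform_task  # transform runs only after ingest succeeds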

The following qualifications and technical skills will position you well for this role:
  • MS/BS in Computer Science, or related technical discipline
  • 4+ years of experience in large-scale software development, 2+ years of big data experience
  • Strong programming experience, Python preferred
  • Extensive experience working with Hadoop and related processing frameworks such as Spark, Hive, etc.
  • Experience with messaging, streaming, and complex event processing tools and frameworks, with an emphasis on Spark Streaming or Structured Streaming and Apache NiFi (see the first sketch after this list)
  • Good understanding of file formats including JSON, Parquet, Avro, and others
  • Familiarity with data warehousing, dimensional modeling, and ETL development
  • Experience with RDBMS platforms, SQL, and SQL analytical functions (see the second sketch after this list)
  • Experience with workflow orchestration tools like Apache Airflow
  • Experience with performance and scalability tuning
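For the streaming item above, here is one minimal Structured Streaming sketch that continuously converts incoming JSON events to Parquet. The S3 paths and event schema are hypothetical assumptions.

  # Hypothetical Structured Streaming job: tail a directory of JSON events
  # and continuously append them to a Parquet sink. Paths/schema are illustrative.
  from pyspark.sql import SparkSession
  from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

  spark = SparkSession.builder.appName("events-stream").getOrCreate()

  schema = (StructType()
            .add("event_id", StringType())
            .add("amount", DoubleType())
            .add("event_time", TimestampType()))

  events = (spark.readStream
            .schema(schema)  # streaming reads require an explicit schema
            .json("s3://example-bucket/incoming/events/"))

  query = (events.writeStream
           .format("parquet")
           .option("path", "s3://example-bucket/curated/events/")
           .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
           .outputMode("append")
           .start())

  query.awaitTermination()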
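And for SQL analytical functions, a small illustration using Spark SQL window functions over a hypothetical orders table:

  # Hypothetical example of SQL analytical (window) functions via Spark SQL.
  # Table and column names are illustrative.
  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("window-demo").getOrCreate()

  spark.createDataFrame(
      [("a", "2024-01-01", 10.0), ("a", "2024-01-02", 12.0),
       ("b", "2024-01-01", 7.0)],
      ["customer_id", "order_date", "amount"],
  ).createOrReplaceTempView("orders")

  # Rank each customer's orders by amount and compute a running total by date.
  spark.sql("""
      SELECT customer_id,
             order_date,
             amount,
             ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk,
             SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total
      FROM orders
  """).show()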

The following skills and experience are also relevant to our overall environment, and nice to have:
  • Experience with Scala or Java
  • Experience working in a public cloud environment, particularly AWS, and with services like EMR, S3, Lambda, ElastiCache, DynamoDB, SNS, SQS, etc.
  • Familiarity with cloud warehouse tools like Snowflake
  • Experience building RESTful APIs to enable data consumption
  • Familiarity with infrastructure-as-code tools such as Terraform or CloudFormation and automation tools such as Jenkins or CircleCI
  • Familiarity with practices like Continuous Integration, Continuous Delivery, and Automated Testing
  • Experience in Agile/Scrum application development
