02 Mar 2021
Apache Spark
Beaverton, OR, USA

Currently, we are looking for talented candidates for one of our listed clients. If interested, please reply to me with your updated resume, or feel free to reach out to me for more details.

Title: Apache Spark

Location: Beaverton, OR

Duration: Full Time

Job Description: Key skills required for a Big Data Engineer are listed below; the last two items are not mandatory.
  • AWS, EMR, Spark, Python, SQL
  • Hands-on experience in these skills is expected.
  • Good to have: experience with Airflow, Oozie, or any other workflow scheduling tool
  • Some specific roles expect streaming experience, but not all roles.

Role responsibilities:
  • Design and implement data products and features in collaboration with product owners, data analysts, and business partners using Agile / Scrum methodology
  • Contribute to overall architecture, frameworks, and patterns for processing and storing large data volumes
  • Translate product backlog items into engineering designs and logical units of work
  • Profile and analyze data for the purpose of designing scalable solutions
  • Define and apply appropriate data acquisition and consumption strategies for given technical scenarios
  • Design and implement distributed data processing pipelines using tools and languages prevalent in the big data ecosystem
  • Build utilities, user-defined functions, libraries, and frameworks to better enable data flow patterns
  • Implement complex automated routines using workflow orchestration tools
  • Work with architecture, engineering leads, and other teams to ensure quality solutions are implemented, and engineering best practices are defined and adhered to
  • Anticipate, identify, and solve issues concerning data management to improve data quality
  • Build and incorporate automated unit tests and participate in integration testing efforts
  • Utilize and advance continuous integration and deployment frameworks
  • Troubleshoot data issues and perform root cause analysis
  • Work across teams to resolve operational & performance issues

The following qualifications and technical skills will position you well for this role:
  • MS/BS in Computer Science, or related technical discipline
  • 4+ years of experience in large-scale software development, 2+ years of big data experience
  • Strong programming experience, Python preferred
  • Extensive experience working with Hadoop and related processing frameworks such as Spark, Hive, etc.
  • Experience with messaging/streaming/complex event processing tooling and frameworks with an emphasis on Spark Streaming or Structured Streaming and Apache Nifi
  • Good understanding of file formats including JSON, Parquet, Avro, and others
  • Familiarity with data warehousing, dimensional modeling, and ETL development
  • Experience with RDBMS systems, SQL and SQL Analytical functions
  • Experience with workflow orchestration tools like Apache Airflow
  • Experience with performance and scalability tuning

The following skills and experience are also relevant to our overall environment, and nice to have:
  • Experience with Scala or Java
  • Experience working in a public cloud environment, particularly AWS, and with services like EMR, S3, Lambda, ElastiCache, DynamoDB, SNS, SQS, etc.
  • Familiarity with cloud warehouse tools like Snowflake
  • Experience building RESTful APIs to enable data consumption
  • Familiarity with infrastructure-as-code tools such as Terraform or CloudFormation, and with CI automation tools such as Jenkins or CircleCI
  • Familiarity with practices like Continuous Deployment, Continuous Integration, and Automated Testing
  • Experience in Agile/Scrum application development
