Big Data Engineer (Databricks)
Date: Aug 1, 2024
Location: Atlanta, GA, US, 30308
Company: Norfolk Southern Corp.
Norfolk Southern offers a unique opportunity to be part of our proud legacy that spans nearly 200 years. We are a customer-centric, operations-driven team dedicated to advancing safety, serving communities, and driving innovation for tomorrow's rail. As part of Norfolk Southern, you’ll join a collaborative team where there are opportunities for growth across the organization. We are building a culture where everyone can thrive by owning and driving exceptional results, being humble and leading with trust, serving our customers with excellence, and collaborating and coaching to win.
Job Description
Norfolk Southern Corporation is currently seeking a Big Data Engineer with an affinity for working with others to create successful solutions. Join a smart, highly skilled team with a passion for technology, where you will work on our state-of-the-art Big Data Platforms. This role calls for a Data Engineer with a balance of theoretical business intelligence and data analytics knowledge, hands-on experience, and exposure to real-world BI problem-solving. In this role, you will participate in all phases of the Data Engineering life cycle and will independently and collaboratively write project requirements, architect solutions, and perform data ingestion development and support duties.
You will work with people from many areas of the company to understand their data and BI needs and perform detailed requirements gathering along with analysis in partnership with our team of Data Modelers. You will also engage our BI development team to agree on effort estimates and then oversee the progress of the project through its completion.
Responsibilities
Defines data requirements, gathers and wrangles large volumes of structured and unstructured data, and validates data by running various data tools in the Data Environment.
Supports standardization, customization, and ad-hoc data analysis, and develops the mechanisms to ingest, analyze, validate, normalize, and clean data.
Creates data policy and develops interfaces and retention models, which requires synthesizing or anonymizing data.
Implements statistical data quality procedures on new data sources and, by applying rigorous iterative data analytics, supports Data Scientists and the creation of analytics and insights through data sourcing and preparation, in order to visualize data and synthesize insights of commercial value.
Develops and maintains data engineering best practices and contributes insights on data analytics and visualization concepts, methods, and techniques.
Works closely with the data science and business intelligence teams to develop data models and pipelines for research, reporting, and machine learning.
Builds data pipelines that clean, transform, and aggregate data from disparate sources.
Employs a variety of languages and tools (e.g. scripting languages) to marry systems together.
Applies knowledge of Data Architecture components and leads project teams from requirements to implementation.
Skills Required
3-5 years in a Data Engineering role manipulating, processing, and extracting value from large datasets.
3+ years of experience with Big Data tools like Hadoop, Spark, Spark SQL, Kafka, Sqoop, Hive, S3, and HDFS.
3+ years building, testing, and optimizing ‘Big Data’ data ingestion pipelines, architectures, and data sets with TIBCO, IBM, or others.
Databricks UI, Managing Databricks Notebooks, Delta Lake with Python, Delta Lake with Spark SQL, Delta Live Tables, Unity Catalog.
High-velocity, high-volume stream processing with Apache Kafka and Spark Streaming.
Strong SQL skills with the ability to write queries of intermediate complexity.
ETL experience with PySpark, Spark SQL or similar.
Agile Scrum, Kanban or SAFe experience.
Skills Desired
Python (and/or Scala) and PySpark/Scala-Spark.
Database solutions like Snowflake, Kudu/Impala, Delta Lake or BigQuery.
NoSQL databases, including HBase and/or Cassandra.
Azure and AWS serverless technologies such as S3, Kinesis/MSK, Lambda, and Glue.
Messaging platforms like Kafka, Amazon MSK, TIBCO EMS, or IBM MQ Series.
Education Desired
Bachelor’s Degree, preferably in Information Systems, Computer Science, Computer Information Systems, or a related field.
Work Conditions
Environment: Office Remote, ideal candidate must be based within 60 miles of Atlanta.
Shift Work: No
On-Call: Yes
Weekend Work: No
Company Overview
Since 1827, Norfolk Southern Corporation (NYSE: NSC) and its predecessor companies have safely moved the goods and materials that drive the U.S. economy. Today, it operates a customer-centric and operations-driven freight transportation network. Committed to furthering sustainability, Norfolk Southern helps its customers avoid 15 million tons of yearly carbon emissions by shipping via rail. Its dedicated team members deliver more than 7 million carloads annually, from agriculture to consumer goods, and the company is the largest rail shipper of auto products and metals in North America. Norfolk Southern also has the most extensive intermodal network in the eastern U.S., serving a majority of the country’s population and manufacturing base, with connections to every major container port on the Atlantic coast as well as the Gulf of Mexico and Great Lakes. Learn more by visiting www.NorfolkSouthern.com.
At Norfolk Southern, we believe in celebrating our individuality. By leveraging the unique backgrounds and viewpoints of our employees, we can create a culture of innovation, respect, and inclusion. We know that employees thrive in a workplace where differing viewpoints, ideas, and experiences are freely shared and valued. As such, we encourage all employees to contribute their distinctive skills and capabilities to our organization.
Equal employment opportunities are available to all applicants regardless of race, color, religion, age, sex, national origin, disability status, genetic information, veteran status, sexual orientation, and gender identity. Together, we power progress.