Walmart Labs

Data Science Engineer

Data Scientist in Jersey City, NJ

Posted 2019-09-25

As part of the newly created Data Strategy and Enablement Team (DS&E), this role will be an enabler of our journey to be the world’s leading data-driven retailer. As part of this transformation, we are seeking an individual who will be responsible for establishing robust data pipelines and services, including both in-house data-enabling services and systems integrations across the DS&E team, to ensure our technical deliverables meet and exceed quality expectations.

We are looking for a highly motivated, resourceful, team-oriented individual to drive the data engineering process. You are an exceptionally talented data engineer with an outstanding track record of working with very large data sets and building robust ETL pipelines for data acquisition from internal systems and external data sources. You will be modernizing and improving the data acquisition infrastructure from the ground up. You will be working with structured and unstructured data sets, building large-scale data processing platforms, implementing world-class data governance and operational controls, and solving complex performance challenges.

The Data Engineer role will report up to the Lead Data Engineer/Senior Manager Data Engineering.

Minimum Qualifications
- Play a pivotal design and hands-on implementation role in improving the data infrastructure in a project-oriented work environment.
- Influence cross-functional architecture in sprint planning.
- Gather and process raw data at scale from internal and external data sources and expose mechanisms for large-scale parallel processing.
- Design, implement and manage a near real-time ingestion pipeline into a data warehouse and Hadoop data lake.
- Process unstructured data into a form suitable for analysis and then empower state-of-the-art analysis for analysts, scientists, and APIs
- Solve complex SQL and big data performance challenges.
- Mitigate risks in our data infrastructure by developing best-in-class tools and processes.
- Implement controls, policies, processes and best practices in the data engineering space.
- Evangelize an extremely high standard of code quality, system reliability, and performance.
- Help us improve our database deployment and change management process.
- Provide reliable and efficient data services as part of the global data team.
- Work closely with the team on development best practices and standards.
- Be a mentor.

Who you are:
- You have prior experience leading data engineering efforts across a variety of data systems
- You have a deep understanding of commercial data sources and of database concepts and terminology
- You have a demonstrated track record of handling multiple complex sourcing projects and delivering results in the data engineering area
- You have strong SQL experience and the ability to work on multiple aspects of a data project, including ETL, tool integrations, data results and APIs
- You are a team player, with the courage to drive change through disruption while maintaining a respect for the team

- Very strong engineering skills, with an analytical approach and good programming skills.
- Provide business insights, while leveraging internal tools and systems, databases and industry data
- Minimum of 5 years’ experience. Experience in the retail business is a plus.
- Excellent written and verbal communication skills for varied audiences on engineering subject matter
- Ability to document requirements, data lineage, and subject matter in both business and technical terminology.
- Guide and learn from other team members.
- Demonstrated ability to transform business requirements into code, specific analytical reports, and tools
- This role will involve coding, analytical modeling, root cause analysis, investigation, debugging, testing and collaboration with business partners, product managers, and other engineering teams
- Experience working with large data sets and with distributed computing (MapReduce, Hadoop, Hive, Pig, Apache Spark, etc.) on platforms such as HDP, Cloudera, etc.
- Strong Hadoop scripting skills to process petabytes of data
- Experience in Unix/Linux shell scripting or similar programming/scripting knowledge
- Real-time data ingestion (Kafka)
- Experience in ETL processes with exposure to one or more tools such as NiFi, Talend, Informatica, SSIS, etc.
