Senior Data Engineer

Data Engineer in San Francisco, CA

Posted 2019-02-07

Shippo lowers the barriers to shipping for businesses around the world. As free and fast shipping becomes the norm, better access to shipping is a competitive advantage for businesses. Through Shippo, e-commerce businesses, marketplaces, and platforms are able to connect to multiple shipping carriers around the world from one API and dashboard. Businesses can get shipping rates, print labels, automate international documents, track shipments, and facilitate returns. Internally, we think of Shippo as the building blocks of shipping.

Shippos are a diverse set of individuals. We look for culture and skills add in every new person. Join us to build the foundations of something great, roll up your sleeves, and get important work done every day.

Founded in 2013, we are a proud team based out of San Francisco. Shippo’s investors include Bessemer Venture Partners, Union Square Ventures, Uncork Capital, VersionOne Ventures, FundersClub and others.

As a Sr. Data Engineer, you will be responsible for designing and building systems to collect, process and store events at massive scale to gain operational and business insights into the performance and optimization of shipping services.

- Design, implement and maintain data extraction, processing and storage processes in large-scale data systems (data pipelines, data warehouses) for internal and customer-facing analytics and reporting features.
- Implement and maintain machine learning systems (feature generation, learning, evaluation, publishing), primarily using Spark, for our data scientists.
- Design and implement systems that integrate data from various data sources, internal and external, to ensure consistency, quality, integrity, and availability of data sets and insights.
- Collaborate with engineers, product managers, data scientists and data analysts to understand needs and requirements.
- Design, build and launch new data models and datasets in production.
- Define, manage and own SLAs for datasets across the different storage layers.
- Maintain and improve existing systems and processes in production, and prioritize work that needs to be completed to achieve that.

- 4+ years working experience as a data engineer.
- Ability to design and implement ETL processes using batch and streaming frameworks such as Hadoop, HDFS, MapReduce and Spark.
- Work experience with RDBMSs such as PostgreSQL or MySQL, as well as NoSQL and columnar data stores.
- Investigate, analyze, identify, and debug data-related issues to ensure stability, quality, and integrity of datasets.
- Familiar with columnar data warehouse technologies, in particular Redshift.
- Understand business processes, overall application components, and how data is gathered, and design a data model that ties the application telemetry data to metadata and transactional data.
- Build expertise and own data quality for various datasets.
- Fluent in scripting languages such as Python, Ruby or Perl.
- Collaborate with multiple teams in high-visibility roles and own the solution end-to-end.
- Self-starter who truly enjoys a fast-paced, innovative, teamwork-centric software start-up environment with a focus on delivering business value and groundbreaking technology.
- Excellent written and oral communication skills, and excellent presentation skills.
- BS or MS in Computer Science or related technical discipline or equivalent job experience.

- Designing, building, monitoring, managing and maintaining large data processing pipelines using frameworks and patterns such as MapReduce, Spark and Pig; and distributed columnar data warehouses including but not limited to Redshift and Druid.
- Batch and streaming data transport using traditional ETL, AWS Kinesis and Kafka.
- Workflow management tools such as Airflow; data serialization formats such as Avro and Parquet; and data modeling concepts, methodologies and best practices.
- Machine learning infrastructure such as TensorFlow or MXNet.
- Cloud environments and DevOps tools; working experience with AWS and its associated products.

- Benefits: medical, dental, vision (90% covered by the company, incl. dependents)
- Take-as-much-as-you-need vacation policy + flexible work hours, remote working possible
- Free lunch / drinks / snacks
- Fun team events outside of work hours - happy hours, “escape the room” adventures, hikes, and more!
