Data Engineer

  • Kitchener
  • ApplyBoard

The Role:

The data engineering team is an experienced team responsible for supporting our product development and the entire organization. In addition to building ETL pipelines to automate analytics and integrations between systems, the team builds and maintains the infrastructure used to host these pipelines and integrations. The team also builds and maintains data access components and provides the tooling and analytics required by our predictive/ML models.

What you will be doing:

  • Build and maintain analytics with Python (pandas/PySpark)
  • Build and maintain ETL pipelines on AWS (EC2/Glue ETLs/Airflow)
  • Build and maintain infrastructure components to support our pipelines and integrations (CDK)
  • Set up and maintain integrations between different systems to enable data flow between them (AppFlow)
  • Actively contribute to shaping the direction of our data platform, including architecting our data warehouse, machine learning deployment infrastructure, and ETL/ELT workflows
  • Gather and understand data requirements by working with stakeholders across multiple teams
  • Work closely with Engineering, IT, and Security to build processes and standards for our data science platform and how it integrates with data sources across the company
  • Develop ingestion, transformation, and cleansing pipelines to prepare a variety of structured and unstructured data sources for data analytics
  • Maintain our data platform, including managing and improving our Redshift cluster and monitoring our data pipelines
  • Develop infrastructure using CDK to deploy data products to internal and external users
  • Provide operational support to the data science team
  • Be a go-to person for data-related questions company-wide
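To give a flavour of the ingestion/transformation/cleansing work described above, here is a minimal extract-transform-load sketch. It uses the standard-library sqlite3 module as a stand-in for the real warehouse (Redshift/PostgreSQL), and all table and column names are hypothetical:

```python
import sqlite3

def run_etl(conn: sqlite3.Connection) -> int:
    """Minimal ETL step: extract raw rows, clean them, load into a reporting table."""
    cur = conn.cursor()
    # Extract: pull raw application records (hypothetical schema).
    cur.execute("SELECT student, score FROM raw_applications")
    rows = cur.fetchall()
    # Transform: normalize names and drop rows with missing scores.
    cleaned = [(name.strip().title(), score) for name, score in rows if score is not None]
    # Load: write the cleaned rows into the analytics table in one transaction.
    cur.executemany(
        "INSERT INTO applications_clean (student, score) VALUES (?, ?)", cleaned
    )
    conn.commit()
    return len(cleaned)

# Demo with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_applications (student TEXT, score REAL)")
conn.execute("CREATE TABLE applications_clean (student TEXT, score REAL)")
conn.executemany(
    "INSERT INTO raw_applications VALUES (?, ?)",
    [("  ada lovelace ", 91.0), ("alan turing", None)],
)
loaded = run_etl(conn)
```

In a production pipeline the same extract/transform/load shape would typically run inside a pandas or PySpark job orchestrated by Airflow, rather than as a single script.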

What you bring to this role:

  • Bachelor’s degree in Engineering, Computer Science, Mathematics, or a related technical discipline
  • 4+ years of experience in the data engineering field
  • Experience in setting up and maintaining a high volume of ETL pipelines
  • Experience in setting up ETL orchestration
  • Familiarity with infrastructure as code (CDK or Terraform) is a plus
  • Advanced knowledge of SQL and knowledge of NoSQL (MongoDB)
  • Ability to communicate effectively with both technical and non-technical audiences
  • Strong analytical skills and an understanding of data science
  • Driven, passionate, and creative; thrives in a fast-paced environment
  • Knowledge of data modeling and system design using UML
  • Experience with AWS computing (e.g., EC2, Lambda) and data storage technologies (e.g., Redshift)

Tech Stack:

  • PostgreSQL
  • Python
  • Pandas
  • PySpark (nice to have)
  • CDK or Terraform (nice to have)
  • AWS