Data Engineer

Remote

Full-time


JavaScript MySQL Git PostgreSQL SQL Kafka Redshift Docker NoSQL Spark Airflow Hadoop Python Java C


We are a team on a mission: to put accessible and affordable healthcare in the hands of every person on earth. That mission is bold and ambitious, and it is shared by a team that lives our values: dream big, build fast and be brilliant.

To achieve this, we’ve brought together one of the largest teams of scientists, clinicians, mathematicians and engineers to combine the ever-growing computing power of machines with the best medical expertise of humans, creating a comprehensive, immediate and personalized health service and making it universally available.

At Babylon, our people aren’t just part of a team; they’re part of something bigger. We’re a vibrant community of creative thinkers and doers, forging the way for a new generation of healthcare. We’re only as good as our people, so finding the best people is everything to us.

We serve millions, but we choose our people one at a time…

We are looking for exceptional Data Engineers to join our growing team of analytics experts. You will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.

The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure that an optimal data delivery architecture is applied consistently across ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products. The right candidate will be excited by the prospect of optimizing or even redesigning our company’s data architecture to support our next generation of products and data initiatives.

You will be part of Babylon's US Data Insight Group, working in Austin, Texas. Our aim is to continue building a dynamic data platform that offers the most comprehensive view of consumer and population health ever available. The platform provides insights to internal and external customers that will enable consumers to live smarter, and health care providers and payers to move from reactive sick-care to proactive optimization of individual and population health.

Responsibilities

    • Implement ETL/ELT processes and production-grade data pipelines with an emphasis on flexibility, reliability, accuracy, and speed
    • Collaborate with various technical and non-technical stakeholders such as Executive Leadership, Business Intelligence, Product, and Privacy to deliver high-quality data infrastructure at scale
    • Continuously explore and implement new ideas and technologies related to data processing on the cloud
    • Be part of a high-impact team developing a greenfield Enterprise Data Warehouse
    • Optimize, automate, and improve analytical approaches to perform at scale
    • Develop and maintain comprehensive controls to ensure data quality, master data management, data governance, and data stewardship
    • Identify and onboard new data sources by defining business requirements, and troubleshoot and resolve any issues with data feeds

Minimum Qualifications

    • Bachelor’s degree in Computer Science, Engineering, Mathematics, Statistics, or a related discipline, or equivalent experience
    • 2+ years of professional experience with SQL (MySQL, PostgreSQL, BigQuery, Hive, or T-SQL)
    • Experience with at least one of the following: ETL/ELT data pipelines, data modeling, data warehousing, or reporting
    • Strong programming expertise with at least one of the following languages: Python, Java, JavaScript, or C/C++
    • Strong communication skills with a focus on both intra- and inter-team collaboration

Preferred Qualifications

    • Master’s degree in Computer Science, Engineering, Mathematics, Statistics, or a related discipline
    • Familiarity with cloud technologies, especially GCP
    • Hands-on experience with Big Data technologies such as Hadoop, Spark, Kafka, Redshift, Teradata, and others
    • Experience building and optimizing data pipelines using orchestration tools such as Azkaban, Luigi, Airflow, etc.
    • Experience with version control (e.g. TFS, SVN or Git) and CI/CD
    • Experience in healthcare

Babylon believes it is possible to put an accessible and affordable health service in the hands of every person on earth. How? By combining the ever-growing computing power of machines with the best medical expertise of humans to create a comprehensive, immediate and personalized health service and make it universally available.