Machine Learning Platform Engineer

We are seeking two senior ML Platform Engineers to join a high-impact team building scalable, cloud-native services that operationalize machine learning workflows. This role is ideal for someone who blends backend/service engineering with a strong AWS infrastructure background, has hands-on experience with Databricks, and thrives in a complex, less structured, fast-paced environment. You’ll be working alongside architects and DevOps, supporting the integration of ML pipelines and services into a robust platform ecosystem.

You will be part of a 6–7 person cross-functional team, including 4 platform engineers, 1 DevOps engineer, and data scientists. You’ll receive high-level task goals (e.g., “build a configuration service”) and will be expected to research, architect, and implement solutions with minimal hand-holding. A strong problem-solving mindset, curiosity, and self-drive are crucial. You’ll be encouraged to challenge assumptions and explore better approaches. The project builds upon an existing POC, but integrating it into a complex platform ecosystem is the key challenge ahead.


Responsibilities:

  • Build AWS-connected services to orchestrate, scale, and support ML workloads, using TypeScript and Python.
  • Design and implement CI/CD pipelines using GitHub Actions, enforcing high test coverage and deployment automation.
  • Support Databricks-based ML workflows, including provisioning, configuration, and resource optimization.
  • Integrate services into the broader customer platform, ensuring compatibility with authentication, data flow, and API architecture.
  • Participate in service design discussions, ensuring that implementations support high concurrency, scalability, and fault tolerance.
  • Collaborate with DevOps to ensure infrastructure is provisioned correctly (e.g., CloudFormation, Terraform for Databricks).
  • Validate and iterate on architectural plans with internal architects while being capable of autonomous research and decision-making on implementation-level concerns.
  • Engage with internal teams (data scientists, ML engineers, DevOps) to deliver robust, production-ready infrastructure and services.


Requirements:

Must-have skills:

  • 8+ years of experience in software/platform engineering with a focus on AWS.
  • Strong proficiency in TypeScript (Node.js services) and Python (for data/ML workflow scripting).
  • Experience working with Databricks (jobs, clusters, configurations).
  • Knowledge of Terraform (especially for Databricks provisioning) and CloudFormation (for AWS infra setup).
  • Solid understanding of MLOps fundamentals, from model orchestration to serving and monitoring.
  • Familiarity with AWS services including, but not limited to, Lambda, Step Functions, S3, IAM, and VPC.
  • Practical experience with CI/CD pipelines and GitHub Actions, including enforcing 100% unit test coverage.
  • Experience designing services that handle high-concurrency traffic and scalable workloads.



Nice-to-have skills:

  • Understanding of Jupyter-based model development and what it takes to productionize such workflows.
  • Experience with service orchestration involving large-scale event processing or configuration management.
  • Previous work integrating services into multi-tenant SaaS platforms.


Submit Resume
Send us an application, and we’ll contact you shortly.