Senior Machine Learning Systems Engineer, Ads ML Experience Platform in United States at Jobgether

Job Function: Engineering
Jobgether
United States, United States

Job Description

Senior Machine Learning Systems Engineer, Ads ML Experience Platform

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior Machine Learning Systems Engineer, Ads ML Experience Platform, based in the United States.

This role sits at the core of a large-scale machine learning ecosystem powering Ads ML development and experimentation. You will design and build next-generation infrastructure that accelerates the full ML lifecycle, from offline experimentation to production training, evaluation, and deployment. The environment is highly technical, fast-paced, and deeply collaborative; you will work closely with ML engineers, researchers, and platform teams. You will contribute to systems that enable reproducible research, scalable model iteration, and automated ML workflows. A key focus is advancing developer experience through robust tooling and intelligent automation. The role also explores emerging agentic AI systems that support autonomous and human-in-the-loop workflows. Your work will directly impact the speed, reliability, and scalability of ML innovation across a global platform such as Reddit.

Accountabilities:

In this role, you will lead the design and development of scalable ML infrastructure that powers experimentation, training, and deployment workflows across Ads ML systems.

  • Build and evolve large-scale offline ML experimentation platforms enabling reproducibility, evaluation, and model promotion workflows.
  • Develop distributed training orchestration systems supporting hyperparameter tuning, retraining, and evaluation pipelines.
  • Design infrastructure for experiment tracking, metadata management, lineage, artifact versioning, and model registries.
  • Create automated workflows for model promotion, rollback, compliance validation, and continuous monitoring.
  • Collaborate with ML engineers and researchers to improve experimentation velocity and platform efficiency.
  • Contribute to the design of agentic AI systems enabling multi-agent orchestration and intelligent workflow execution.
  • Ensure systems are reliable, scalable, and optimized for high-performance ML development at production scale.

Requirements:

This role requires strong expertise in large-scale distributed systems and hands-on experience building production-grade ML platforms and infrastructure.

  • 5+ years in platform engineering, distributed systems, or large-scale infrastructure development.
  • 2+ years building production ML infrastructure, developer platforms, or AI tooling.
  • Strong experience with ML workflow orchestration and distributed data processing frameworks (e.g., Spark, Ray, Flink).
  • Hands-on experience with orchestration tools such as Airflow, Kubeflow, Argo, or equivalent systems.
  • Proven ability to build and maintain ML experimentation platforms, model registries, or training pipelines.
  • Strong programming skills in Python and familiarity with scalable software engineering practices.
  • Experience with cloud-based ML systems and production deployment environments.
  • Exposure to agentic AI systems, multi-agent workflows, or autonomous orchestration frameworks is a strong plus.
  • Excellent communication skills with the ability to translate technical complexity into clear insights for diverse stakeholders.

Benefits:

  • Competitive base salary with additional equity (RSUs) and potential bonus eligibility
  • Comprehensive medical, dental, and vision insurance coverage
  • 401(k) retirement plan with employer matching
  • Generous paid time off, including vacation, holidays, and parental leave
  • Equity participation in a high-growth, impact-driven engineering environment
  • Flexible work arrangements with remote eligibility across supported regions
  • Professional development opportunities in advanced ML systems and AI infrastructure
  • Inclusive, collaborative engineering culture focused on innovation and impact

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

Job Location

United States, United States
