Principal AI Research Scientist Post-Training Alignment in Canada Creek, Nova Scotia at Jobgether

Job Function: Science
Jobgether
Canada Creek, Nova Scotia, B0P 1V0, Canada

Job Description

Principal AI Research Scientist Post-Training Alignment

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Principal AI Research Scientist Post-Training Alignment in Canada.

This role sits at the forefront of foundation model research, focusing on post-training, alignment, and reinforcement learning for advanced AI systems. You will work on shaping how large-scale models behave, reason, and interact with real-world constraints, with a strong emphasis on reliability, controllability, and safety. The environment blends cutting-edge academic research with direct product impact, allowing your contributions to move quickly from experimentation to deployment. You will design and evaluate long-horizon reasoning systems, agentic behaviors, and alignment methodologies grounded in both human feedback and structured, domain-based signals. Working alongside world-class researchers and engineers, you will help define evaluation standards and readiness criteria for next-generation AI systems. This is a highly influential role for someone passionate about advancing frontier AI while ensuring robust and responsible model behavior at scale.

Accountabilities:
  • Lead research and development in post-training methods for foundation models, including reinforcement learning, preference optimization, and alignment techniques such as RLHF, RLAIF, DPO, and PPO.
  • Design and develop novel algorithms that improve model reliability, controllability, reasoning ability, and alignment with human and system objectives.
  • Define and execute experimental frameworks to evaluate model behavior, robustness, safety, and long-horizon reasoning performance.
  • Architect evaluation systems for agentic workflows, tool use, and real-world task completion, leveraging both human and automated signals.
  • Make principled decisions on when improvements should be addressed through pre-training, post-training, or system-level design changes.
  • Lead model analysis and interpretability efforts to understand failure modes, trade-offs, and emergent behaviors in large-scale systems.
  • Collaborate with infrastructure teams to build scalable, reproducible post-training pipelines and support large-scale experimentation.
  • Establish model readiness criteria and provide clear go/no-go recommendations for production deployment and releases.
  • Contribute to scientific publications, patents, and external research visibility at leading ML and AI conferences.
  • Communicate technical risks, limitations, and strategic trade-offs to both technical peers and senior stakeholders.
Requirements:
  • Deep expertise in reinforcement learning for foundation models and strong command of post-training methodologies such as RLHF, RLAIF, DPO, PPO, or related approaches.
  • PhD or equivalent industry research experience in machine learning, reinforcement learning, AI, or closely related fields.
  • Proven track record in leading or mentoring research teams in academia, industry labs, or advanced AI organizations.
  • Strong publication history in top-tier ML/AI venues such as NeurIPS, ICML, ICLR, CVPR, or SIGGRAPH.
  • Experience in alignment research, preference learning, agentic AI systems, or large-scale model behavior optimization.
  • Strong intuition for model behavior, failure modes, and trade-offs in post-training and alignment settings.
  • Experience designing evaluation systems and defining model readiness criteria for deployment.
  • Familiarity with large-scale training infrastructure and compute/resource trade-offs in ML systems.
  • Ability to communicate complex technical concepts clearly to both technical and non-technical audiences.
  • Experience working with or deploying production AI systems in applied or research-to-production environments.
  • Prior experience in frontier AI labs or equivalent high-impact research organizations is highly valued.
Benefits:
  • Competitive compensation package including base salary, performance bonuses, and potential stock grants
  • Flexible work arrangements, including remote options across Canada and hybrid setups in major hubs
  • Comprehensive health, dental, and wellness coverage
  • Opportunities to publish and present research at top-tier global AI conferences
  • High-impact research environment with direct pathways to production and real-world deployment
  • Access to large-scale compute infrastructure and advanced AI research tooling
  • Strong culture of innovation, collaboration, and scientific rigor
  • Inclusive and diverse workplace committed to belonging and equal opportunity
How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
