Principal AI Research Scientist Post-Training Alignment in Canada Creek, Nova Scotia at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Principal AI Research Scientist Post-Training Alignment in Canada.
This role sits at the forefront of foundation model research, focusing on post-training, alignment, and reinforcement learning for advanced AI systems. You will work on shaping how large-scale models behave, reason, and interact with real-world constraints, with a strong emphasis on reliability, controllability, and safety. The environment blends cutting-edge academic research with direct product impact, allowing your contributions to move quickly from experimentation to deployment. You will design and evaluate long-horizon reasoning systems, agentic behaviors, and alignment methodologies grounded in both human feedback and structured, domain-based signals. Working alongside world-class researchers and engineers, you will help define evaluation standards and readiness criteria for next-generation AI systems. This is a highly influential role for someone passionate about advancing frontier AI while ensuring robust and responsible model behavior at scale.
Responsibilities
- Lead research and development in post-training methods for foundation models, including reinforcement learning, preference optimization, and alignment techniques such as RLHF, RLAIF, DPO, and PPO.
- Design and develop novel algorithms that improve model reliability, controllability, reasoning ability, and alignment with human and system objectives.
- Define and execute experimental frameworks to evaluate model behavior, robustness, safety, and long-horizon reasoning performance.
- Architect evaluation systems for agentic workflows, tool use, and real-world task completion, leveraging both human and automated signals.
- Make principled decisions on whether a given improvement is best pursued through pre-training, post-training, or system-level design changes.
- Lead model analysis and interpretability efforts to understand failure modes, trade-offs, and emergent behaviors in large-scale systems.
- Collaborate with infrastructure teams to build scalable, reproducible post-training pipelines and support large-scale experimentation.
- Establish model readiness criteria and provide clear go/no-go recommendations for production deployment and releases.
- Contribute to scientific publications, patents, and external research visibility at leading ML and AI conferences.
- Communicate technical risks, limitations, and strategic trade-offs to both technical peers and senior stakeholders.
Requirements
- Deep expertise in reinforcement learning for foundation models and strong command of post-training methodologies such as RLHF, RLAIF, DPO, PPO, or related approaches.
- PhD or equivalent industry research experience in machine learning, reinforcement learning, AI, or closely related fields.
- Proven track record in leading or mentoring research teams in academia, industry labs, or advanced AI organizations.
- Strong publication history in top-tier ML/AI venues such as NeurIPS, ICML, ICLR, CVPR, or SIGGRAPH.
- Experience in alignment research, preference learning, agentic AI systems, or large-scale model behavior optimization.
- Strong intuition for model behavior, failure modes, and trade-offs in post-training and alignment settings.
- Experience designing evaluation systems and defining model readiness criteria for deployment.
- Familiarity with large-scale training infrastructure and compute/resource trade-offs in ML systems.
- Ability to communicate complex technical concepts clearly to both technical and non-technical audiences.
- Experience working with or deploying production AI systems in applied or research-to-production environments.
- Prior experience in frontier AI labs or equivalent high-impact research organizations is highly valued.
Benefits
- Competitive compensation package including base salary, performance bonuses, and potential stock grants
- Flexible work arrangements, including remote options across Canada and hybrid setups in major hubs
- Comprehensive health, dental, and wellness coverage
- Opportunities to publish and present research at top-tier global AI conferences
- High-impact research environment with direct pathways to production and real-world deployment
- Access to large-scale compute infrastructure and advanced AI research tooling
- Strong culture of innovation, collaboration, and scientific rigor
- Inclusive and diverse workplace committed to belonging and equal opportunity