AI Research Engineer (Kernel & Inference Optimization) in UK at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Research Engineer (Kernel & Inference Optimization) in the United Kingdom.
This is an exciting opportunity for a highly technical AI engineer to contribute to the next generation of scalable and high-performance inference systems powering real-world AI applications. In this role, you will work on optimizing model serving architectures, improving latency and throughput, and enhancing deployment efficiency across cloud, edge, and resource-constrained environments. You will collaborate with globally distributed engineering and research teams focused on advanced AI systems, multi-modal architectures, and infrastructure innovation. The position offers a research-driven environment where experimentation, benchmarking, and performance optimization are central to daily work. Ideal candidates are passionate about low-level optimization, inference scalability, and building robust AI systems that deliver measurable production impact at scale.
Responsibilities:
- Design, develop, and optimize advanced model serving architectures focused on high throughput, low latency, and efficient memory utilization.
- Build scalable inference pipelines capable of running across cloud, edge, and resource-constrained environments.
- Conduct controlled inference experiments in simulated and production environments to evaluate system performance and reliability.
- Monitor and analyze key performance metrics such as latency, throughput, memory consumption, token response time, and error rates.
- Develop and maintain benchmarking methodologies and performance validation frameworks for AI inference systems.
- Identify bottlenecks in serving pipelines, including batch processing inefficiencies, network overhead, and excessive memory usage.
- Optimize inference frameworks and deployment strategies for scalability, resilience, and operational efficiency.
- Collaborate with cross-functional engineering and research teams to integrate optimized inference solutions into production environments.
- Create high-quality testing datasets and deployment scenarios that reflect real-world operational challenges.
- Continuously improve inference infrastructure through experimentation, iteration, and adoption of cutting-edge AI serving techniques.
Requirements:
- Strong experience in AI/ML engineering with a focus on inference optimization, model serving, or AI systems performance.
- Deep understanding of model deployment architectures and inference frameworks for large-scale AI applications.
- Expertise in optimizing latency, throughput, scalability, and memory footprint in production AI systems.
- Hands-on experience with performance monitoring, benchmarking, profiling, and bottleneck analysis.
- Strong knowledge of advanced AI model architectures, including multi-modal systems and resource-efficient models.
- Experience building and deploying AI systems across cloud, edge, or low-resource hardware environments.
- Proficiency in programming languages commonly used in AI infrastructure and optimization workflows.
- Strong analytical and problem-solving abilities with a research-oriented mindset.
- Ability to work independently in a highly distributed and fast-moving global environment.
- Excellent English communication skills and ability to collaborate across technical and non-technical teams.
- Passion for innovation, experimentation, and scalable AI infrastructure development.
Benefits:
- Fully remote global work environment with flexible location options.
- Opportunity to work on cutting-edge AI, blockchain, and fintech technologies.
- Collaborative international team of highly skilled engineers and researchers.
- Exposure to innovative projects involving AI infrastructure, digital finance, and decentralized technologies.
- High-impact role with significant technical ownership and influence on product direction.
- Fast-paced and innovation-driven culture focused on experimentation and growth.
- Opportunities for continuous learning and professional development.
- Work environment that values autonomy, creativity, and technical excellence.
- Participation in projects with global reach and real-world scalability challenges.