What is the role of a GenAI Performance and Quality Intern at Modular?

The GenAI Performance and Quality Intern at Modular works on the Performance and Quality (PAQ) team, analyzing, optimizing, and safeguarding the performance of large language models and generative AI systems.

Where is this GenAI Performance and Quality Intern job located?

Los Altos, California, 94022, United States

What type of employment is offered for this GenAI Performance and Quality Intern role?

Full-time or part-time position

What industry does this GenAI Performance and Quality Intern position belong to?

Modular builds AI infrastructure; this role focuses on the performance of large language models and generative AI systems.

What is the expected salary for this GenAI Performance and Quality Intern job?

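The estimated base hourly range for this role is $47.00 - $65.00 USD. The final rate depends on job-related factors such as education, training, work experience, business needs, and market demands.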

How can I apply for the GenAI Performance and Quality Intern position at Modular?

You can apply directly through the application link provided.

GenAI Performance and Quality Intern at Modular

About Modular
At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting-edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges.
If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.

What You Will Work On:
As an intern on the Performance and Quality (PAQ) team, you will analyze, optimize, and safeguard the performance of large language models and generative AI systems. This includes identifying performance bottlenecks in LLM inference pipelines and developing automation frameworks to streamline performance testing. A key focus will be designing and implementing performance analysis workflows, including regression detection pipelines, speed-of-light analyses, and benchmarking across several inference frameworks. You may also contribute to optimizing model serving infrastructure, investigating memory and compute efficiency, or establishing performance baselines and alerting systems.
LOCATION: Candidates based in the United States are welcome to apply. To support growth and collaboration, all interns will work in a hybrid capacity at our Los Altos, CA office (minimum 2 days per week on-site), with relocation assistance provided for out-of-state candidates.

What You Will Learn:

Hands-on experience profiling and optimizing LLM inference workloads at scale.
How to design performance regression detection systems and integrate them into CI/CD workflows.
Techniques for building performance analysis tooling, automated benchmarking pipelines, and observability infrastructure for AI systems.
Hands-on experience with GPU/accelerator performance analysis, model inference optimization, and systems-level bottlenecks.
Mentorship from experienced engineers working at the intersection of ML and systems performance.

What you bring to the table:

Currently pursuing a Bachelor's, Master's, or PhD degree in Computer Science, Computer Engineering, or a related field, with graduation expected by Spring 2027 at the latest.
Proficiency in Python; experience with systems performance analysis, C++, or systems-level programming is a strong plus.
Familiarity with profiling tools, benchmarking methodologies, CI/CD systems, or performance optimization techniques. Experience with tools such as NVIDIA Nsight Systems (nsys), Nsight Compute, PyTorch Profiler, Linux perf, or Intel VTune is a plus.
Experience and interest in designing and building automated performance analysis workflows.
Strong problem-solving skills and a passion for building tools and robust workflows to improve system reliability and developer productivity.

What Modular brings to the table:

Amazing Team. We are a progressive and agile team with some of the industry’s best engineering and product leaders.
Competitive Compensation. We offer very strong compensation packages, including stock options. We want people to be focused on their best work and believe in tailoring compensation plans to meet the needs of our workforce.
Team Building Events. We organize regular team onsites and local meetups in Los Altos, CA.

Working at Modular will enable you to grow quickly as you work alongside incredibly motivated and talented people who have high standards, a growth mindset, and a purpose to truly change the world.
The estimated base hourly range for this role is $47.00 - $65.00 USD. The hourly rate for the successful applicant will depend on a variety of permissible, non-discriminatory job-related factors, which include but are not limited to education, training, work experience, business needs, or market demands. This range may be modified in the future.
For candidates who fall outside of the listed requirements, we nevertheless encourage you to apply, as we may have openings at a lower or higher level than the ones advertised.
