GenAI Performance and Quality Intern at Modular – Los Altos, California
Modular
Los Altos, California, 94022, United States
About This Position
About Modular
At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting-edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges.
If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.
What You Will Work On:
As an intern on the Performance and Quality (PAQ) team, you will analyze, optimize, and safeguard the performance of large language models and generative AI systems. This includes identifying performance bottlenecks in LLM inference pipelines and developing automation frameworks to streamline performance testing. A key focus will be designing and implementing performance analysis workflows, including regression detection pipelines, speed-of-light analyses, and benchmarking across several inference frameworks. You may also contribute to optimizing model serving infrastructure, investigating memory and compute efficiency, or establishing performance baselines and alerting systems.
LOCATION: Candidates based in the United States are welcome to apply. To support growth and collaboration, all interns will work in a hybrid capacity at our Los Altos, CA office (minimum 2 days per week on-site), with relocation assistance provided for out-of-state candidates.
What You Will Learn:
- Hands-on experience profiling and optimizing LLM inference workloads at scale.
- How to design performance regression detection systems and integrate them into CI/CD workflows.
- Techniques for building performance analysis tooling, automated benchmarking pipelines, and observability infrastructure for AI systems.
- Hands-on experience with GPU/accelerator performance analysis, model inference optimization, and systems-level bottlenecks.
- Mentorship from experienced engineers working at the intersection of ML and systems performance.
What you bring to the table:
- Currently pursuing a Bachelor's, Master's, or PhD degree in Computer Science, Computer Engineering, or a related field, with graduation expected by Spring 2027 at the latest.
- Proficiency in Python; experience with systems performance analysis, C++, or systems-level programming is a strong plus.
- Familiarity with profiling tools, benchmarking methodologies, CI/CD systems, or performance optimization techniques. Experience with tools such as NVIDIA Nsight Systems (nsys), Nsight Compute, PyTorch Profiler, Linux perf, or Intel VTune is a plus.
- Experience with, and interest in, designing and building automated performance analysis workflows.
- Strong problem-solving skills and a passion for building tools and robust workflows to improve system reliability and developer productivity.
What Modular brings to the table:
- Amazing Team. We are a progressive and agile team with some of the industry’s best engineering and product leaders.
- Competitive Compensation. We offer very strong compensation packages, including stock options. We want people to be focused on their best work and believe in tailoring compensation plans to meet the needs of our workforce.
- Team Building Events. We organize regular team onsites and local meetups in Los Altos, CA.