What is the role of a Machine Learning Systems Engineer at Voltai?

The Machine Learning Systems Engineer position at Voltai is a full-time role.

Where is this Machine Learning Systems Engineer job located?

Menlo Park, California, United States

What industry does this Machine Learning Systems Engineer position belong to?

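Voltai builds AI for semiconductor and electronics design, serving automotive, industrial automation, consumer electronics, IoT, and semiconductor manufacturing.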

What is the expected salary for this Machine Learning Systems Engineer job?

Compensation will be discussed during the hiring process.

How can I apply for the Machine Learning Systems Engineer position at Voltai?

You can apply directly through the application link provided.

Machine Learning Systems Engineer at Voltai

About Voltai

Voltai is the leading AI company building agentic systems and frontier foundation models for semiconductor and electronics design. Backed by Sequoia Capital, we’re putting AI in the hands of hardware engineers at over 70% of the world’s largest semiconductor and electronics companies, giving them effortless control over their next-generation chip and board designs and powering the future of automotive, industrial automation, consumer electronics, IoT, and semiconductor manufacturing.

About the Team

Our founding team includes IOI/IPhO olympiad medalists, Stanford professors, and a former CTO of Synopsys, and our business leadership has scaled revenue at their previous companies to over $1.5bn. At Voltai, we are bringing together the world’s best talent at the intersection of software and hardware.

Key Responsibilities

Design and maintain high-performance ML pipelines for training, evaluation, and inference of LLMs and retrieval-augmented systems, with a focus on hardware efficiency and throughput
Optimize core transformer operations at the kernel level, designing and tuning custom kernels and low-level implementations for GPU-accelerated workloads
Implement and integrate low-precision computation techniques to reduce memory footprint and accelerate inference with minimal accuracy degradation
Build and maintain inference engines for on-premises deployments
Architect distributed training and inference systems
Collaborate closely with researchers and infra teams to bring cutting-edge model innovations into production
Interface directly with enterprise hardware environments, tuning performance based on real-world deployment constraints

Required Skill Sets

Languages: Expertise in C, C++, or Rust
Design and Optimize CUDA Kernels for LLMs: Develop and fine-tune custom CUDA kernels to accelerate core transformer operations
Implement Low-Precision Computation Techniques: Apply quantization methods like AWQ and GPTQ to reduce model size and inference latency. Ensure minimal accuracy loss while maximizing throughput on GPU architectures. Be familiar with concepts like GGUF and GGML
Develop and Maintain High-Performance Inference Systems: Build, improve, and maintain inference engines such as vLLM, SGLang, and TensorRT with a focus on low latency and high throughput
Architect Distributed Training and Inference Solutions: Design systems that support model parallelism (tensor, pipeline, expert, etc.) to enable efficient training and inference across multiple GPUs and nodes
Integrate Research into Production Systems: Translate cutting-edge research findings into robust, production-ready systems. Ensure that innovations in model architectures and optimization techniques are effectively deployed.
Monitor and Optimize System Performance: Implement monitoring tools to track system metrics, identify bottlenecks, and optimize performance

Bonus Points

Some background in hardware/electronics, gained through professional, academic, or personal projects
Contributions to open-source initiatives
Notable awards or publications in leading journals/conferences
Experience thriving in a fast-paced, hyper-growth startup environment

Our Benefits

Unlimited PTO: Recharge when you need it, no questions asked.
Comprehensive Health Coverage: Medical, dental, and vision insurance for you and your dependents.
Free Meals and Snacks: Daily lunches, dinners, and snacks in the office.
Professional Growth: We invest in your continuous learning and offer opportunities to expand your skills.
Visa Sponsorship: We welcome global talent and provide visa sponsorship to support qualified candidates.
