Can I apply directly for this job on this page?

Yes, you can begin your application on this page using a quick form. You'll then be redirected to the employer's career site to complete the full application process.

What is the role of an AI Data Infrastructure Engineer at Jobgether?

The AI Data Infrastructure Engineer position at Jobgether is a full-time or part-time opportunity in the Engineering field.

Where is this AI Data Infrastructure Engineer job located?

United States, Other / Non-US

What type of employment is offered for this AI Data Infrastructure Engineer role?

Full-time or part-time position

What is the expected salary for this AI Data Infrastructure Engineer job?

Compensation will be discussed during the hiring process.

AI Data Infrastructure Engineer

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Data Infrastructure Engineer in the United States.

This role focuses on designing, building, and operating large-scale data systems that power modern AI training and evaluation workflows. You will work on complex, high-throughput data infrastructures that support multimodal datasets and ensure high-quality data delivery for machine learning pipelines. The position combines deep data engineering expertise with a strong understanding of AI system requirements, including scalability, reliability, and performance optimization. You will contribute to building ingestion, transformation, validation, and dataset management systems that directly influence model quality and training efficiency. Working in a highly technical environment, you will collaborate with ML engineers and researchers to align data architecture with evolving AI needs. This is a hands-on, impactful role ideal for engineers passionate about large-scale systems and cutting-edge AI infrastructure.

Accountabilities:

Design, build, and maintain large-scale data pipelines supporting AI training, evaluation, and continuous model improvement workflows.
Develop ingestion and processing systems for multimodal datasets including text, image, audio, video, and structured data.
Implement data cleaning, deduplication, validation, and quality assurance processes at petabyte scale.
Build dataset versioning, lineage tracking, and reproducibility systems to ensure reliable AI training environments.
Optimize high-throughput data delivery systems to maximize compute and GPU utilization.
Collaborate with ML researchers and engineers to support dataset construction, evaluation pipelines, and AI model development needs.
Design scalable storage architectures and implement observability tools for data quality, performance, and pipeline health.
Ensure data governance, privacy compliance, and secure handling of sensitive datasets across systems.

Requirements:

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
6+ years of experience in data engineering, preferably supporting machine learning or AI systems.
Strong proficiency in Python and at least one systems or JVM-based language (e.g., Java, Scala, Go).
Hands-on experience with distributed data processing frameworks such as Spark, Beam, or Ray.
Experience operating large-scale or petabyte-level data infrastructure systems.
Strong understanding of distributed systems, data modeling, storage formats, and pipeline architecture.
Experience with dataset versioning, lineage tracking, and ML reproducibility workflows.
Strong software engineering practices including testing, CI/CD, and system design.
Excellent communication skills and ability to work cross-functionally with technical teams.
Experience with multimodal datasets, privacy-aware systems, or AI training pipelines is a plus.

Benefits:

Competitive salary aligned with experience and expertise (W2 employment).
Full-time, long-term remote position within the United States.
Comprehensive benefits package (health, dental, vision, and wellness support).
401(k) retirement savings plan and financial security programs.
Paid time off, holidays, and work-life balance support.
Opportunity to work on cutting-edge AI infrastructure and large-scale data systems.
Professional growth in advanced AI, distributed systems, and data engineering domains.

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Why Apply Through Jobgether?

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
