AI Data Infrastructure Engineer in United States at Jobgether

Job Function: Engineering
Jobgether
United States, United States

Job Description

AI Data Infrastructure Engineer

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Data Infrastructure Engineer in the United States.

This role focuses on designing, building, and operating large-scale data systems that power modern AI training and evaluation workflows. You will work on complex, high-throughput data infrastructure that supports multimodal datasets and ensures high-quality data delivery for machine learning pipelines. The position combines deep data engineering expertise with a strong understanding of AI system requirements, including scalability, reliability, and performance optimization. You will contribute to building ingestion, transformation, validation, and dataset management systems that directly influence model quality and training efficiency. Working in a highly technical environment, you will collaborate with ML engineers and researchers to align data architecture with evolving AI needs. This is a hands-on, impactful role ideal for engineers passionate about large-scale systems and cutting-edge AI infrastructure.

Accountabilities:
  • Design, build, and maintain large-scale data pipelines supporting AI training, evaluation, and continuous model improvement workflows.
  • Develop ingestion and processing systems for multimodal datasets including text, image, audio, video, and structured data.
  • Implement data cleaning, deduplication, validation, and quality assurance processes at petabyte scale.
  • Build dataset versioning, lineage tracking, and reproducibility systems to ensure reliable AI training environments.
  • Optimize high-throughput data delivery systems to maximize compute and GPU utilization.
  • Collaborate with ML researchers and engineers to support dataset construction, evaluation pipelines, and AI model development needs.
  • Design scalable storage architectures and implement observability tools for data quality, performance, and pipeline health.
  • Ensure data governance, privacy compliance, and secure handling of sensitive datasets across systems.

Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
  • 6+ years of experience in data engineering, preferably supporting machine learning or AI systems.
  • Strong proficiency in Python and at least one systems or JVM-based language (e.g., Java, Scala, Go).
  • Hands-on experience with distributed data processing frameworks such as Spark, Beam, or Ray.
  • Experience operating large-scale or petabyte-level data infrastructure systems.
  • Strong understanding of distributed systems, data modeling, storage formats, and pipeline architecture.
  • Experience with dataset versioning, lineage tracking, and ML reproducibility workflows.
  • Strong software engineering practices including testing, CI/CD, and system design.
  • Excellent communication skills and ability to work cross-functionally with technical teams.
  • Experience with multimodal datasets, privacy-aware systems, or AI training pipelines is a plus.

Benefits:
  • Competitive salary aligned with experience and expertise (W2 employment).
  • Full-time, long-term remote position within the United States.
  • Comprehensive benefits package (health, dental, vision, and wellness support).
  • 401(k) retirement savings plan and financial security programs.
  • Paid time off, holidays, and work-life balance support.
  • Opportunity to work on cutting-edge AI infrastructure and large-scale data systems.
  • Professional growth in advanced AI, distributed systems, and data engineering domains.

How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
