Can I apply directly for this job on this page?

Yes, you can begin your application on this page using a quick form. You'll then be redirected to the employer's career site to complete the full application process.

What is the role of a Sr. AI Data Engineer at Jobgether?

The Sr. AI Data Engineer position at Jobgether is a full-time or part-time opportunity in the Information Technology field.

Where is this Sr. AI Data Engineer job located?

United States, Other / Non-US

What type of employment is offered for this Sr. AI Data Engineer role?

Full-time or part-time position

What is the expected salary for this Sr. AI Data Engineer job?

Compensation will be discussed during the hiring process.

Sr. AI Data Engineer

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Sr. AI Data Engineer based in the United States.

This role operates at the intersection of data engineering and machine learning systems, building the foundational pipelines that power next-generation generative AI models. You will design and scale complex, AI-augmented data workflows that process billions of images and integrate model-driven enrichment at every stage. The position requires deep expertise in distributed systems, data pipelines, and ML inference orchestration in high-scale environments. You will work on systems that combine traditional SQL-based transformations with real-time model invocations, ensuring quality, reliability, and performance. A key focus of the role is enabling high-quality training datasets for image generation models, directly influencing model performance across multiple dimensions. You will collaborate closely with ML researchers and engineers in a fast-paced, research-driven environment. This is a highly technical and impactful role shaping the future of generative AI infrastructure.

Accountabilities:

Design and maintain large-scale, AI-augmented data pipelines that combine SQL transformations with ML model invocations for data cleaning, labeling, and enrichment.
Own end-to-end remote inference orchestration, including batching, asynchronous execution, retry logic, failure handling, and performance optimization.
Build and manage scalable embedding pipelines, including vector generation, storage, indexing, and similarity search infrastructure.
Curate and govern large-scale training datasets for image generation models using model-driven signals such as classifiers, aesthetic scoring, and content filters.
Develop automated annotation systems using LLMs and vision models, including evaluation frameworks to measure annotation quality and model performance.
Contribute to shared engineering frameworks and reusable tooling for AI-driven data workflows and pipeline orchestration.
Ensure pipeline reliability, compliance, and data quality across billions of records in distributed production systems.
Collaborate with ML researchers and engineers to improve dataset quality, evaluation metrics, and generative model performance.

Requirements:

Bachelor’s degree or higher in Computer Science, Data Engineering, Machine Learning, or a related STEM field.
5+ years of experience in data engineering, ML engineering, or hybrid roles involving data pipelines and model inference systems.
Strong expertise in SQL, data pipeline orchestration tools (e.g., Airflow, Dataswarm), and large-scale distributed systems.
Hands-on experience integrating ML models into production pipelines, including inference APIs, batching, and failure handling.
Experience with AI-assisted development tools (e.g., Copilot, Cursor, Codex) to accelerate engineering workflows.
Strong programming and debugging skills with a focus on scalable data systems and production reliability.
Experience with embeddings, vector databases, or similarity search systems (e.g., FAISS, Milvus) is highly desirable.
Familiarity with content understanding models such as classifiers, OCR, object detection, and NSFW filtering.
Exposure to LLM-based workflows for data annotation, cleaning, or evaluation is strongly preferred.
Knowledge of generative AI concepts such as diffusion models, CLIP scores, and image quality evaluation metrics is a plus.
Strong communication and collaboration skills in cross-functional technical environments.

Benefits:

Competitive annual compensation ranging from $105,000 to $110,000.
Opportunity to work on cutting-edge generative AI infrastructure at massive scale.
Exposure to advanced ML systems, embeddings, and large-scale model orchestration pipelines.
Collaborative environment working closely with research and engineering teams.
Onsite role with no remote option; in-person collaboration in a high-performance engineering environment.
Eligibility for standard contractor or temp employee benefits (medical, dental, vision, 401(k), holidays) depending on employment classification and hours.
Opportunity to contribute directly to the development of next-generation image generation models.

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
