Senior AI Data Engineer in United States at Jobgether

Job Function: Information Technology
Jobgether
United States, United States

Job Description

Senior AI Data Engineer

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior AI Data Engineer based in the United States.

This is a high-impact, AI-first engineering role focused on building and operating the data infrastructure that powers large-scale public data aggregation and insight generation. You will own end-to-end systems spanning data acquisition, transformation, serving, and reporting, with a strong emphasis on automation, resilience, and self-healing pipelines. Rather than manually maintaining brittle scrapers, you will design intelligent systems that leverage LLMs and agentic workflows to detect, diagnose, and repair failures autonomously. The role combines deep data engineering ownership with modern AI-native development practices, including LLM-driven parsing, anomaly detection, and natural language data interfaces. You will also contribute to building scalable reporting layers and production-grade data services that power real-time insights. Working closely with senior engineering and product leadership, you will help shape a system where AI and data infrastructure operate seamlessly together in production.

Accountabilities
  • Own the end-to-end design, development, and reliability of large-scale data acquisition systems, including web scraping infrastructure and automated data pipelines.
  • Build and maintain self-healing scraper systems that use LLMs and agentic workflows to detect, diagnose, and automatically recover from failures.
  • Ensure daily data ingestion pipelines remain stable through monitoring, alerting, retry logic, and robust failure handling mechanisms.
  • Develop AI-assisted parsing and entity extraction systems to handle unstructured or frequently changing web data.
  • Own the data serving layer and ETL/ELT pipelines powering analytics and BigQuery-based data warehouses.
  • Design and implement reporting systems, including data models, transformations, dashboards, and AI-driven narrative insights.
  • Apply rule-based and ML/LLM-based techniques for data quality monitoring, anomaly detection, and system reliability.
  • Build and maintain production-grade MCP servers and agentic workflows for internal and AI-driven data consumption.
  • Collaborate with engineering, product, and leadership teams to define system architecture and ensure long-term maintainability.
  • Document systems, best practices, and operational workflows to support scalable human-in-the-loop AI operations.

Requirements
  • 6+ years of experience in data engineering with ownership of production-grade, mission-critical systems.
  • Strong proficiency in Python with hands-on experience building and maintaining large-scale web scraping systems (Scrapy, Playwright, Selenium, BeautifulSoup).
  • Proven experience designing and deploying LLM-powered or agentic systems in production environments.
  • Strong understanding of prompt engineering, LLM evaluation, observability, and AI system performance trade-offs (latency, cost, quality, reliability).
  • Experience with data modeling and with building transformation pipelines (e.g., dbt) and BI/reporting layers.
  • Strong expertise in SQL and hands-on experience with the GCP ecosystem (BigQuery, Cloud Composer, Cloud Storage, Cloud Run/GKE).
  • Familiarity with Docker and production system design for scalable data infrastructure.
  • Strong reliability mindset with proven ownership of uptime, incident response, and production system stability.
  • Understanding of legal and ethical considerations in large-scale web scraping and data acquisition.
  • Experience working with AI-assisted development tools (e.g., Claude, Cursor) is highly desirable.
  • Bonus: experience with ML model deployment, distributed systems, Terraform, Pub/Sub, or large-scale data processing frameworks.

Benefits
  • Remote position based in the United States.
  • Competitive compensation package with base salary and bonus ($190K–$210K base range, depending on experience).
  • Full benefits package including medical, dental, vision, life, and disability insurance.
  • 401(k) retirement plan and paid time off.
  • Opportunity to work on cutting-edge AI-first data systems at scale.
  • High ownership role with direct impact on production infrastructure and product outcomes.
  • Exposure to modern LLM-driven engineering practices and agentic system design.
  • Fast-paced, high-growth environment combining startup innovation with enterprise stability.

How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
