AI Data Engineer in India at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Data Engineer in India.
This role is centered on building and scaling the data infrastructure that powers advanced analytics and AI-driven applications across a global financial environment. You will design robust, distributed data pipelines that enable real-time and batch processing of large, complex datasets. The position plays a critical role in ensuring data is reliable, accessible, and optimized for machine learning and business intelligence use cases. You will collaborate closely with ML engineers and data teams to bridge the gap between raw data and production-ready AI systems. The environment is highly technical and fast-paced, requiring strong ownership and deep engineering expertise. Your work will directly influence how data is structured, governed, and leveraged to support high-impact financial and analytical decision-making.
Responsibilities
- Design and implement scalable distributed data pipelines for both batch and real-time data processing.
- Build and maintain data ingestion frameworks using streaming technologies (Kafka, Kinesis) and batch processing tools (Spark, Airflow).
- Develop and optimize data models and warehouse schemas to support analytics and machine learning workloads.
- Create and manage feature stores to enable efficient ML training and inference pipelines.
- Integrate data from multiple internal and external sources into unified, reliable data architectures.
- Implement data quality validation frameworks, monitoring systems, and performance optimization strategies.
- Ensure real-time data availability for AI and analytics-driven applications.
- Collaborate with ML engineers to align data pipelines with model development and deployment needs.
- Enforce data governance, security, and compliance standards across all data systems.
Requirements
- Strong proficiency in SQL and Python for large-scale data processing and engineering tasks.
- Hands-on experience with big data frameworks such as Apache Spark and Hadoop.
- Solid experience with streaming systems like Kafka and Kinesis.
- Knowledge of ETL/ELT and orchestration tools, including dbt, AWS Glue, and Airflow.
- Strong understanding of cloud platforms, preferably AWS (including S3, EMR, Lambda, and Glue).
- Experience with data warehousing solutions such as Snowflake, Redshift, or BigQuery.
- Solid understanding of data modeling techniques (star schema, normalization, dimensional modeling).
- Experience building scalable distributed data systems and pipelines.
- Exposure to ML pipelines, feature engineering, and data preparation for AI models.
- Familiarity with data governance, lineage, and monitoring best practices.
Benefits
- Competitive compensation with performance-based incentives.
- Opportunity to work in a global, high-growth financial technology environment.
- Strong career progression opportunities within an international organization.
- Exposure to cutting-edge data engineering and AI-driven systems at scale.
- Collaborative and innovation-focused engineering culture.
- Work environment driven by excellence, ownership, and continuous learning.