Lead Data Engineer (GenAI / LLM Applications) in India at Jobgether
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Lead Data Engineer (GenAI / LLM Applications) in India.
This role sits at the intersection of advanced data engineering, software development, and applied artificial intelligence, with a strong focus on building scalable, production-ready data platforms and GenAI-powered solutions. You will design and deliver end-to-end data pipelines, intelligent workflows, and LLM-based applications that support clinical research and business intelligence use cases. The position involves working across the full SDLC, collaborating with cross-functional teams including data scientists, analysts, and product managers to translate complex requirements into robust technical solutions. You will also play a key role in integrating modern AI tools and frameworks such as RAG pipelines, agents, and AWS Bedrock-based services into enterprise-grade systems. Operating in a global, impact-driven environment, you will help ensure data reliability, scalability, and performance across critical platforms. This is a high-ownership role suited for someone passionate about AI innovation and large-scale data systems.
Responsibilities
- Design, build, and maintain scalable data architectures, pipelines, and software systems supporting analytical and operational workloads.
- Develop high-quality Python-based solutions using frameworks such as Flask, Django, and data libraries including pandas, NumPy, and Plotly.
- Build and integrate GenAI and LLM-powered applications, including RAG pipelines, intelligent agents, and automated workflows using tools like LangChain and AWS Bedrock.
- Design and optimize complex SQL queries and database structures across platforms such as Oracle, MS SQL Server, PostgreSQL, and Snowflake.
- Develop and manage ETL pipelines, data models, and transformation workflows using Snowflake and related technologies.
- Implement orchestration and scheduling frameworks using Apache Airflow or similar tools to ensure reliable data processing.
- Ensure data quality, governance, versioning, and compliance across structured and unstructured datasets.
- Deploy, monitor, and maintain cloud-based solutions on AWS, ensuring scalability, reliability, and performance.
- Collaborate with stakeholders to gather requirements, translate business needs into technical designs, and deliver well-documented solutions.
- Troubleshoot production issues and continuously improve system performance, stability, and efficiency.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field.
- 5+ years of experience in data engineering, software engineering, or data-focused development roles.
- Strong proficiency in Python and related frameworks (Flask/Django) along with data libraries such as pandas, NumPy, and Plotly.
- Advanced SQL expertise across multiple databases (Oracle, MS SQL Server, PostgreSQL, Snowflake), including performance tuning and complex query design.
- Hands-on experience with cloud platforms, preferably AWS (S3, EC2, Lambda, Secrets Manager, Bedrock).
- Proven experience building or integrating GenAI/LLM applications using tools such as LangChain and GitHub Copilot, or similar tools and frameworks.
- Strong understanding of data modeling concepts for structured and unstructured data.
- Experience with CI/CD pipelines and Git-based version control systems.
- Strong analytical thinking, problem-solving ability, and communication skills.
- Ability to work across the full SDLC, from requirements gathering to production support.
- Exposure to data visualization tools (Power BI, Plotly) or clinical data environments is a plus.
Benefits
- Competitive compensation aligned with local market standards.
- Comprehensive health and wellness benefits.
- Paid time off and company holidays.
- Professional development and continuous learning opportunities.
- Flexible work arrangement with remote or Bangalore-based options.
- Exposure to cutting-edge AI, LLM, and data engineering technologies in a global environment.
- Opportunity to contribute to impactful work in clinical research and data-driven healthcare innovation.