Data Engineer in Washington, District of Columbia at Educology Solutions
Job Function: Information Technology
Educology Solutions
Washington, District of Columbia, 20001, United States
Job Description
Data Engineer – AI/BI
ONLY ACCEPTING CANDIDATES FROM OVERSEAS
Overview:
We are seeking a Data Intelligence Engineer to design, build, and operate Databricks-based Data & AI capabilities with a strong foundation in the Medallion Architecture (raw/bronze, curated/silver, and mart/gold layers). This platform will orchestrate complex data workflows and scalable ELT pipelines to integrate data from enterprise systems such as PeopleSoft, D2L, and Salesforce, delivering high-quality, governed data for machine learning, AI/BI, and analytics at scale.
You will play a critical role in engineering the infrastructure and workflows that enable seamless data flow across the enterprise, power Databricks AI/BI dashboards and Genie experiences, and serve as the backbone for strategic decision-making, predictive modeling, and innovation.
________________________________________
Responsibilities:
1. Data & AI Platform Engineering (Databricks-Centric):
• Build and scale Databricks AI/BI solutions end to end, combining governed semantic models, SQL, and performance-optimized query layers.
• Develop and operationalize Databricks Genie experiences by curating datasets, metadata, and prompts for natural language, self-service analytics.
• Design and deliver Databricks dashboards and visual products that translate data into clear, actionable insights.
• Design, implement, and optimize end-to-end data pipelines on Databricks, following the Medallion Architecture principles.
• Build robust and scalable ETL/ELT pipelines using Apache Spark and Delta Lake to transform raw (bronze) data into trusted curated (silver) and analytics-ready (gold) data layers.
• Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation.
• Apply schema evolution and data versioning to support agile data development.
2. Platform Integration & Data Ingestion:
• Connect and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks.
• Implement connectors and ingestion frameworks that accommodate structured, semi-structured, and unstructured data.
• Design standardized data ingestion processes with automated error handling, retries, and alerting.
3. Data Quality, Monitoring, and Governance:
• Develop data quality checks, validation rules, and anomaly detection mechanisms to ensure data integrity across all layers.
• Integrate monitoring and observability tools (e.g., Databricks metrics, Grafana) to track ETL performance, latency, and failures.
• Implement Unity Catalog or equivalent tools for centralized metadata management, data lineage, and governance policy enforcement.
4. Security, Privacy, and Compliance:
• Enforce data security best practices including row-level security, encryption at rest/in transit, and fine-grained access control via Unity Catalog.
• Design and implement data masking, tokenization, and anonymization for compliance with privacy regulations (e.g., GDPR, FERPA).
• Work with security teams to audit and certify compliance controls.
5. AI/ML-Ready Data Foundation:
• Enable data scientists by delivering high-quality, feature-rich data sets for model training and inference.
• Support AIOps/MLOps lifecycle workflows using MLflow for experiment tracking, model registry, and deployment within Databricks.
• Collaborate with AI/ML teams to create reusable feature stores and training pipelines.
6. Cloud Data Architecture and Storage:
• Architect and manage data lakes on Azure Data Lake Storage (ADLS) or Amazon S3, and design ingestion pipelines to feed the bronze layer.
• Build data marts and warehousing solutions using platforms like Databricks.
• Optimize data storage and access patterns for performance and cost-efficiency.
7. Documentation & Enablement:
• Maintain technical documentation, architecture diagrams, data dictionaries, and runbooks for all pipelines and components.
• Provide training and enablement sessions to internal stakeholders on the Databricks platform, Medallion Architecture, and data governance practices.
• Conduct code reviews and promote reusable patterns and frameworks across teams.
8. Reporting and Accountability:
• Submit a weekly schedule of hours worked and progress reports outlining completed tasks, upcoming plans, and blockers.
• Track deliverables against roadmap milestones and communicate risks or dependencies.
________________________________________
Required Qualifications:
• Hands-on experience with Databricks (Delta Lake, Apache Spark) and building AI/BI solutions, including dashboards, semantic models, and Genie-based natural language analytics.
• Deep understanding of ELT pipeline development, orchestration, and monitoring in cloud-native environments.
• Experience implementing Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments.
• Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic.
• Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms.
• Familiarity with data governance, lineage tracking, and metadata management tools.
________________________________________
Preferred Qualifications:
• Experience with Databricks Unity Catalog for metadata management and access control.
• Experience deploying ML models at scale using MLflow or similar MLOps tools.
• Familiarity with cloud platforms like Azure or AWS, including storage, security, and networking aspects.
• Knowledge of data warehouse design and star/snowflake schema modeling.
• Prior experience with UMGC or USM preferred.