Data Engineer at Sonatype – Columbia, Missouri
Sonatype
Columbia, Missouri, 65201, United States
Job Function: Information Technology
About This Position
Data Engineer
Sonatype is the software supply chain security company. We provide the world's best end-to-end software supply chain security solution, combining the only proactive protection against malicious open source, the only enterprise-grade SBOM management, and the leading open source dependency management platform. This empowers enterprises to create and maintain secure, quality, and innovative software at scale.
As founders of Nexus Repository and stewards of Maven Central, the world's largest repository of Java open-source software, we are software pioneers and our open source expertise is unmatched. We empower innovation with an unparalleled commitment to build faster, safer software and harness AI and data intelligence to mitigate risk, maximize efficiencies, and drive powerful software development.
More than 2,000 organizations, including 70% of the Fortune 100 and 15 million software developers, rely on Sonatype to optimize their software supply chains.
We're looking for a Data Engineer to join our growing Data Platform team. You'll play a key role in designing and scaling the infrastructure and pipelines that power product features, analytics, and machine learning across Sonatype.
You'll work closely with stakeholders across product, engineering, and business teams to ensure data is reliable, accessible, and actionable. This role is ideal for someone who thrives on solving complex data challenges at scale and enjoys building high-quality, maintainable systems.
What you'll do:
- Design, build, and maintain scalable data pipelines and ETL processes
- Architect and optimize data models and storage solutions for analytics and operational use
- Collaborate with other data engineers to deliver trusted, high-quality datasets
- Own and evolve parts of our data platform, specifically the streaming pipeline and Data Lake
- Implement observability, alerting, and data quality monitoring for critical pipelines
- Drive best practices in data engineering, including documentation, testing, and CI/CD
- Contribute to the design and evolution of our next-generation data lakehouse architecture
What you'll need:
- 4+ years of experience in a Data Engineer or backend engineering role
- Strong programming skills in Java and Python
- Proficient in writing complex SQL and optimizing queries for performance
- Proficiency in English and strong communication skills, including the ability to talk with other engineers and analysts and to demo or explain new features to non-engineers
- Some experience using AWS cloud-native tools, like S3, SNS, SQS, EC2, or EMR
- Familiarity with streaming data pipelines or real-time processing
- Hands-on experience with distributed data tools like Hadoop, HDFS, and Spark
- Know your way around Docker containers and the Linux command line
- Exposure to DynamoDB or similar NoSQL data stores
- Experience using Databricks to write queries and notebooks
- Experience supporting data products in production
- An understanding of data privacy, security, and compliance best practices
What we offer:
- Data with purpose: Work on problems that directly impact how the world builds secure software
- Modern tooling: Leverage the best of open-source and cloud-native technologies, including very modern versions of Java
- Collaborative culture: Join a passionate team that values learning, autonomy, and impact