EOP - Site Reliability Engineer - TS/SCI Required at cFocus Software Incorporated – Washington, District of Columbia
cFocus Software Incorporated
Washington, District of Columbia, 20001, United States
Job Function: Engineering
Employment Type: Full-Time
About This Position
cFocus Software seeks a Site Reliability Engineer to join our program supporting the United States Secret Service (USSS). This position is remote. This position requires a TS/SCI clearance.
Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or related technical field (or equivalent experience).
- Minimum of 2 years of experience in systems engineering, DevOps, or Site Reliability Engineering roles.
- Strong proficiency with Linux/Unix operating systems.
- Experience with scripting and automation using Python, Bash, or similar languages.
- Experience with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or equivalent.
- Experience supporting CI/CD tools such as GitLab, Jenkins, or ArgoCD.
- Experience with containerization and orchestration platforms including Docker and Kubernetes.
- Understanding of SRE principles including SLIs, SLOs, and error budgets.
- Strong troubleshooting, problem-solving, and documentation skills.
Responsibilities:
- Monitor system health, availability, and performance using centralized monitoring and logging tools.
- Respond to, troubleshoot, and resolve incidents in production environments and provide root cause analysis.
- Conduct after-action reporting and post-incident reviews to improve system resilience.
- Automate repetitive operational tasks including deployments, monitoring, and incident response.
- Administer user accounts, access controls, and authentication mechanisms.
- Maintain and configure workflow templates, user fields, and application configurations.
- Maintain test environments that mirror production and support pre-deployment testing.
- Design and maintain backup, high availability (HA), and disaster recovery (DR) solutions.
- Develop and maintain incident response and disaster recovery plans for supported applications.
- Configure and support integrations with complementary enterprise systems.
- Architect, build, and maintain on-premises and cloud infrastructure supporting applications.
- Administer production, staging, and development environments.
- Manage system logs and monitor for security and operational events.
- Maintain and improve CI/CD pipelines and DevSecOps processes.
- Apply configuration management disciplines including patching, hardening, and documentation.
- Create and maintain dashboards, SLIs, SLOs, and service health metrics.
- Support operational readiness boards and weekly service reviews.
- Provide on-call support for outages, upgrades, and emergency maintenance as required.
- Support surge activities, including Presidential Transition-related data analysis if required.