Senior Site Reliability Engineer (SRE) / DevOps Engineer
CyLogic
📍 Ashburn, Virginia, US 💼 Full-time 🕐 14 days ago
Description
CyLogic is seeking a highly experienced Senior SRE / DevOps Engineer to build, operate, and improve scalable, reliable, and observable platforms supporting mission-critical applications. This role combines deep expertise in automation, CI/CD, observability, and modern infrastructure technologies with a strong focus on system reliability, performance, and operational excellence.
You will play a key role in advancing platform capabilities, improving developer productivity, and ensuring high availability through established DevOps and SRE practices.
Responsibilities/Duties:
Platform Engineering & Automation
• Build, implement, and maintain scalable infrastructure using Infrastructure as Code (IaC) and automation frameworks
• Develop and optimize CI/CD pipelines (e.g., Jenkins and modern pipeline tooling)
• Automate provisioning, configuration management, and deployment workflows
• Drive adoption of GitOps and immutable infrastructure practices
Observability & Reliability Engineering
• Build and operate observability platforms using OpenSearch, Elasticsearch, and related ecosystems
• Implement centralized logging, metrics, and distributed tracing solutions
• Define and monitor SLIs, SLOs, and SLAs
• Lead root cause analysis, incident response, and postmortem processes
• Perform proactive performance tuning and capacity planning
Containerization & Orchestration
• Build and operate containerized environments using Docker and Kubernetes
• Maintain orchestration platforms supporting microservices-based applications
• Improve cluster performance, scaling, and resiliency
• Apply container security and operational best practices
HashiCorp Stack & Infrastructure Management
• Use Terraform for infrastructure provisioning
• Use Vault for secrets management
• Use Consul (or similar) for service discovery
• Maintain consistency across environments (dev, test, prod)
Monitoring, Logging & APM
• Implement observability solutions using Grafana and telemetry pipelines
• Develop dashboards and alerting strategies to improve system visibility
• Integrate Application Performance Monitoring (APM) tools for application insights
• Establish logging standards and maintain log pipeline reliability
• Other duties as assigned
Experience and Core Competencies:
• 7+ years of experience in SRE, DevOps, or Platform Engineering roles
• Strong hands-on experience with CI/CD pipelines and tools such as Jenkins
• Deep expertise in Elasticsearch / OpenSearch
• Experience with Docker and Kubernetes
• Knowledge of HashiCorp tools (Terraform, Vault, Consul)
• Experience with Grafana and telemetry systems
• Proficiency in Python, Bash, or Go
• Understanding of AWS, Azure, or GCP
• Experience with Linux systems administration
• Experience implementing SRE practices
• Familiarity with service meshes
• Experience with distributed tracing tools
• Knowledge of security best practices
• Exposure to multi-cloud or hybrid environments
• Strong problem-solving and troubleshooting skills
• Ability to operate in high-scale environments
• Strong collaboration skills
• Focus on automation and reliability
Physical Requirements:
• Lifting up to 50 pounds
• Frequent walking, standing, bending, sitting