Sr. Site Reliability Engineer (SRE)

Scientific Games
Full-time
On-site
Alpharetta, Georgia, United States
Software Engineering & Technology

Scientific Games:

Scientific Games is the global leader in lottery games, sports betting and technology, and the partner of choice for government lotteries. From cutting-edge backend systems to exciting entertainment experiences and trailblazing retail and digital solutions, we elevate play every day. We push game designs to the next level and are pioneers in data analytics and iLottery. Built on a foundation of trusted partnerships, Scientific Games combines relentless innovation, legendary performance, and unwavering security to responsibly propel the global lottery industry ever forward.

Position Summary

We are looking for a skilled Site Reliability Engineer (SRE) to enhance the stability, performance, and reliability of our production systems. The SRE will work closely with development, DevOps, and security teams, ensuring production readiness, managing on-call responsibilities, and improving observability across applications and infrastructure.

  • Monitoring & Observability
    • Maintain and enhance observability using New Relic, Graylog, or other monitoring tools.
    • Establish actionable alerting and dashboards for service health and performance metrics.
  • Reliability Engineering
    • Implement and maintain reliable systems, focusing on capacity planning, performance optimization, and fault tolerance to ensure high availability and scalability.
    • Collaborate with teams to define and implement Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs), and monitor performance against them.
  • Automation & Infrastructure Operations
    • Automate operational processes to reduce manual intervention.
    • Manage Kubernetes workloads on AWS EKS, ensuring secure and stable deployments.
    • Work with HashiCorp Vault for secrets management and security compliance.
  • Incident & Problem Management
    • Participate in the on-call rotation to handle production incidents and ensure rapid resolution.
    • Troubleshoot production issues, identify root causes, and implement permanent fixes.
    • Lead post-incident reviews, create action items, and follow through on remediation.
  • Collaboration
    • Work closely with DevOps to improve CI/CD pipelines for production readiness.
    • Partner with development teams to embed resilience and observability into applications.
  • Documentation & Knowledge Sharing
    • Document operational runbooks, escalation procedures, and production playbooks.

Qualifications

Required Skills

  • Bachelor’s degree in computer science or a related field, or equivalent work experience.
  • Experience: 6+ years as an SRE, DevOps Engineer, or in a similar role
  • Cloud: Strong experience with AWS (EKS, EC2, S3, Route53, IAM)
  • Kubernetes: 6+ years managing production Kubernetes workloads
  • Monitoring & Observability: Hands-on with New Relic, Graylog, or similar
  • Secrets Management: Experience with HashiCorp Vault or equivalent
  • Automation & CI/CD: Proficiency with GitHub Actions, GitLab CI/CD, Helm, and ArgoCD
  • IaC: Hands-on experience with Terraform
  • Scripting: Proficiency in Python, Bash, or equivalent scripting languages
  • Incident Management: Strong debugging, troubleshooting, and root cause analysis skills
  • On-Call Readiness: Willingness to participate in a 24x7 on-call rotation

Desired Skills

  • AWS certification
  • Familiarity with the .NET application stack
  • Multi-cloud exposure
  • Experience managing Kubernetes clusters with Rancher in on-prem environments
  • Familiarity with Packer for building golden AMIs

Physical Requirements

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions. While performing the duties of this job, the employee is regularly required to sit, stand, walk, bend, use hands, operate a computer, and have specific vision abilities to include close and distance vision, and ability to adjust focus working with computer and business equipment.


Work Conditions

Scientific Games Corporation and its affiliates (collectively, “SG”) are engaged in highly regulated gaming and lottery businesses. As a result, certain SG employees may, among other things, be required to obtain a gaming or other license(s), undergo background investigations or security checks, or meet certain standards dictated by law, regulation, or contracts. To ensure SG complies with its regulatory and contractual commitments, as a condition of hiring and continuing to employ its employees, SG requires all of its employees to meet those requirements that are necessary to fulfill their individual roles. As a prerequisite to employment with SG (to the extent permitted by law), you will be asked to consent to SG conducting a due diligence/background investigation on you.

This job description should not be interpreted as all-inclusive; it is intended to identify the major responsibilities and requirements of the job. The employee in this position may be asked to perform job-related tasks and responsibilities other than those stated above.

SG is an Equal Opportunity Employer and does not discriminate against applicants due to race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as a veteran, disability, or any other federal, state, or local protected class. If you’d like more information about your equal employment opportunity rights as an applicant under the law, please refer to the EEOC poster.