At DraftKings, AI is becoming an integral part of both our present and future, powering how work gets done today, guiding smarter decisions, and sparking bold ideas. It’s transforming how we enhance customer experiences, streamline operations, and unlock new possibilities. Our teams are energized by innovation and readily embrace emerging technology. We’re not waiting for the future to arrive. We’re shaping it, one bold step at a time. If you see AI as a driver of progress, come build the future with us.
As a Senior Site Reliability Engineer, you'll build and scale the critical infrastructure behind every product. In this role, you'll take on complex challenges across global data centers, multiple cloud platforms, and on-premises systems, designing automation-first solutions that elevate performance and eliminate operational friction. You'll be trusted to drive stability at scale, influence architectural decisions, and build tools that empower our teams to move fast and deliver reliably. This is where your impact won't just be felt, it'll be foundational.
Drive stability and scalability across our global compute platform, which spans numerous data centers, multiple public clouds, and on-premises environments and serves as the foundation for all our products.
Implement automation for self-healing, fault-tolerant infrastructure using declarative configurations and event-driven workflows, and develop internal tools to eliminate repetitive tasks.
Establish critical performance and reliability metrics for infrastructure platform components.
Ensure the highest level of uptime and Quality of Service (QoS) for internal customers through operational excellence.
Support technical growth by sharing knowledge, participating in design discussions, and contributing to a collaborative team culture.
Participate in an on-call rotation, incident reviews, root cause identification, and Root Cause Analysis (RCA) reporting.
Bachelor's degree in Computer Science or relevant education, experience, and training.
At least 4 years of experience managing distributed cloud environments such as GCP, AWS, vSphere, and Nutanix, along with platform automation at scale.
Deep expertise in container orchestration with Kubernetes, including the ability to design, scale, and troubleshoot complex workloads.
Strong experience developing automation and infrastructure tooling in languages such as Go and Python.
Kubernetes administration experience, including installation, configuration, and troubleshooting.
Working knowledge of networking and Linux-based systems, including container runtimes such as Docker and containerd, packet-level debugging, and kernel troubleshooting.
Experience with Infrastructure as Code (IaC) and configuration management tools, including Terraform, Chef, and Pulumi, to ensure scalable and repeatable infrastructure provisioning.
Creative problem-solving skills and excellent communication.
We’re a publicly traded (NASDAQ: DKNG) technology company headquartered in Boston. Because we are a regulated gaming company, you may be required to obtain a gaming license issued by the appropriate state agency as a condition of employment. Don’t worry, we’ll guide you through the process if this is relevant to your role.