Role Snapshot
Senior Site Reliability Engineer responsible for building and maintaining system observability, automation, and monitoring solutions to ensure high service availability across PlayOn! Sports' fan engagement platform. This role drives operational excellence through proactive incident prevention, capacity planning, and cross-functional collaboration with engineering teams.
Key Responsibilities: Implement and improve metrics, alerting, and dashboards for system observability; develop automation and tooling solutions; partner with application and quality teams on reliability best practices; participate in on-call rotations for incident response and critical service support.
Skills & Tools: Strong proficiency in Python for automation and tooling, with secondary experience in Java, C++, or Go; deep knowledge of Linux systems, cloud infrastructure (AWS/GCP/Azure), Docker, Kubernetes, and Terraform; hands-on experience with CI/CD pipelines, observability tools (Prometheus, Grafana, Datadog, ELK), and log/metric analysis.
Qualifications: Proven experience implementing SLA/SLO frameworks, facilitating Critical User Journeys, and collaborating across cross-functional teams; demonstrated ability to approach reliability as a shared responsibility with strong problem-solving and communication skills in high-impact situations.
Location: Remote
Compensation: $160K–$220K/yr (estimated)
Job Description
In this role, you can expect to:
Contribute to system observability, i.e., implementing and improving metrics, alerting, and dashboards for better insight and faster recovery.
Develop automation, tooling, and monitoring solutions to support high service availability.
Partner with application and quality engineering teams to implement best practices in reliability, release automation, and testing.
Drive operational excellence through proactive incident prevention, blameless postmortems, and capacity planning.
Participate in on-call rotations to support critical services and ensure rapid response to incidents.
To thrive in this role, you have:
Solid experience in Python, especially for automation, tooling, and data-driven operational tasks.
Proficiency in at least one of Java, C++, or Go.
Strong understanding of Linux systems, cloud infrastructure (AWS, GCP, or Azure), and modern deployment practices (Docker, Kubernetes, Terraform).
Experience with CI/CD pipelines, version control, and automated testing frameworks.
Experience with observability tools (e.g., Prometheus, Grafana, ELK, Datadog) and log/metric analysis for diagnosing issues.
Proven experience facilitating and documenting Critical User Journeys and translating them into actionable SLAs/SLOs for automation.
Demonstrated ability to collaborate with cross-functional teams and communicate clearly in high-impact situations.
A problem-solving mindset that treats reliability as a shared responsibility across engineering.
Nice to Have
Familiarity with AI-augmented development tools (Claude, Codex) as part of a modern engineering workflow.
Experience writing or maintaining end-to-end or integration tests for distributed systems.
Background in performance testing, capacity planning, or chaos engineering.
Contributions to internal developer tooling or reliability-focused frameworks.
Exposure to security, compliance, or change management processes in production environments.
Relevant certifications.
Company Overview
PlayOn is a dynamic growth-stage company dedicated to championing the spirit of play in the high school space. Backed by KKR, our family of brands—including GoFan, NFHS Network, and MaxPreps—empowers schools with innovative solutions and exceptional service. Our fan engagement platform is the only one that offers event ticketing, streaming, fundraising, concessions, merchandise sales, and website management in one place. We save administrators time so they can focus on what truly matters: supporting the students, staff, and fans who bring their programs to life.
Trusted by thousands of schools across the country, we're here to help create more instant replays, hold-your-breath moments, last-minute comebacks, and games you want to watch over and over again.
When being there means everything, we make sure you never miss a moment.
Why you’ll love working at PlayOn
Product, potential, and people. We’re a leader in the high school event space, constantly evolving our product to meet the needs of administrators. We focus on solving real challenges, learning quickly, and creating impactful solutions.
This is a growth-stage company, meaning your contributions have real impact. You’ll have opportunities to grow your skills, tackle meaningful problems, and make a difference in the lives of schools and the students and fans they serve.
Our culture is built on accountability, collaboration, growth, and fairness. We don’t just show up—we show up for each other. Everyone wears the same jersey, and we play hard, make the extra pass, and cheer one another on. Losses teach us, challenges motivate us, and persistence drives us forward. We value integrity over shortcuts, choosing to do what’s right even when it’s hard. Together, we strive to be better every day—because we know that’s how we win as a team.
The Benefits We Offer
Multiple medical insurance plans to choose from
Dental, vision, life, and disability insurance
Employee Emergency Fund
Company equity (stock options)
Open PTO policy
401(k) plan with company match
Hybrid/flexible work environment
Note: Must be a full-time employee to participate in the company’s employee health benefit plan. Part-time employees and interns are not eligible to participate.