
Site Reliability Engineer

Epidemic Sound AB

Summary

Epidemic Sound is seeking a Site Reliability Engineer to join their central platform team in Stockholm. This role focuses on building and maintaining a reliable, scalable, and secure platform for engineering teams. Responsibilities include managing GKE clusters, CI/CD processes, networking, and observability, while promoting self-service capabilities for product teams. The company values innovation and collaboration, offering a dynamic work environment across multiple global locations.


The job in brief

Working hours

Full-time

Benefits

Opportunity to work in a global and innovative environment.
Collaborative culture that values diversity and inclusion.
Chance to influence the sound of streaming and content globally.

Stockholm

Apply by: 2650-08-06

Published: 2026-06-27

Description

Join our global force of 400+ innovators, blending the latest in tech with the greatest in soundtracking, from our Stockholm HQ to offices in London, New York, Los Angeles, Berlin, Paris, Oslo, and Seoul. We're an industry leader with a startup mentality. We take what we do seriously, but we don't take ourselves too seriously. We create and collaborate to transform the sound of streaming, content, and culture. Come join us, and let the world feel your work.

As a Site Reliability Engineer at Epidemic Sound, you will be a core member of the central platform team that builds and operates the platform the rest of Engineering ships on - keeping it reliable, scalable, and secure is what this team exists to do. This is infrastructure-flavoured software engineering: you will write the code that defines and automates the platform, and treat it as a product whose customers are the rest of Engineering. The goal is to make the reliable way the easy way - self-service paths that let product teams build and ship safely without waiting for anyone.

Your key responsibilities include

Build and operate the platform our services run on - GKE clusters, the controllers that extend them, and the Terraform that defines our cloud.
Own the path from commit to production - CI/CD, GitOps, and the progressive-delivery patterns that turn a merge into a safe release.
Strengthen the networking and routing layer - traffic management on top of the VPC, firewalls, and network policies that keep it safe and predictable.
Govern access and guardrails - IAM across every layer, policy-as-code, and break-glass paths - so teams move fast within safe defaults rather than waiting on tickets.
Grow reliability and observability - alert hygiene, runbooks, SLOs, and the metrics and tracing that show how the platform behaves in production.
Enable product teams and raise the bar - make production readiness the default, and drive healthy adoption of the standards and docs you would rather share than gatekeep.

Requirements

Kubernetes fundamentals: a solid grasp of controllers, core components, and CNI and networking - depth in the domain matters more than any single tool (GKE a plus).
Infrastructure as code and delivery: Terraform, Helm or Kustomize, CI/CD and GitOps (ArgoCD), and the traffic-management and progressive-delivery mechanisms that move releases out safely.
Networking and access: routing fundamentals, the VPC, firewall, and network-policy primitives beneath it, and IAM and access management at different levels.
Operational depth: monitoring fundamentals (a clear view of when to reach for metrics versus tracing, and experience with an open-source observability stack), strong troubleshooting across distributed systems, and solid Unix/Linux.
Agentic development mindset: you use AI agents actively in your own work, knowing where they add leverage and where human judgement is non-negotiable.
Collaboration and judgement: you do your best work on large, cross-cutting projects, communicate openly, and stay opinionated but open to discussion - reaching for the right tool over your own creation.

It would also be music to our ears if you have

Familiarity with GCP and an observability stack with Prometheus, Thanos, and Grafana.
Experience running containerised platforms at scale.
Service mesh experience with Cilium eBPF, Linkerd, or Istio.
Familiarity with platform building blocks like cert-manager, external-secrets, or external-dns.

Equal opportunity employer

We believe that bringing people together from different backgrounds, experiences and perspectives makes for a healthy workplace, a more successful business and a better world. We value diversity and encourage everyone to come and soundtrack the world with us.

Application

Ready to make the world feel your work? Please apply, in English.


This position was advertised on the Compilation Source (Sweden) service on 2026-06-27 and was published by Compilation Source (Sweden).
