Site Reliability Engineer - AB Electrolux

Site Reliability Engineers (SREs) are people who use engineering-based approaches to solve operations problems. SRE owns and develops the infrastructure needed for the Electrolux Connectivity Platform and supporting services. SRE is also responsible for making sure the services, both internal and external systems, have the characteristics and qualities needed for their intended use.

You will work to understand the operational requirements and develop an infrastructure architecture and tools that meet these requirements. You will monitor the performance of the system and refine the management of the infrastructure from both a performance and a cost perspective, so that it is optimal and balanced at all times.

You will also work closely with our DevOps teams to deliver efficiently by empowering them with excellent tools that you develop. These might include, for example, monitoring tools or infrastructure pipeline components.

Responsibilities

- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement
- Support services prior to production through activities like system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
- Contribute improvements to the availability, scalability, latency, and efficiency of the services once they are live
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
- Practice sustainable incident response and blameless postmortems
- Contribute to our deployment and automation tools
- Promote Site Reliability Engineering best practices
- Be part of our on-call rotation with other engineers around the world

Minimum Qualifications

- BS or MS in Computer Science or a related technical field
- 3+ years of experience working with infrastructure engineering in a large-scale production service environment
- 3+ years of experience in analyzing and troubleshooting distributed systems using logging, distributed tracing, stack traces and metrics
- Automation skills and a desire to automate everything
- Comfortable with at least one of the following languages: Java, Python, Go. Can learn a new language quickly
- Systematic problem-solving approach with a strong sense of ownership
- Good communication skills

Preferred Qualifications

- You are a Software Engineer
- A good understanding of large-scale distributed systems
- Experience working with Public Cloud (AWS, Azure or GCP)
- Experience working with container orchestration, e.g. Kubernetes
- Experience in monitoring and metrics systems, e.g. Prometheus, Grafana
- A good knowledge of Site Reliability Engineering principles
- Experience with on-call rotation, incident response and blameless postmortems
- CI/CD automation experience
- A great team player
- Fluency in English

Open to all

We focus on your skills, not your other circumstances. We are open to adapting the role or the workplace to your needs.