Site Reliability Engineer (DevOps)
2024-07-23
Company Description
Vattenfall is a European energy company with approximately 21 000 employees. For more than 100 years we have electrified industries, supplied energy to people's homes and modernized our way of living through innovation and cooperation. We now want to enable the fossil freedom that drives society forward. To be able to reach this ambitious goal we are looking for talented individuals who, in addition to their passion for their own role, also have strong team spirit and want to contribute to supporting a meaningful corporate mission.
Job Description
Do you want to commit to a critical challenge and help to shape our future?
We keep important, revenue-critical systems up and running despite hurricanes, bandwidth outages, and configuration errors.
We are looking for a team member who is driven by technical challenges and the urge to always find better and more efficient ways of delivering services to the customer.
What will you do?
As a Site Reliability Engineer (SRE) you will combine the skill sets of development and operations teams by applying a software engineering approach to IT operations.
To succeed in this role it is important that you are self-driven, precise, analytical and critical, and that you know how to prioritize. You are not only an expert in your field but also have social competence and understand the needs of the business. You always challenge yourself and others to improve. You are open to suggestions from others and ready to offer your own.
You are a team player with a hands-on mindset and a strong customer and problem-solving orientation. You deliver results quickly, and you have demonstrated good communication skills, especially in an international organization. You need to be able to work independently in an international team that meets more often virtually than in person.
Your tasks and responsibilities:
Maintaining Applications to Help Operations and Support Teams
As an SRE you are in charge of proactively implementing and maintaining infrastructure and business applications. This can range from provisioning servers, updating systems, deploying new software and performing pre-emptive maintenance to monitoring, alerting and making code changes in production. A site reliability engineer can also be tasked with building a homegrown tool from scratch to address weaknesses in software delivery or incident management.
You must always be looking to improve quality and efficiency and cut costs.
Fixing Support Escalation Issues
As with the point above, a site reliability engineer can be expected to spend time fixing support escalation cases.
Optimizing On-Call Rotations and Processes
More often than not, site reliability engineers need to take on on-call responsibilities and improve system reliability by optimizing on-call processes. You will help add automation and context to alerts, leading to a better real-time collaborative response from on-call responders. Additionally, you will update runbooks, tools and documentation to help prepare your on-call team members for future incidents.
Documenting "Tribal" Knowledge
As you gain exposure to systems in both staging and production, and to all technical teams, you will take part in software development, support, IT operations and on-call duties, meaning you will build up a great amount of historical knowledge over time. Instead of keeping this knowledge in the mind of one team or one person, as an SRE you are expected to document much of what you know. Constant upkeep of documentation and runbooks ensures teams get the information they need right when they need it.
Conducting Post-Incident Reviews
Without thorough post-incident reviews, you have no way to identify what is working and what is not. As such, you will participate in post-incident reviews, document your findings and take action on what you learn.
Location
Katowice, Gliwice, Amsterdam or Solna
Qualifications
Who are you?
As our Site Reliability Engineer you:
Are an expert in your field, have strong technical and analytical skills, and can solve impossible problems.
Have the ability to translate complicated technical matters into simple understandable language.
Have experience in building and operating server virtualization environments.
Have thorough knowledge of operating systems (Linux, Windows Server 2019 and later).
Work with Agile methodology.
Have experience in troubleshooting and root cause analysis.
Are able to deal with tight deadlines when they arise.
Have automation experience with at least one configuration/deployment management system (Ansible, Jenkins pipelines, Azure DevOps...).
Have experience with at least one scripting language (PowerShell, Shell, Python...).
Have experience working with Git.
Have experience developing CI/CD pipelines.
Next to this you bring:
Experience in a large international environment is desirable.
Experience with containers is a plus.
Additional Information
Our offer
Good remuneration, a challenging and international work environment, and the possibility to work with some of the best in the field. You will be working in interdisciplinary teams and you can always count on support from committed colleagues. We offer attractive employment conditions and opportunities for personal and professional development.
More Information
We welcome your application in English. We kindly request that you do not send applications by any means other than via our website as we cannot guarantee that we will be able to process applications that are not made via our website.
For more information about the recruitment process you are welcome to contact our recruiter Ewa Krajewska via ewa.krajewska@vattenfall.com
