Unfortunately, this position is no longer available

Site Reliability Manager

3-4 years | Full-time and more | Posted 14 minutes ago
Job Description

NeoGames, a world-leading provider of solutions and services for iLottery and iGaming operators, is looking for an SRE Team Leader to join our dynamic Technical Operations Group.
The SRE team leader will be responsible for defining and developing operating procedures for a 24x7 NOC and building a team of analysts that can support the operational duties of the NOC/SOC/SRE.
The role also includes supervising and coordinating the team's activities during shifts to deliver outstanding quality and service to our customers in compliance with SLA requirements.

Responsibilities:
• Day-to-day management of the team
• Monitor and support the company's production environments (i.e., system uptime, performance, customer accessibility, security)
• Manage production events/incidents end to end
• Recruit and train NOC/SOC/SRE engineers
• Ensure SOC Operations comply with company security policies and cloud security best practices
• Establish, integrate and maintain complex monitoring systems (SCOM, Splunk, New Relic, etc.)
• Design and write monitoring solutions and automation for an ever-changing environment
• Design, write and maintain playbooks for managing production events
• Assist in coordinating operations and engineering teams to identify errors and anomalies
• Identify ongoing gaps and potential risks for the company's customers/systems and escalate to relevant parties within and outside the company
• Work in close collaboration with various departments such as R&D, Product, Projects, IT, DBA, etc.

Job Requirements

• Experience in leading an SRE/IT/NOC team
• Experience in advanced NOC/SOC/SRE/IT and support (Tier 3/4) positions
• 24/7 availability for escalations and handling critical production incidents - a must
• Knowledge of a scripting language (Python, PowerShell, SQL) - a must
• Familiarity with network and server infrastructure
• Strong understanding of web and web-related technologies
• Working knowledge of container-based environments (k8s) and Kafka - an advantage
• Experience with centralized logging solutions (Splunk, ELK, etc.)
• Experience with application monitoring solutions (New Relic, Dynatrace, etc.)
• Driven, motivating, hands-on, and responsible
• Proven experience with Linux and Windows-based systems - an advantage