Senior Site Reliability Engineer (SRE) Salla

Active employer

Posted on 19 December

Experience

5 - 7 years

Job Location

Saudi Arabia

Education

Any graduate

Nationality

Any nationality

Gender

Not specified

Vacancies

1

Job Description

Roles and Responsibilities

Reliability & Incident Management

Lead high-severity incident response and drive post-incident reviews.
Troubleshoot complex issues across applications, infrastructure, and networks.
Improve MTTR through better monitoring, alerts, and diagnostic tooling.
Participate in the on-call rotation supporting production systems.

Performance & Scalability

Identify and resolve performance bottlenecks and scaling challenges.
Conduct load testing and capacity planning for high-traffic scenarios.

Infrastructure & Operations

Enhance cloud-native infrastructure, deployment processes, and automation.
Improve resilience, fault-tolerance, and recovery mechanisms across systems.

Observability

Build and refine dashboards, alerts, metrics, logs, and traces.
Define SLIs/SLOs and improve visibility into system behavior.

Tooling & Automation

Develop tools that reduce operational toil and increase reliability.
Contribute to infrastructure-as-code, CI/CD pipelines, and GitOps workflows.

Collaboration

Work closely with engineering teams to ensure services are robust and production-ready.
Mentor engineers on reliability, debugging, and operational best practices.

Required Skills

Strong experience with Kubernetes, service mesh technologies, and cloud platforms (AWS/GCP/Azure).
Deep understanding of Linux, networking, distributed systems, and load balancers.
Hands-on experience with Terraform or similar IaC tools.
Experience with Prometheus, Grafana, Loki, Mimir, Elastic, or similar observability tools.
Proficiency in scripting/programming (Bash, Python, Go).
Experience with CI/CD and GitOps.
Strong debugging, incident response, and performance analysis skills.

Bonus Skills

Background in large-scale, high-traffic systems.
Experience with fault-tolerant design, DR, and HA patterns.
Familiarity with SLOs, SLIs, and error budgets.

Company Industry

Internet
E-commerce
Dotcom

Functional Area / Department

IT Software

Keywords

Senior Site Reliability Engineer (SRE)

Salla

https://apply.workable.com/salla/j/7F12B5E837/
