Senior HPC Systems Engineer

Group 42

Posted on 5 February 2021

2 - 5 years, Abu Dhabi - United Arab Emirates

Any graduate. Any nationality.

Number of vacancies: 01

Job Description

Job Requirements
We are seeking a Senior HPC Systems Engineer to maintain G42's state-of-the-art computational and data science infrastructure.
As a member of our HPC Team, you will lead and participate in the deployment, management, and optimization of systems and processes. You will work with G42's community to identify and provide solutions and technical support that enable our cloud customers to deploy and develop their AI applications at scale.

Responsibilities and Duties:
• Provide in-depth tier-3 technical O&M support and administration of a 24x7x365 always-available production environment
• Configure, install, maintain and upgrade HPC clusters (compute, storage, and network) and applications in support of research computing environments
• Lead and collaborate on projects to maintain and enhance system functionality in areas such as systems monitoring, scheduling and resource management, configuration management, backups, HPC system management utilities/tools, HPC cluster performance and resiliency
• Diagnose, isolate and resolve complex application and system technical problems (hardware, software, network)
• Develop scripts and automation to enhance operational services and service quality
• Perform system tuning based upon proactive performance analysis
• Build, install, and support scientific software (Commercial and Open Source)
• Develop and maintain technical documentation for customer use and contribute to the internal knowledge base.

Work Experience
• Solid experience in configuring, managing, and optimizing large Linux clusters and servers
• Expert-level experience with workload management tools (e.g. PBS, SLURM, Moab, TORQUE)
• Experience configuring, managing, and optimizing distributed and parallel file systems such as Lustre, GPFS, NFS, and Ceph, and storage protocols such as FC, iSCSI, NFS, and CIFS
• Knowledge of networks, routers, switches, and firewalls, and familiarity with high-performance networks such as InfiniBand
• Strong scripting/programming capabilities (e.g. Python, Bash, Perl)
• Experience managing virtualization platforms (VMware, KVM, oVirt)
• Extensive knowledge of Red Hat or Debian-based distributions and strong experience with maintaining, upgrading, and tuning the Linux kernel
• Experience with system configuration management tools such as Puppet, Ansible, Chef, Cobbler
• Experience with monitoring/alerting tools (e.g. Ganglia, Nagios, Zabbix, Grafana)
• Strong experience with tools for compiling and building packages (e.g. Spack, Conda, EasyBuild)
• Strong experience using containerized workflows based on Docker, Singularity, and Kubernetes
• Solid experience configuring, installing, and troubleshooting MPI
• Demonstrated ability to research, quickly identify and correct problems (debug) using system utilities and diagnostics
• Demonstrated ability to perform complex performance analysis including system processes, I/O subsystems, networks and other related components.

Desired skills
• Experience with performance benchmarking using profilers and debuggers to recommend code improvements for scalability and performance
• Experience with NVIDIA DGX servers and NVIDIA tools
• Experience with Linux kernel development and the Linux development community
• Experience with on-prem cloud technologies such as OpenStack
• Working knowledge of one or more programming languages such as C, C++.


Software, Information Technology

Keywords

Senior HPC Systems Engineer

Website: https://careers.g42.ai/ae/en/job/GR42AEP-100087en_AE/Senior-HPC-Systems-Engineer

