§ Degree educated with a minimum of 3 years of direct experience, 5 years overall industry experience
§ Minimum 3 years of direct experience in the administration of the Apache Hadoop framework (Spark, HBase, HDFS, Hive, Parquet, Sentry, Impala, and Sqoop), data warehousing, and Informatica, ideally in financial services.
§ Effectively maintaining a data pipeline architecture that accounts for security, scalability, maintainability, and performance.
§ Deploying and maintaining a Hadoop cluster, adding and removing nodes using cluster monitoring tools such as Ganglia, Nagios, or Cloudera Manager, configuring NameNode high availability, and keeping track of all running Hadoop jobs.
§ Hadoop administration skills: Cloudera Manager, Cloudera Navigator, and HUE.
§ Strong Unix/Red Hat skills; Python scripting highly beneficial.
§ Excellent track record of administering systems.
Knowledge, Skills, and Attributes:
Knowledge and Skills
§ Good knowledge of Big Data platforms, data warehousing, and Informatica, including their frameworks, policies, and procedures.
§ Proficient understanding of distributed computing principles.
§ Good knowledge of Data Warehouse/RDBMS and Big Data querying tools, such as Pig, Hive, and Impala.
§ Good knowledge of Informatica BDM, EDC, EDQ, and Axon.
§ Experience with Cloudera; NoSQL databases, such as HBase; and Big Data ML toolkits, such as SparkML.
§ SQL knowledge beneficial.
§ Experience with cloud technologies, such as AWS and Azure, beneficial.
§ A reliable and trustworthy person, able to anticipate and deal with the varying needs and concerns of numerous stakeholders, adapting personal style accordingly.
§ Adaptable and knowledgeable, able to learn and improve their skills with both existing and new technologies.