§ Supports the development of the organizational Data Strategy and data management practices in line with strategic business objectives, culture and values.
§ Supports DBA, Data Engineering and Informatica Operations in the production and oversight of the big data platforms, following standards, practices, policies and processes.
§ Promotes good administration practices and the management of data as a strategic asset.
§ The Hadoop/Informatica administrator is responsible for small developments, fixes, testing and maintenance of architectures such as NoSQL databases and Hadoop/Spark processing systems.
§ Oversees data acquisition and develops data set processes.
§ Manages resources and security.
§ Troubleshoots application errors and ensures they do not recur.
FRAMEWORKS, BOUNDARIES, & DECISION MAKING AUTHORITY:
§ Provides technical support and monitors Big Data/Hadoop applications; identifies and handles possible production failure scenarios (Incident Management); responds to end users of the Hadoop platform on data issues; reports and monitors daily SLAs, identifying vulnerabilities and opportunities for improvement.
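As an illustration of the daily SLA monitoring duty above, a minimal Python sketch that flags batch jobs finishing after their agreed deadline. Job names and deadline times are hypothetical, not taken from any real platform configuration:

```python
from datetime import datetime, time

# Hypothetical SLA deadlines per batch job (illustrative only).
SLA_DEADLINES = {
    "daily_ingest": time(6, 0),   # must finish by 06:00
    "dw_load": time(7, 30),       # must finish by 07:30
}

def sla_breaches(job_runs):
    """job_runs: iterable of (job_name, finished_at: datetime).
    Returns the names of jobs that missed their SLA deadline."""
    breaches = []
    for name, finished_at in job_runs:
        deadline = SLA_DEADLINES.get(name)
        if deadline and finished_at.time() > deadline:
            breaches.append(name)
    return breaches

runs = [
    ("daily_ingest", datetime(2024, 1, 15, 5, 45)),  # on time
    ("dw_load", datetime(2024, 1, 15, 8, 10)),       # late
]
print(sla_breaches(runs))  # → ['dw_load']
```

In practice such a check would feed the daily SLA report rather than print to the console.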
COMMUNICATIONS & WORKING RELATIONSHIPS:
§ The role requires establishing strong working relationships with all business groups and support functions of the organization at various levels to ensure the stability and performance of the Big Data platform.
§ The role requires regular interactions with vendors, data partners and consulting partners.
KEY PERFORMANCE INDICATORS (KPIs):
§ Delivery against SLAs and planned BD roadmap activities
§ Number of outstanding deliverables on the platform, measured against quality standards and timelines
§ Improvement in stakeholder perception of Big Data/DW/Informatica services
§ % uptime of BD/DW/Informatica platforms and use cases
§ Feedback from peers, management and key stakeholders
KNOWLEDGE, SKILLS, & EXPERIENCE:
§ Degree educated, with a minimum of 3 years of direct experience and 5 years of overall industry experience
§ Minimum 3 years of direct experience in the administration of the Apache Hadoop framework (Spark, HBase, HDFS, Hive, Parquet, Sentry, Impala and Sqoop), data warehousing and Informatica, ideally in financial services.
§ Effectively maintaining data pipeline architectures that account for security, scalability, maintainability, and performance.
§ Deploying and maintaining a Hadoop cluster, adding and removing nodes using cluster monitoring tools such as Ganglia, Nagios or Cloudera Manager, configuring NameNode high availability and keeping track of all running Hadoop jobs.
§ Hadoop administration skills: Cloudera Manager, Cloudera Navigator and Hue.
§ Strong Unix/Red Hat skills; Python scripting highly beneficial.
§ Excellent track record of administering systems.
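The "keeping track of all running Hadoop jobs" duty listed above can be sketched with the YARN ResourceManager REST API (endpoint `/ws/v1/cluster/apps`); the hostname below is hypothetical, and in practice the payload would be fetched with `urllib.request.urlopen("http://rm-host:8088/ws/v1/cluster/apps?states=RUNNING")`:

```python
import json

def running_apps(payload):
    """Extract (id, name) pairs of RUNNING applications from a
    ResourceManager /ws/v1/cluster/apps response payload."""
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a["id"], a["name"]) for a in apps if a.get("state") == "RUNNING"]

# Sample response in the RM's documented JSON shape (values invented).
sample = json.loads("""
{"apps": {"app": [
  {"id": "application_1_0001", "name": "daily_ingest", "state": "RUNNING"},
  {"id": "application_1_0002", "name": "adhoc_query", "state": "FINISHED"}
]}}
""")
print(running_apps(sample))  # → [('application_1_0001', 'daily_ingest')]
```

Filtering client-side on `state` keeps the sketch robust even when the `states=` query parameter is omitted.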
KNOWLEDGE, SKILLS, AND ATTRIBUTES:
Knowledge and Skills
§ Good knowledge of Big Data platform, Data Warehouse and Informatica frameworks, policies, and procedures.
§ Proficient understanding of distributed computing principles
§ Good knowledge of Data Warehouse/RDBMS, Big Data querying tools, such as Pig, Hive, and Impala
§ Good knowledge of Informatica BDM, EDC, EDQ and Axon.
§ Experience with Cloudera; NoSQL databases such as HBase; and Big Data ML toolkits such as SparkML.
§ SQL knowledge beneficial
§ Experience with cloud technologies such as AWS and Azure beneficial.
§ A reliable and trustworthy person, able to anticipate and deal with the varying needs and concerns of numerous stakeholders, adapting personal style accordingly.
§ Adaptable and knowledgeable, able to learn and improve skills with existing and new technologies.