As an ALEF Big Data Engineer, you will enable our software development teams to deliver quality services by architecting and building highly available, resilient big data platforms. You will help teams rapidly prototype, deliver, and run high-impact, high-value services for ALEF.
Ideally, in addition to a solutions architect background, you will have a good grounding and hands-on experience in setting up big data platforms from a DevOps perspective.
Job-Specific Responsibilities
Define, design, and develop services and solutions for large-scale data ingestion, storage, and management, covering sources such as RDBMSs, NoSQL databases, log files, and events.
Define, design, and run robust data pipelines and batch jobs in a production environment.
Architect highly scalable, highly concurrent, low-latency systems.
Work with third-party and internal service providers to support a variety of integrations.
Work with product teams on a range of tools and services, improving products to meet user needs.
Participate in sprint planning with developers and project teams to ensure projects are deployable and monitorable from the outside.
As part of the team, you may be expected to participate in some of the second-line in-house support and out-of-hours support rotas.
Proactively advise on best practices.
Assist in the budgeting process.
Education, Experience and Required Skills
Degree in Computer Science, Software Engineering, or a related field preferred.
Minimum of 10 years' relevant experience.
Experience with big data environments, including advising the Analytics team on best practices and new technologies.
Experience storing data in systems such as Hadoop HDFS, S3, and Kafka.
Understanding of cluster administration and the ability to troubleshoot issues.
Experience as a solution architect for large-scale analytics, insight, decisioning, and reporting solutions based on big data technology.
Experience designing, setting up, and running big data tech stacks such as Hadoop and Spark, and distributed datastores such as Cassandra, document databases (e.g. MongoDB), and Kafka.
Understanding of high-availability components, using Cloudera Manager, Hortonworks Ambari, or MapR Control System, for Hive, Hue, YARN, and NameNodes.
Experience configuring and managing Linux servers for serving a dynamic website.
In-depth knowledge of the Hadoop technology ecosystem: HDFS, Spark, Impala, HBase, Kafka, Flume, Sqoop, Oozie, Avro, and Parquet.
Experience debugging a complex multi-server service.
In-depth knowledge of and experience with IaaS/PaaS solutions (e.g. AWS infrastructure hosting and managed services).
Scripting or basic programming skills.
Familiarity with network protocols - TCP/IP, HTTP, SSL, etc.
Deploying and configuring machines in a cloud environment.
Understanding of continuous integration and delivery.
Experience working in an agile environment.
Knowledge of version control systems such as Git or Subversion.
Knowledge of AWS big data/analytics services: S3, EMR, Glue, Redshift, QuickSight, and Kinesis.
Experience designing batch processing with Spark and stream processing with either Spark Streaming or Samza.
Understanding of and experience with search platforms such as Elasticsearch.
Understanding of techniques for managing encryption keys and certificates.
Knowledge of the principles underlying public/private key encryption schemes.
Experience installing and managing open-source monitoring tools.
Experience with open-source solutions and communities.
Understanding of and experience implementing Twelve-Factor apps.