ABOUT THE ROLE
Our Data Engineer is responsible for implementing data ingestion, continuous integration, monitoring, and orchestration in the cloud. The role also supports the team in the successful execution and performance optimization of the cloud data warehouse, as well as cost estimation of serverless cloud components.
Other key responsibilities include:
• Design, build, install, and maintain data management systems using Spark/PySpark, AWS Glue, Dataflow, or similar cloud ETL tools
• Handle vast amounts of structured and semi-structured data in the cloud
• Execute data orchestration and workflows using ETL scheduling tools such as Apache Airflow, Luigi, and AWS Step Functions
• Be well versed in at least one scripting language — Python (preferred), Scala, or Java — plus Bash/shell scripting and SQL
• Recommend ways to continually improve data reliability and quality
• Employ a range of languages and tools to connect systems together
• Communicate results and ideas clearly within the team and effectively across all levels of the organization
• Help translate ETL processes from SQL Warehouse/SAP ETL to cloud-standard and other ETL tools
The ideal candidate will have outstanding communication skills, proven data infrastructure design and implementation capabilities, strong business acumen, and an innate drive to deliver results. They will be a self-starter who is comfortable with ambiguity and enjoys working in a fast-paced, dynamic environment.
Other qualifications include:
• University degree in Computer Science, Engineering, or Statistics
• 3 years of experience building back-end applications and 2 years with large-scale data/software systems that demand high performance, scalability, and availability
• Thorough understanding of the security features, access layers, and data monitoring components on AWS or GCP
• Clear understanding of data unification, centralization, and data lakes in the cloud
• Good working knowledge of microservices
• Exposure to cloud data warehouses such as BigQuery, Redshift, or Snowflake, and proficiency in standard SQL
• Good working knowledge of building data lakes, data warehouses, their architecture, and cloud infrastructure components
• Ability to work closely with the data and tech teams to design and develop solutions for identified use cases
• Proven experience tracking analytics and exceeding sales KPIs
• Strong written and verbal communication skills
• Strong organizational skills and attention to detail
• Ability to manage multiple tasks and priorities at once
• Ability to perform well in a fast-paced, cross-functional, dynamic, and ambiguous start-up environment
• Ability to foresee and pre-empt issues with data storage, partitioning, and queries, keeping code reusability and redundancy in mind