At Leapfrog, our mission is to be a role model technology company. We want to be trusted partners, world-class engineers, and creative innovators for our clients. We have built awesome software that puts data into action. Our team specializes in data-driven digital products that help businesses transform and capture new markets.
We are looking for a Hadoop Developer to join our dream team. This position offers you an excellent opportunity to work with distributed Big Data infrastructure and systems. As a Hadoop Developer, you will:
- Conceptualize and implement Big Data strategy and architecture.
- Take ownership of developing the distributed Hadoop environment.
- Work closely and transparently with our clients, guiding the design and next implementation level of their Big Data systems.
- Design, build, install, configure, and support Hadoop clusters.
- Define and set up ETL processes and data warehouses to integrate data from different sources (a minimal sketch of such a pipeline follows this list).
- Build job workflows based on business specifications.
- Set up standardized monitoring and reporting tools for data pipelines and stores.
- Propose best practices and standards.
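To give a concrete flavour of the pipeline work above, here is a minimal PySpark ETL sketch: it reads raw data from HDFS, aggregates it, and writes a partitioned Hive table. All paths, table names, and columns (hdfs:///data/raw/orders, analytics.orders_daily, created_at, amount) are hypothetical placeholders, and the code assumes a Spark installation with Hive support enabled.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-daily-etl")
    .enableHiveSupport()  # allows writing directly to Hive tables
    .getOrCreate()
)

# Extract: load raw CSV data landed on HDFS (e.g. by Flume or Sqoop).
raw = spark.read.csv("hdfs:///data/raw/orders", header=True, inferSchema=True)

# Transform: derive a date partition column and aggregate per product.
daily = (
    raw.withColumn("order_date", F.to_date("created_at"))
       .groupBy("order_date", "product_id")
       .agg(F.sum("amount").alias("total_amount"),
            F.count("*").alias("order_count"))
)

# Load: write the result as a Hive table partitioned by date.
(daily.write
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("analytics.orders_daily"))
```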
What we are looking for:

- Extensive experience with at least a few of the technologies in the Hadoop ecosystem (HDFS, YARN, MapReduce, Hive, Pig, Spark, etc.).
- Strong knowledge of SQL and experience creating ETL/ELT pipelines in distributed environments.
- Work experience with Flume and Sqoop for importing data from relational databases (RDBMS).
- Comfort querying and analyzing large amounts of data on HDFS using Hive and Spark (see the sketch after this list).
- Work experience with shell scripting on Linux.
- Programming experience in Java and Python, following OOP principles.
- Experience working with version control systems such as Git.
- Some experience with distributed cloud environments such as AWS, Azure, or GCP and related technologies.
- Experience with Apache Kafka and ZooKeeper is a big plus.
- A professional certification from Cloudera, Hortonworks, etc. is a plus.
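As an illustration of the Hive/Spark querying mentioned above, here is a small Spark SQL sketch that runs a HiveQL-style query against a Hive table stored on HDFS. The table and column names (web.clickstream, page_url, event_date) are hypothetical, and the snippet assumes the same Hive-enabled Spark environment as the earlier sketch.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("clickstream-analysis")
    .enableHiveSupport()
    .getOrCreate()
)

# Spark SQL compiles this query into distributed tasks that scan the
# table's files on HDFS in parallel.
top_pages = spark.sql("""
    SELECT page_url, COUNT(*) AS hits
    FROM web.clickstream
    WHERE event_date >= '2024-01-01'
    GROUP BY page_url
    ORDER BY hits DESC
    LIMIT 20
""")

top_pages.show(truncate=False)
```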