Big Data Engineer

Date Posted —

Type of Work:
Full Time
Salary:
TBD
Hours per Week:
40

Job Description

Position Overview:
We are seeking a talented Big Data Engineer with advanced skills in configuring, developing, customizing, and deploying Big Data stack frameworks based on Apache Ambari, Hadoop, HBase, and related ecosystem components (Kafka, HDFS/Ranger, YARN, Hive, Atlas, Infra, Hue, Spark, Kerberos, Keycloak, etc.). The ideal candidate will be instrumental in designing and implementing scalable and efficient Big Data solutions to meet our organization’s growing data processing and analytics needs.

***Key Responsibilities:

1. Big Data Stack Configuration:
Lead the configuration and setup of Big Data stack frameworks, including Apache Ambari, Hadoop, HBase, and associated components, ensuring optimal performance and resource utilization.

2. Development and Customization:
Collaborate with data scientists, analysts, and developers to customize and extend Big Data platforms to support diverse data processing, analytics, and visualization requirements.

3. Deployment and Maintenance:
Manage the deployment of Big Data solutions across development, testing, and production environments, implementing best practices for reliability and scalability. Integrate, configure, and tailor new open-source components (e.g., Kafka, HDFS/Ranger, YARN, Hive, Atlas, Infra, Hue, Spark, Kerberos, Keycloak).

4. Performance Optimization:
Identify performance bottlenecks and optimize Big Data workflows, algorithms, and infrastructure configurations to enhance data processing speed and efficiency.

5. Data Pipeline Development:
Design and implement robust data ingestion, transformation, and storage pipelines, integrating disparate data sources and formats into unified data lakes or warehouses.

6. Data Security and Governance:
Implement security controls, access policies, and data governance mechanisms to ensure the confidentiality, integrity, and availability of sensitive data assets.

7. Monitoring and Troubleshooting:
Establish monitoring and alerting mechanisms to track system health, performance metrics, and data quality issues, and promptly address any operational or technical challenges.

8. Documentation and Knowledge Sharing:
Create comprehensive documentation, runbooks, and training materials to facilitate knowledge transfer and empower team members to leverage Big Data technologies effectively.

***Qualifications:

1. Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree preferred.
2. Proven experience as a Big Data Engineer, with a focus on configuring, developing, and deploying Big Data stack frameworks in production environments.
3. In-depth knowledge of Apache Hadoop ecosystem components, including HDFS, MapReduce, YARN, Hive, Spark, HBase, Kafka, etc.
4. Proficiency in Apache Ambari for cluster management, monitoring, and administration.
5. Strong programming skills in languages such as Java, Scala, Python, or SQL, with experience in developing and optimizing Big Data applications and algorithms.
6. Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes) for deploying and managing Big Data workloads.
7. Experience with cloud platforms (OpenStack) and managed Big Data services (e.g., EMR, HDInsight, Dataproc).
8. Excellent problem-solving abilities, analytical skills, and attention to detail.
9. Strong communication and collaboration skills, with the ability to work closely with cross-functional teams and stakeholders.

APPLY FOR THIS JOB:

Company: Pomelo
Name: Kortcovein Bayadog
Email:

Skills