Senior Data Platform / Migration Engineer
SUMMARY
We are looking for a Senior Data Platform / Migration Engineer to lead the modernization of an enterprise data ecosystem, including the migration of the enterprise data lake from Cloudera (DataIQ, DSS, CDP) to MapR. This role requires deep expertise in large-scale distributed data systems, migration strategy, and performance optimization, with a strong focus on zero data loss, minimal downtime, and production stability.
KEY RESPONSIBILITIES
Lead end-to-end migration of enterprise data lake from Cloudera (DataIQ, DSS, CDP) to MapR
Define and execute migration strategy ensuring data integrity, minimal downtime, and rollback readiness
Design and build scalable, production-grade data pipelines post-migration
Optimize cluster performance including compute, storage, and resource utilization
Partner with BI/reporting teams to ensure schema consistency and data availability
Implement data validation frameworks to ensure accuracy and completeness post-migration
Document architecture, runbooks, lineage, and operational procedures
Collaborate with governance teams on data quality, lineage, and compliance requirements
REQUIREMENTS
8+ years of experience in Data Engineering / Data Platform Engineering
Strong hands-on experience with Cloudera (CDP, DSS, DataIQ) and/or MapR
Strong hands-on experience with Apache Spark, Hive, Hadoop, HDFS
Proven experience executing large-scale data lake migrations
Strong programming skills in Python, Scala, or SQL
Deep understanding of distributed data processing and storage systems
Experience with ETL/ELT frameworks (Informatica, Talend, dbt, or similar)
PREFERRED QUALIFICATIONS
Prior MapR implementation or certification
Experience with streaming platforms (Kafka, Pulsar)
Exposure to cloud-native data platforms (AWS S3, Azure Data Lake, GCP)
Familiarity with data governance, lineage, and catalog tools
Experience working in high-scale enterprise environments (multi-terabyte to petabyte data volumes)
CORE TECHNOLOGY STACK
Cloudera DSS / DataIQ / CDP, MapR, Apache Spark, Hive, Hadoop, HDFS, Kafka, Python, SQL, dbt, Informatica / Talend
WHAT SUCCESS LOOKS LIKE
Seamless migration with zero data loss and minimal business disruption
Improved data pipeline performance and scalability
Optimized infrastructure leading to reduced operational cost
High-quality, governed data ready for analytics and AI use cases