Design and develop highly scalable real-time data processing systems using Hadoop ecosystem technologies such as Iceberg, Spark, Ozone, Trino, Hive, Ranger, Kafka, Flink, and NiFi
Build robust batch and real-time data ingestion and transformation frameworks using Java, Spark, Python, and shell scripting
Develop solutions to process and manage multimodal data including images, audio, video, and unstructured documents
Develop full-stack applications and internal engineering tools using Python, shell scripting, and modern web frameworks such as Flask and React
Collaborate closely with data scientists to operationalize and deploy machine learning models using Cloudera Machine Learning (CML)
Tune and optimize Hadoop-based data applications to improve scalability and resource utilization
Design and implement scalable enterprise data processing pipelines and framew...