Position Overview
We are seeking an experienced Data Engineer (PySpark) to design, build, optimize and maintain scalable data pipelines for production environments. This role requires strong hands‑on experience in big data processing, pipeline optimization and deployment using modern data engineering tools and frameworks.
Key Responsibilities
- Design, develop and maintain robust, scalable data pipelines using Python and PySpark.
- Perform data ingestion, transformation, cleansing and validation across structured and unstructured datasets.
- Conduct exploratory data analysis (EDA) to identify data patterns, anomalies and quality issues.
- Apply data imputation techniques, data linking and cleansing to ensure high data quality.
- Implement feature engineering pipelines to support analytics and downstream use cases.
- Optimize Spark jobs for performance, scalability and cost efficiency.
- Deploy and tune production‑grade data pipelines ...