Job Title: Data Engineer
We are seeking a skilled Data Engineer to design and implement robust data pipelines and storage solutions.
Key Responsibilities:
* Data Pipeline Development: Build and optimize data pipelines with Apache Spark (Python and/or Scala) to process large-scale batch and streaming datasets (see the illustrative sketch after this list).
* Data Processing: Retrieve and integrate external data through REST APIs, ensuring reliable data flow across systems.
* Collaboration: Work with data scientists and engineers in Agile teams to drive innovation and efficiency in data processing.
* Quality Assurance: Implement data quality checks, testing, and monitoring to deliver reliable data products.
* Automation: Contribute to CI/CD and automation best practices to streamline data processing and integration.
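For illustration only, here is a minimal sketch of the kind of Spark batch pipeline referenced under Data Pipeline Development. The bucket, paths, and column names are hypothetical placeholders, and it assumes PySpark with an S3A-compatible endpoint (e.g., MinIO) already configured; actual pipelines in the role may look quite different.

```python
from pyspark.sql import SparkSession, functions as F

# Placeholder app name; real jobs would typically be scheduled via Airflow.
spark = SparkSession.builder.appName("example-batch-pipeline").getOrCreate()

# Read raw events from object storage (hypothetical S3/MinIO path).
events = spark.read.parquet("s3a://example-bucket/raw/events/")

# Keep valid rows and aggregate daily event counts per user.
daily_counts = (
    events.filter(F.col("event_type").isNotNull())
    .groupBy(F.to_date("event_ts").alias("event_date"), "user_id")
    .agg(F.count("*").alias("event_count"))
)

# Write the curated dataset back to object storage, partitioned by date.
(
    daily_counts.write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-bucket/curated/daily_event_counts/")
)

spark.stop()
```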
Requirements:
* Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
* 2 to 5 years of experience as a Data Engineer in Big Data environments.
* Strong skills in Apache Spark (Python and/or Scala), SQL, and data integration.
* Comfortable with Git, Airflow, and CI/CD pipelines.
* Experience with REST APIs and object storage (S3/MinIO).
* Awareness of data governance topics: data lineage, metadata, PII, data contracts…
* Fluent in French and English (minimum B2 level).
* Proactive, detail-oriented, and a strong communicator.