Looking for a Data Engineer to build scalable data pipelines with PySpark, perform impact analysis, and ensure data quality in collaboration with the Group Operations Team.
We are looking for a highly skilled Data Engineer to join the Data Engineering Chapter supporting the Group Operations Team. The ideal candidate will work closely with business and technical stakeholders to understand data requirements, perform impact analysis, and build scalable data pipelines using modern technologies like PySpark.
Key Responsibilities
- Collaborate with the Group Operations Team to gather and analyze data requirements
- Perform impact assessment, technical data mapping, and data profiling
- Design and develop data extraction, transformation, and loading (ETL) pipelines
- Build and optimize data pipelines using PySpark as part of the bank’s modern tech stack
- Develop data solutions aligned with AECB application data models
- Ensure data quality, integrity, and consistency across systems
- Participate in unit testing, deployment, and production support
- Leverage modern AI tools (e.g., Claude) to improve development efficiency and reduce operational errors
- Work in an agile environment and contribute to continuous improvement initiatives
- Strong hands-on experience with PySpark and big data processing
- Expertise in Informatica BDM Development
- Solid understanding of ETL/ELT concepts and data warehousing
- Experience in data mapping, profiling, and impact analysis
- Knowledge of SQL, data modeling, and performance tuning
- Familiarity with banking/financial data systems is a plus
- Exposure to AI-assisted development tools is an added advantage
- Strong problem-solving and analytical skills
- Experience working in the banking or financial services domain
- Familiarity with AECB reporting/data standards
- Experience with cloud platforms (AWS/Azure/GCP) is a plus


