As a Data Engineer, you will design and maintain scalable data pipelines, optimize workflows, and ensure data quality using tools like Databricks, Python, and SQL.
Analytical Wizards is part of the Definitive Healthcare family. We balance innovation with an open, friendly culture and the backing of a long-established parent company known for its ethical reputation. We guide customers from what's now to what's next by unlocking the value of their data and applications to solve their challenges, achieving outcomes that benefit both business and society. Our people are our biggest asset: they drive our innovation advantage, and we strive to offer a flexible and collaborative workplace where they can thrive. We offer industry-leading benefits packages to promote a creative and inclusive culture. If driving real change gives you a sense of pride and you are passionate about powering social good, we'd love to hear from you.
Job Description - Data Engineer
About the Role
We are looking for a candidate who is passionate about building scalable data pipelines, optimizing data workflows, and ensuring high data quality across systems. Candidates should demonstrate strong technical foundations, the ability to work independently, and a willingness to collaborate in a dynamic environment.
Core Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines to support business and analytical needs.
- Work extensively with Databricks, Python, and PySpark to process large datasets.
- Build and manage DAGs using Apache Airflow for workflow orchestration.
- Collaborate with cross-functional teams to understand data requirements and translate them into efficient engineering solutions.
- Develop and optimize complex SQL queries and participate in data modeling activities for relational and cloud data warehouses.
- Work with Amazon S3 for data storage, ingestion, partitioning, and integration within broader data lake and pipeline ecosystems.
- Ensure high standards of data quality, reliability, and performance across all data processes.
- Contribute to documentation, best practices, and continuous improvement initiatives.
Core Technical Requirements
Python Programming
- Strong experience writing clean, efficient Python code for data manipulation, automation, scripting, and ETL workflows.
- Familiarity with widely used data libraries (e.g., pandas, NumPy).
Databricks
- Hands-on experience with Databricks for distributed data processing.
- Proficiency in PySpark, Delta Lake, notebooks, and building scalable pipelines.
Orchestration Tools
- Apache Airflow (Required): Ability to design, implement, and maintain complex DAGs for scheduling and orchestrating workflows.
- Argo Workflows (Preferred): Experience with Kubernetes-native orchestration platforms is an added advantage.
SQL Skills
- Advanced SQL expertise, including writing complex queries, query optimization, and working with relational/cloud data warehouses.
- Experience in data modeling and performance tuning.
Cloud Storage (Amazon S3)
- Practical knowledge of S3 for ingestion, storage, data partitioning, access control, and integration as part of data lake architectures.
Experience Level
2-5 years in Data Engineering or related roles.
Preferred Personal Attributes
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration abilities.
- Ability to work in a fast-paced, evolving environment.
Top Skills
Amazon S3
Apache Airflow
Databricks
ELT
ETL
NumPy
Pandas
PySpark
Python
SQL
Definitive Healthcare Bengaluru, Karnataka, IND Office
Bengaluru, India, 560068