
Astellas Pharma

Databricks Data Engineer

In-Office
Bengaluru, Bengaluru Urban, Karnataka
Mid level
Join Astellas' DigitalX team as a Databricks Developer, leveraging Databricks for data engineering, machine learning, and business insights. Build ETL pipelines, manage ML models, and ensure data reliability and performance while collaborating with cross-functional teams.
Job Summary & Responsibilities

As part of Astellas' commitment to delivering value for our patients, our organisation is currently undergoing a transformation to achieve this critical goal. This is an opportunity to work on digital transformation and make a real impact within a company dedicated to improving lives.

DigitalX, our new information technology function, is spearheading this value-driven transformation across Astellas. We are looking for people who embrace change, manage technical challenges, and have exceptional communication skills.

We are seeking committed and talented Databricks Developers with 2–4 years of progressive experience to join our new InformationX team, which lies at the heart of DigitalX.

The ideal candidate will have a minimum of 2 years of professional experience leveraging the Databricks platform to deliver data engineering, machine learning (ML), and business intelligence outputs. You will be responsible for building and maintaining robust data pipelines, developing scalable ML models, and generating actionable insights from large datasets. This role requires a strong understanding of big data technologies, data architecture, and proficiency in languages such as Python and SQL.

As a member of our team within InformationX, you will be responsible for ensuring our data-driven systems remain operational and scalable and continue to hold the right data to drive business value.

 

Responsibilities:

Your responsibilities will include executing complex data projects, ensuring smooth data flows between systems, and maintaining the efficiency and reliability of data platforms. This is a fantastic global opportunity to use your proven agile delivery skills across a diverse range of initiatives, utilise your development skills, and contribute to the continuous improvement and delivery of critical IT solutions.

 

  • Data Engineering: Design, develop, and maintain efficient and reliable ETL/ELT pipelines using Databricks notebooks and Delta Lake.
  • Machine Learning: Collaborate with data scientists to deploy and manage scalable ML models, ensuring they are integrated into production workflows.
  • Insight Delivery: Create and optimize notebooks and queries to provide data-driven insights and reports to business stakeholders.
  • Platform Management: Manage Databricks clusters and jobs, ensuring optimal performance, cost-efficiency, and security.
  • Collaboration: Work closely with data scientists, analysts, and business teams to understand requirements and deliver end-to-end data solutions.

You will also be contributing to the following areas:  

  • End-to-End Data Solutions:
    • Design end-to-end scalable data streams, storage, data serving systems, and analytical workflows using Databricks.
    • Define overall architecture, capabilities, platforms, tools, and governing processes.
  • Data Pipeline Development:
    • Build data pipelines to extract, transform, and load data from various sources.
    • Set up metadata and master data structures to support transformation pipelines in Databricks.
  • Data Warehousing and Data Lakes:
    • Create data warehouses and data lakes for efficient data storage and management.
    • Develop and deploy data processing and analytics tools.
  • Collaboration with DataX and other key stakeholder value teams:
    • Collaborate with data modelers to create advanced data structures and models within the Databricks environment.
    • Develop and maintain Python scripts for data processing, transformation, and analysis.
    • Utilize Azure and AWS cloud services (e.g., Azure Data Lake, AWS S3, Redshift) for data storage and processing.
    • Apply expertise in Databricks to enhance data architecture, performance, and reliability; lead relevant data governance initiatives and ensure compliance with industry standards.
    • Work closely with data scientists to develop and deploy data-driven solutions.
    • Provide technical direction to Data Engineers and perform code reviews.
  • Continuous Learning:
    • Stay up to date on the latest data technologies, trends, and best practices.
    • Participate in smaller, focused mission teams to deliver value-driven solutions aligned to our global and bold move priority initiatives and beyond.
  • Collaborate with cross-functional teams and practices across the organisation, including Commercial, Manufacturing, Medical, DataX, and GrowthX, and support other X (transformation) Hubs and Practices as appropriate, to understand user needs and translate them into technical solutions.
  • Provide Level 3 and Level 4 technical support to internal users, troubleshooting complex issues and restoring system uptime as quickly as possible.
  • Champion continuous improvement initiatives, identifying opportunities to optimise the performance, security, and maintainability of existing data and platform architecture and other technology investments.
  • Participate in the continuous delivery pipeline, adhering to DevOps best practices for version control, automation, and deployment, and ensure effective management of the FoundationX backlog.
  • Leverage your knowledge of Machine Learning (ML) and data engineering principles to integrate with existing data pipelines and explore new possibilities for data utilization.
  • Stay up to date on the latest trends and technologies in full-stack development, data engineering, and cloud platforms.
Preferred Qualifications

Technical Skills:

  • Proficiency in PySpark/Python or Scala for data manipulation, scripting, and analytics.
  • Strong understanding of distributed computing principles
  • Experience using ETL tools like Talend/Talend Cloud and DataStage.
  • Knowledge and experience using Azure DevOps.
  • Experience with cloud services such as Azure Data Lake, AWS S3, and Redshift
  • Experience in working with MPP Databases like AWS Redshift.
  • Experience in integrating data from multiple sources like relational databases, Salesforce, SAP, and API calls.

 

Required Qualifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or related field.
  • 3+ years of experience as a Data Engineer or Databricks Developer.
  • Proficiency in Python for data manipulation, scripting, and analytics.
  • Strong understanding of data modelling concepts and practices.

 

Any relevant cloud-based Databricks, AWS, or Azure certifications, for example:

  • Databricks Data Engineer
  • AWS Certified Data Analytics Specialty – Professional / Associate (will be considered with relevant experience)
  • Microsoft Certified Azure Data Engineer Associate
  • Microsoft Certified Azure Database Administrator
  • Microsoft Certified Azure Developer
  • Experience using ETL tools like Talend / Talend Cloud and DataStage (Essential)
  • Knowledge and experience using Azure DevOps (Essential)
  • Knowledge and experience of working with Salesforce / SAP (Desirable)
  • Experience in working with MPP Databases like AWS Redshift
  • Experience of delivering architectural solutions effectively within the Life Sciences or Pharma domains.

Preferred Qualifications:

  • Experience analysing and building star schema data warehouses.
  • Experience writing SQL and creating stored procedures (essential).
  • Data Analysis and Automation Skills: Proficient in identifying, standardizing, and automating critical reporting metrics and modelling tools.
  • Analytical Thinking: Demonstrated ability to lead ad hoc analyses, identify performance gaps, and foster a culture of continuous improvement.
  • Experience in integrating data from multiple data sources such as relational databases, Salesforce, SAP, and API calls.
  • Agile Champion: Adherence to DevOps principles and a proven track record with CI/CD pipelines for continuous delivery.
  • Ability to understand and interpret business requirements and translate them into technical requirements.
  • Create and maintain technical documentation as part of CI/CD principles.

"Beware of recruitment scams impersonating Astellas recruiters or representatives. Authentic communication will only originate from an official Astellas LinkedIn profile or a verified company email address. If you encounter a fake profile or anything suspicious, report it promptly to LinkedIn's support team through LinkedIn Help"

Top Skills

Aws S3
Azure Data Lake
Azure Devops
Databricks
Python
Redshift
SQL
Talend

