Role Overview:
We are seeking a highly skilled Senior Data Engineer with extensive experience building robust data pipelines, data warehousing, and ETL processes. The ideal candidate will have hands-on experience moving data from MongoDB to Redshift and expertise with analytics and BI tools like Metabase and Superset. As a Senior Data Engineer, you will play a crucial role in designing, developing, and maintaining the data infrastructure that powers our business intelligence and analytics initiatives, enabling data-driven decision-making across the organization.
Key Responsibilities:
• Data Pipeline Development: Design, develop, and maintain scalable and efficient data pipelines to automate the extraction, transformation, and loading (ETL) of data from various sources, particularly from MongoDB to Redshift (a minimal load-step sketch follows this list).
• Data Warehousing: Architect and manage our data warehouse in Redshift, ensuring optimal performance, scalability, and data integrity. Because Redshift has no conventional indexes or table partitioning, implement best practices for data modeling, distribution styles, sort keys, and compression encodings to support complex queries and analytics (a table-design sketch follows this list).
• ETL Process Management: Build and maintain ETL processes to ensure accurate and timely data flow from source systems (MongoDB) to the data warehouse (Redshift). Optimize ETL jobs for performance and cost-effectiveness.
• Data Quality and Governance: Implement data quality checks, monitoring, and data governance practices to ensure the accuracy, completeness, and reliability of the data in the warehouse (a quality-check sketch follows this list).
• Analytics and Reporting Integration: Collaborate with data analysts and business intelligence teams to integrate data into analytics and reporting tools like Metabase and Superset. Support the creation of dashboards, reports, and visualizations to enable data-driven decision-making.
• Performance Optimization: Continuously monitor and optimize the performance of data pipelines and the data warehouse, including query performance, data storage, and retrieval times.
• Collaboration and Mentorship: Work closely with cross-functional teams, including data scientists, analysts, and software engineers, to understand data needs and deliver solutions. Provide mentorship and technical guidance to junior data engineers.
• Data Security and Compliance: Ensure that data infrastructure and processes comply with security standards and regulations. Implement encryption, access controls, and data masking as necessary to protect sensitive data.
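To make the pipeline expectation concrete, here is a minimal sketch of one MongoDB-to-Redshift load step of the kind described above. All names are hypothetical placeholders: the `shop.orders` collection, the `analytics.orders` target table, and the S3 bucket and IAM role. It stages documents in S3 as newline-delimited JSON, then bulk-loads them with Redshift's COPY command:

```python
"""Minimal sketch of one MongoDB -> S3 -> Redshift load step.

All names (the `shop.orders` collection, the `analytics.orders` table,
the S3 bucket, and the IAM role) are hypothetical placeholders.
"""
import json

import boto3
import psycopg2
from pymongo import MongoClient


def extract_to_s3(mongo_uri: str, bucket: str, key: str) -> None:
    """Dump a collection to S3 as newline-delimited JSON (what COPY expects)."""
    client = MongoClient(mongo_uri)
    # Full scan for brevity; a real job would filter on a watermark column.
    docs = client.shop.orders.find({}, {"_id": 0})
    body = "\n".join(json.dumps(doc, default=str) for doc in docs)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))


def copy_into_redshift(dsn: str, bucket: str, key: str, iam_role: str) -> None:
    """Bulk-load the staged file with COPY, Redshift's idiomatic load path."""
    sql = f"""
        COPY analytics.orders
        FROM 's3://{bucket}/{key}'
        IAM_ROLE '{iam_role}'
        FORMAT AS JSON 'auto';
    """
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(sql)
```

Staging through S3 and using COPY is the common bulk-load path; loading row by row over a driver connection is markedly slower at warehouse scale.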
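And a sketch of the table-design side: since Redshift tuning centers on distribution and sort keys rather than indexes, a hypothetical fact-table DDL (all table and column names are placeholders) might look like this:

```python
import psycopg2

# Hypothetical fact table: the DISTKEY co-locates rows that join on
# customer_id, and the SORTKEY lets range-restricted scans on order_date
# skip disk blocks entirely.
DDL = """
CREATE TABLE IF NOT EXISTS analytics.orders (
    order_id     BIGINT      NOT NULL,
    customer_id  BIGINT      NOT NULL,
    order_date   DATE        NOT NULL,
    total_amount DECIMAL(12, 2),
    status       VARCHAR(32)
)
DISTSTYLE KEY
DISTKEY (customer_id)
SORTKEY (order_date);
"""


def create_table(dsn: str) -> None:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(DDL)
```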
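Finally, a minimal illustration of the post-load quality checks meant above, again against the hypothetical `analytics.orders` table; the specific checks are placeholders for whatever invariants the data must satisfy:

```python
import psycopg2

# Hypothetical checks: each query returns a single count, and `expect_zero`
# says whether a non-zero result is a failure.
CHECKS = [
    ("null_order_ids",
     "SELECT COUNT(*) FROM analytics.orders WHERE order_id IS NULL", True),
    ("rows_loaded_today",
     "SELECT COUNT(*) FROM analytics.orders WHERE order_date = CURRENT_DATE", False),
]


def run_checks(dsn: str) -> None:
    failures = []
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        for name, sql, expect_zero in CHECKS:
            cur.execute(sql)
            count = cur.fetchone()[0]
            if (count != 0) if expect_zero else (count == 0):
                failures.append((name, count))
    if failures:
        # Fail loudly so the orchestrator marks the run red and alerts.
        raise RuntimeError(f"data quality checks failed: {failures}")
```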
Skills and Qualifications:
• Extensive Data Engineering Experience:
  • Data Pipelines: 5+ years of experience designing and building data pipelines with technologies such as Apache Airflow, AWS Glue, or similar tools (see the DAG sketch after this list). Proficiency in automating data workflows and managing dependencies.
  • ETL Expertise: Strong experience with ETL processes, including data extraction from MongoDB, data transformation, and loading into Redshift. Familiarity with data integration tools like Talend, Informatica, or Fivetran.
  • Data Warehousing: In-depth experience with Amazon Redshift, including schema design, performance tuning, and maintenance. Knowledge of other data warehousing solutions like Snowflake or BigQuery is a plus.
  • Database Management: Solid understanding of both NoSQL (MongoDB) and SQL databases. Experience with data migration, data modeling, and query optimization.
• Programming Skills: Proficiency in Python and SQL for data manipulation, automation, and query writing. Familiarity with shell scripting (e.g., Bash) for task automation.
• Analytics and BI Tools: Hands-on experience with data visualization and BI tools such as Metabase, Superset, Tableau, or Looker. Ability to create insightful dashboards and reports that drive business decisions.
• Cloud Infrastructure: Experience with cloud platforms like AWS, including services like S3, Lambda, and Redshift. Knowledge of cloud-based data storage and processing frameworks.
• Data Quality and Governance: Expertise in implementing data quality checks, validation, and data governance best practices. Experience with data cataloging tools and metadata management is a plus.
• Data Security: Understanding of data security practices, including encryption, access control, and compliance with data protection regulations like GDPR or HIPAA.
• Agile Methodologies: Experience working in Agile environments, participating in sprint planning, and delivering work in iterative cycles.
• Soft Skills: Strong problem-solving abilities, excellent communication skills, and the ability to work effectively in a collaborative, fast-paced environment. Capable of mentoring junior engineers and providing technical leadership within the team.
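By way of illustration for the pipeline and dependency-management skills above, a minimal Airflow 2.x DAG wiring the hypothetical extract, load, and check functions from the earlier sketches into a dependency chain; all connection strings, module paths, and names below are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical module holding the functions sketched earlier.
from pipeline import etl

BUCKET, KEY = "analytics-staging", "orders/latest.ndjson"
DSN = "redshift-dsn-placeholder"
IAM_ROLE = "arn:aws:iam::123456789012:role/redshift-copy"  # placeholder ARN

with DAG(
    dag_id="mongo_to_redshift_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older releases use schedule_interval
    catchup=False,
) as dag:
    extract = PythonOperator(
        task_id="extract_to_s3",
        python_callable=lambda: etl.extract_to_s3("mongodb://placeholder", BUCKET, KEY),
    )
    load = PythonOperator(
        task_id="copy_into_redshift",
        python_callable=lambda: etl.copy_into_redshift(DSN, BUCKET, KEY, IAM_ROLE),
    )
    check = PythonOperator(
        task_id="run_quality_checks",
        python_callable=lambda: etl.run_checks(DSN),
    )
    # Dependencies: load waits for extract; checks gate anything downstream.
    extract >> load >> check
```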
Preferred Qualifications:
• Real-Time Data Processing: Experience with real-time data streaming and processing technologies such as Apache Kafka, Kinesis, or Spark Streaming (a minimal consumer sketch follows this list).
• Advanced Data Modeling: Expertise in advanced modeling techniques, including star schema, snowflake schema, and data vault modeling.
• DataOps: Familiarity with DataOps practices for continuous integration and deployment of data pipelines, monitoring, and automated testing.
• Automotive Industry Experience: Prior experience in the automotive or related industries is a plus, but not required.
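For the streaming qualification, a minimal kafka-python consumer sketch; the topic, broker address, and group id are placeholders, and a production pipeline would micro-batch events to S3 and COPY them into Redshift rather than print them:

```python
import json

from kafka import KafkaConsumer  # kafka-python; all names below are placeholders


def stream_events() -> None:
    """Consume change events from a hypothetical `orders.changes` topic."""
    consumer = KafkaConsumer(
        "orders.changes",
        bootstrap_servers=["localhost:9092"],
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
        auto_offset_reset="earliest",
        group_id="warehouse-loader",
    )
    for message in consumer:
        print(message.value)  # placeholder for the transform/load step


if __name__ == "__main__":
    stream_events()
```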