Data Scientist (AI Vector DB Engineer)

Posted 19 Hours Ago
Bangalore, Bengaluru, Karnataka
Hybrid
7+ Years Experience
Artificial Intelligence • Cloud • Internet of Things • Software • Cybersecurity • Industrial
Cat Digital uses digital technologies to help Caterpillar Inc. customers build a better world.
The Role
Seeking a skilled Data Scientist (AI Vector DB Engineer) to design, implement, and optimize vector databases for high-performance, large-scale data processing and retrieval. Requires 7-8 years of relevant vector database experience and 10 years of total experience. Must have a deep understanding of and hands-on experience with vector databases, plus strong programming skills in Python, C++, or Java.
Summary Generated by Built In

Career Area:
Business Technologies, Digital and Data
Job Description:
Your Work Shapes the World at Caterpillar Inc.
When you join Caterpillar, you're joining a global team who cares not just about the work we do - but also about each other. We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here - we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.
Job Summary
We are seeking a skilled Data Scientist (AI Vector DB Engineer) to join the Applications Development and Support - CAT IT team.
As a Vector DB Engineer, you will be responsible for designing, implementing, and optimizing vector databases that enable high-performance, large-scale data processing and retrieval. You will work closely with our data science, machine learning, and software engineering teams to build robust solutions that support our clients' data-intensive applications.
This role is preferably based at the Bangalore (Whitefield) office or the Chennai (Brigade World Trade Center) office.
What you will do

  • Design, implement, and manage vector databases to support large-scale data storage and retrieval, ensuring low latency and high availability.
  • Develop efficient data models that facilitate fast vector operations such as similarity search, nearest neighbor search, and other vector-based queries.
  • Optimize database performance through indexing, partitioning, sharding, and other techniques to handle large-scale datasets.
  • Integrate vector databases with existing systems and applications, ensuring seamless data flow and accessibility.
  • Design and implement solutions that scale with growing data volumes, ensuring the database infrastructure can handle increased load and complexity.
  • Implement security best practices to protect data at rest and in transit, including encryption, access controls, and audit logging.
  • Monitor database performance and troubleshoot issues as they arise, ensuring system reliability and availability.
  • Work closely with data scientists, machine learning engineers, and software developers to understand their needs and provide database solutions that meet their requirements.
  • Maintain comprehensive documentation for database schemas, configurations, and procedures to support operational excellence and knowledge sharing.


What you will have

  • A four-year, full-time bachelor's degree is required.
  • 7-8 years of relevant experience with vector databases and 10 years of overall experience are required.
  • Deep understanding and hands-on experience with vector databases, including their architecture, query languages, and optimization techniques.
  • Strong programming skills in languages such as Python, C++, or Java, with experience in developing and optimizing database operations.
  • Solid understanding of data structures, algorithms, and computational geometry, particularly as they relate to vector search and similarity measures.
  • Experience with cloud platforms (e.g., AWS, GCP, Azure) and managed database services.
  • Understanding of machine learning concepts, particularly those related to embedding vectors and similarity searches.
  • Strong problem-solving skills with a focus on performance optimization and scalability.
  • Excellent communication skills, with the ability to articulate complex technical concepts to non-technical stakeholders.


Additional Information:
Shift Time: 01:00 PM - 10:00 PM IST
Skills desired:
Business Statistics: Knowledge of the statistical tools, processes, and practices to describe business results in measurable scales; ability to use statistical tools and processes to assist in making business decisions.
Level Working Knowledge:
  • Explains the basic decision process associated with specific statistics.
  • Works with basic statistical functions on a spreadsheet or a calculator.
  • Explains reasons for common statistical errors, misinterpretations, and misrepresentations.
  • Describes characteristics of sample size, normal distributions, and standard deviation.
  • Generates and interprets basic statistical data.
Accuracy and Attention to Detail: Understanding the necessity and value of accuracy; ability to complete tasks with high levels of precision.
Level Extensive Experience:
  • Evaluates and makes contributions to best practices.
  • Processes large quantities of detailed information with high levels of accuracy.
  • Productively balances speed and accuracy.
  • Employs techniques for motivating personnel to meet or exceed accuracy goals.
  • Implements a variety of cross-checking approaches and mechanisms.
  • Demonstrates expertise in quality assurance tools, techniques, and standards.
Analytical Thinking: Knowledge of techniques and tools that promote effective analysis; ability to determine the root cause of organizational problems and create alternative solutions that resolve these problems.
Level Working Knowledge:
  • Approaches a situation or problem by defining the problem or issue and determining its significance.
  • Makes a systematic comparison of two or more alternative solutions.
  • Uses flow charts, Pareto charts, fish diagrams, etc. to disclose meaningful data patterns.
  • Identifies the major forces, events and people impacting and impacted by the situation at hand.
  • Uses logic and intuition to make inferences about the meaning of the data and arrive at conclusions.
What you will get:

Work Life Harmony

  • Earned and medical leave.
  • Flexible work arrangements
  • Relocation assistance


Holistic Development

  • Personal and professional development through Caterpillar's employee resource groups across the globe
  • Career development opportunities with global prospects


Health and Wellness

  • Medical coverage: medical, life, and personal accident coverage
  • Employee mental wellness assistance program


Financial Wellness

  • Employee investment plan
  • Pay for performance: annual incentive bonus plan


Additional Information:
Caterpillar is not currently hiring individuals for this position who now or in the future require sponsorship for employment visa status; however, as a global company, Caterpillar offers many job opportunities outside of the U.S. which can be found through our employment website at www.caterpillar.com/careers
Caterpillar is an Equal Opportunity Employer (EEO)
EEO/AA Employer. All qualified individuals, including minorities, females, veterans, and individuals with disabilities, are encouraged to apply.
Posting Dates:
September 18, 2024 - October 1, 2024

Top Skills

C++
Java
Python
The Company
100,000 Employees
Hybrid Workplace
Year Founded: 1925

What We Do

Cat Digital is the digital and technology arm of Caterpillar Inc., responsible for bringing digital capabilities to our world-famous yellow iron. With over one million connected assets worldwide, our teams use data, technology, advanced analytics and AI capabilities to help our customers build a better world.

Why Work With Us

The Cat Digital team is at the forefront of Caterpillar’s evolution. We take pride in solving complex problems by building new systems from the ground up. On our team, you’ll leverage data from across our entire enterprise to find solutions that open a new world of possibilities for our customers and dealers. Join us in building a better tomorrow.

Caterpillar Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Flexible
India