
Quantiphi

Tech Architect - Platform (MLOps)

Sorry, this job was removed at 08:14 p.m. (IST) on Tuesday, Apr 29, 2025
In-Office
3 Locations


While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in fostering a culture built on transparency, diversity, integrity, learning, and growth.
If an environment that encourages you to innovate and excel, both professionally and personally, interests you, you will enjoy your career with Quantiphi!

Role : Technical Architect - Platform (MLOps)

Experience Level : 8 to 12 Years 

Location : Mumbai / Bangalore (Hybrid)

Roles and Responsibilities:

  • Orchestrating LLM Workflows & Development: Design, implement, and scale the underlying platform that supports GenAI workloads, whether real-time or batch, ranging from fine-tuning and distillation to inference.

  • LLMOps (LLM Operations): Build and manage operational pipelines for training, fine-tuning, and deploying LLMs such as Llama, Mistral, GPT-3/4, BERT, or similar. Ensure smooth integration of these models into production systems.

  • GPU Optimization: Optimize GPU utilization and resource management for AI workloads, ensuring efficient scaling, low latency, and high throughput in model training and inference. Develop techniques to manage multi-GPU systems for high-performance computation, and maintain a clear understanding of LLM parallelization and other inference optimization techniques.

  • Infrastructure Design & Automation: Design, deploy, and automate scalable, secure, and cost-effective infrastructure for training and running AI models. Work with cloud providers (AWS, GCP, Azure) to provision the necessary resources, implement auto-scaling, and manage distributed training environments.

  • Platform Reliability & Monitoring: Implement robust monitoring systems to track the performance, health, and efficiency of deployed AI models and workflows. Troubleshoot issues in real time and optimize system performance for seamless operations. Transferable experience from monitoring traditional production software is acceptable; monitoring experience with ML/GenAI workloads is preferred.

  • Database Knowledge: Good knowledge of database concepts, from performance tuning and RBAC to sharding, along with exposure to a range of database types from relational to object and vector databases, is preferred.

  • Collaboration with AI/ML Teams: Work closely with data scientists, machine learning engineers, and product teams to understand and support their platform requirements, ensuring the infrastructure is capable of meeting the needs of AI model deployment and experimentation.

  • Security & Compliance: Ensure that platform infrastructure is secure, compliant with organizational policies, and follows best practices for managing sensitive data and AI model deployment.
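The auto-scaling responsibility above can be illustrated with a simple capacity rule: size the GPU worker pool to the request backlog, clamped to safe bounds. This is an illustrative sketch only; the function name, parameters, and bounds are assumptions, not any cloud provider's API.

```python
# Hypothetical capacity rule for scaling GPU inference workers from queue
# depth. The function name, parameters, and bounds are illustrative only.
import math

def desired_workers(queued_requests: int, requests_per_worker: int,
                    min_workers: int = 1, max_workers: int = 8) -> int:
    """Return a worker count that matches demand, clamped to safe bounds."""
    if requests_per_worker <= 0:
        raise ValueError("requests_per_worker must be positive")
    needed = math.ceil(queued_requests / requests_per_worker)
    return max(min_workers, min(max_workers, needed))
```

In practice a rule like this would feed a cloud auto-scaler (e.g., a Kubernetes HPA or a managed instance group) rather than being called directly.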

Required Skills & Qualifications:

  • Experience:

    • 8 to 12 years of experience in platform engineering, DevOps, or systems engineering, with a strong focus on machine learning and AI workloads.

    • Proven experience working with LLM workflows, and GPU-based machine learning infrastructure.

    • Hands-on experience in managing distributed computing systems, training large-scale models, and deploying AI systems in cloud environments.

    • Strong knowledge of GPU architectures (e.g., NVIDIA A100, V100, etc.), multi-GPU systems, and optimization techniques for AI workloads.

  • Technical Skills:

    • Proficiency in Linux systems and command-line tools. Strong scripting skills (Python, Bash, or similar).

    • Expertise in containerization and orchestration technologies (e.g., Docker, Kubernetes, Helm).

    • Experience with cloud platforms (AWS, GCP, Azure); infrastructure-as-code tools such as Terraform, Terragrunt, or similar; and exposure to CI/CD pipeline automation using Jenkins, GitLab, GitHub, etc.

    • Familiarity with machine learning frameworks (TensorFlow, PyTorch, etc.) and deep learning model deployment pipelines. Exposure to vLLM or NVIDIA software stack for data & model management is preferred.

    • Expertise in performance optimization tools and techniques for GPUs, including memory management, parallel processing, and hardware acceleration.
       

  • Soft Skills:

    • Strong problem-solving skills and ability to work on complex system-level challenges.

    • Excellent communication skills, with the ability to collaborate across technical and non-technical teams.

    • Self-motivated and capable of driving initiatives in a fast-paced environment.
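As a rough illustration of the GPU memory-management skill above, serving memory for an LLM can be estimated from weight size plus KV cache. The constants here (fp16 weights at 2 bytes per parameter, a simplified per-token KV-cache formula) are assumptions for illustration, not vendor-published figures.

```python
# Back-of-the-envelope memory estimate for serving a transformer LLM:
# weights (assumed fp16, 2 bytes/parameter) plus a simplified KV cache
# (one K and one V vector per layer per token). Illustrative only.

def serving_memory_gb(params_billions: float, n_layers: int,
                      hidden_size: int, max_batch_tokens: int,
                      bytes_per_elem: int = 2) -> float:
    weights = params_billions * 1e9 * bytes_per_elem
    kv_cache = 2 * n_layers * hidden_size * max_batch_tokens * bytes_per_elem
    return (weights + kv_cache) / 1e9
```

For example, a 7B-parameter model in fp16 needs about 14 GB for weights alone, before KV cache and activation overhead, which is why batch size and context length drive multi-GPU planning.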

Preferred Skills & Qualifications:

  • Experience in building or managing machine learning platforms, specifically for generative AI models or large-scale NLP tasks.

  • Familiarity with distributed computing frameworks (e.g., Dask, MPI, PyTorch DDP) and data pipeline orchestration tools (e.g., AWS Glue, Apache Airflow, etc.).

  • Knowledge of AI model deployment frameworks such as TensorFlow Serving, TorchServe, vLLM, Triton Inference Server.

  • Good understanding of LLM inference and how to optimize self-managed infrastructure.

  • Understanding of AI model explainability, fairness, and ethical AI considerations.

  • Experience in automating and scaling the deployment of AI models on a global infrastructure.
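The deployment frameworks listed above (TensorFlow Serving, TorchServe, vLLM, Triton) all rely on some form of request batching to raise GPU throughput. The sketch below shows only the basic grouping idea; it is not the actual scheduling algorithm of any of those servers.

```python
# Simplified illustration of dynamic batching as used by inference
# servers: drain a request queue into batches no larger than max_batch.
# Not the real scheduler of Triton, vLLM, or TorchServe.
from collections import deque

def drain_into_batches(queue: deque, max_batch: int) -> list:
    batches = []
    while queue:
        take = min(max_batch, len(queue))
        batches.append([queue.popleft() for _ in range(take)])
    return batches
```

Real servers add a wait-time window and, in vLLM's case, continuous batching at the token level, but the throughput/latency trade-off starts from this grouping step.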

Preferred Experience:

  • Prior experience with, or strong awareness of, the NVIDIA ecosystem - Triton Inference Server, CUDA, NVAIE, TensorRT, NeMo, etc.

  • Proficiency with Kubernetes (including the GPU Operator), Linux, and AI deployment and experimentation tools.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

