Saviynt

Senior Site Reliability Engineer

Posted 5 Days Ago

Hybrid

Bengaluru, Bengaluru Urban, Karnataka, IND

Senior level

The Senior Site Reliability Engineer will enhance reliability and performance across Saviynt's AI-driven platform, focusing on AWS and Kubernetes operations, self-healing infrastructure, and operational tooling powered by LLMs.

The summary above was generated by AI

Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs. Built for the AI age, Saviynt is today helping organizations safely accelerate their deployment and usage of AI. Saviynt is recognized as the leader in identity security, with solutions that protect and empower the world’s leading brands, Fortune 500 companies and government institutions. For more information, please visit www.saviynt.com.

We’re a fast-moving AI security company building AI-native infrastructure and applications powered by LLMs and autonomous agents. Our stack is deeply integrated with AWS, Kubernetes, and OpenAI-based systems, and we’re rethinking reliability in a world where software can reason, adapt, and self-heal.

We’re hiring a Senior Site Reliability Engineer to own reliability across our cloud-native and AI-driven platform. You’ll work at the intersection of distributed systems, Kubernetes operations, and LLM-powered automation, building systems that don’t just scale, but think and fix themselves.

WHAT YOU BRING

- 5+ years in SRE / DevOps / Platform Engineering.
- Strong hands-on experience with:
> AWS infrastructure at scale
> Kubernetes (production-grade clusters)
- Proven ability to debug complex distributed systems under pressure.
- Strong coding skills (Python or Go); you build internal platforms and tools.
- Experience implementing monitoring, alerting, and incident management systems.

Bonus (AI / LLM Focus)

- Experience working with LLM APIs such as the OpenAI API.
- Familiarity with agent frameworks like:
> LangChain
> AutoGen
- Built or experimented with:
> AI agents for DevOps / SRE workflows
> Retrieval-Augmented Generation (RAG) systems
> Vector databases (Pinecone, Weaviate, etc.)
- Exposure to AIOps or intelligent automation systems.

WHAT YOU WILL BE DOING

- Own uptime, reliability, and performance of services running on AWS + Kubernetes (EKS).
- Design and implement self-healing infrastructure using automation and AI agents.
- Build LLM-powered operational tooling using APIs such as the OpenAI API for:
> Intelligent alert triage
> Incident summarization
> Root cause analysis
> Runbook automation
- Manage and scale Kubernetes workloads:
> Deployments, autoscaling, resource optimization
> Cluster reliability and cost efficiency
- Build and evolve observability systems:
> Metrics (Prometheus), dashboards (Grafana)
> Logs (ELK / OpenSearch)
> Tracing (OpenTelemetry)
- Define and enforce SLOs, SLAs, and error budgets tied to business metrics.
- Automate infrastructure using Terraform and CI/CD pipelines.
- Lead incident response, postmortems, and continuous reliability improvements.
- Introduce chaos engineering practices to proactively test system resilience.

If required for this role, you will:

- Complete security & privacy literacy and awareness training during onboarding and annually thereafter

- Review (initially and annually thereafter), understand, and adhere to Information Security/Privacy Policies and Procedures such as (but not limited to):

> Data Classification, Retention & Handling Policy

> Incident Response Policy/Procedures

> Business Continuity/Disaster Recovery Policy/Procedures

> Mobile Device Policy

> Account Management Policy

> Access Control Policy

> Personnel Security Policy

> Privacy Policy

Saviynt is an amazing place to work. We are a high-growth Platform as a Service company focused on Identity Authority to power and protect the world at work. You will experience tremendous growth and learning opportunities through challenging yet rewarding work that directly impacts our customers, all within a welcoming and positive work environment. If you're resilient and enjoy working in a dynamic environment, you belong with us!

Saviynt is an equal opportunity employer and we welcome everyone to our team. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Top Skills

AutoGen

AWS

ELK

Grafana

Kubernetes

LangChain

OpenSearch

OpenTelemetry

Prometheus

Python

Terraform

