Adobe

Site Reliability Engineer 5

Reposted 5 Days Ago
In-Office
Bangalore, Bengaluru Urban, Karnataka
Expert/Leader
The Site Reliability Engineer will define reliability strategies, build automation frameworks, lead incident response, and mentor teams to enhance system resilience and efficiency.
The summary above was generated by AI

Our Company
Changing the world through digital experiences is what Adobe’s all about. We give everyone—from emerging artists to global brands—everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen. 
We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!

**Adobe Pass** is a leading authentication and authorization platform that enables seamless access to premium TV and video content across devices. It powers “TV Everywhere” experiences by allowing users to sign in with their pay-TV credentials to watch subscribed content from broadcasters and streaming services. Trusted by major media companies, Adobe Pass ensures secure, scalable, and frictionless user authentication, while providing insights and analytics that help content providers deliver personalized and compliant viewing experiences.

System Architecture & Technical Strategy

  • Define and drive the long-term reliability and scalability strategy for the Adobe Pass platform, aligning with product and business goals.

  • Architect large-scale, distributed, and multi-region systems designed for resiliency, observability, and self-healing.

  • Anticipate systemic risks and design proactive mitigation strategies — ensuring zero single points of failure across critical services.

  • Partner with software architecture and infrastructure teams to evolve the platform toward greater reliability, efficiency, and cost optimization.

Automation, Observability & Reliability Engineering

  • Build and champion advanced automation frameworks that enable zero-touch operations across deployment, recovery, and scaling workflows.

  • Introduce AI/ML-based predictive monitoring and anomaly detection systems to anticipate failures before they impact users.

  • Lead organization-wide reliability initiatives — such as chaos engineering, error budgets, and SLO adoption — driving measurable reliability improvements.

  • Continuously refine observability architecture (metrics, traces, logs) to ensure comprehensive, actionable insights into production health.

Incident Response & Operational Excellence

  • Serve as a technical authority during high-impact incidents, guiding cross-functional teams through real-time mitigation and long-term prevention.

  • Establish and enforce best-in-class incident management frameworks that improve MTTR and MTBF and reduce incident recurrence rates.

  • Lead blameless postmortems and translate findings into actionable reliability roadmaps.

  • Drive reliability reviews and operational readiness assessments for all major product launches.

Performance, Scalability & Cost Efficiency

  • Lead large-scale performance tuning and capacity engineering efforts, ensuring optimal resource utilization and cost efficiency across environments.

  • Identify architectural bottlenecks, drive performance benchmarking, and influence platform evolution for better scalability and elasticity.

  • Partner with FinOps and CloudOps to optimize spend while maintaining reliability SLAs and SLOs.

Cross-Team Leadership & Mentorship

  • Mentor and coach SREs and software engineers, cultivating deep reliability-first thinking across teams.

  • Serve as a thought leader in reliability engineering — driving best practices, evangelizing automation-first culture, and influencing technical standards across multiple teams.

  • Collaborate with engineering leaders, PMs, and operations to align priorities, set strategic goals, and deliver on high-impact reliability initiatives.

  • Lead technical deep dives and design reviews, ensuring all systems are built to scale securely and reliably.

Qualifications
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

  • 12+ years of experience in site reliability, production engineering, or large-scale distributed system operations.

  • Proven track record of designing and managing highly available, globally distributed systems in cloud-native environments (AWS, Azure, GCP).

  • Expert-level proficiency in one or more programming/scripting languages (Python, Go, Java, Bash) for automation and tooling.

  • Deep understanding of Kubernetes, microservices, and service mesh architectures.

  • Advanced experience with Infrastructure as Code (Terraform, CloudFormation) and CI/CD automation frameworks.

  • Mastery of observability and monitoring stacks (Prometheus, Grafana, Datadog, OpenTelemetry).

  • Strong expertise in networking, storage, and distributed databases (both SQL and NoSQL).

  • Demonstrated ability to influence architectural decisions and drive reliability strategy across organizations.

  • Exceptional communication, leadership, and stakeholder management skills.

Preferred Qualifications
  • Experience designing reliability frameworks or SRE platforms at scale (error budgets, chaos engineering, reliability reviews).

  • Prior experience in high-traffic or latency-sensitive systems (media streaming, advertising, or real-time platforms).

  • Familiarity with big data ecosystems (Kafka, Spark, Hadoop) and large-scale data ingestion pipelines.

  • Hands-on experience with security, compliance, and governance in production environments (SOC 2, GDPR, ISO 27001).

  • Cloud or Kubernetes certifications (AWS Solutions Architect Professional, CKA/CKAD, GCP Professional Cloud Architect).

  • Published contributions or conference talks on reliability, automation, or distributed systems.

Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.

Adobe aims to make Adobe.com accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email [email protected] or call (408) 536-3015.

Top Skills

AWS
Azure
Bash
CloudFormation
Datadog
GCP
Go
Grafana
Java
Kubernetes
OpenTelemetry
Prometheus
Python
Terraform

Adobe Anekal, Karnataka, IND Office

Prestige Platina Tech Park Marathahalli-Sarjapur Outer Ring Rd, Anekal, Karnataka, India, 560103

Adobe Bangalore, Karnataka, IND Office

Prestige Platina Technology Park, Building 1, Block A, Bangalore, India, 560103

Adobe Bengaluru, Karnataka, IND Office

Prestige Platina Technology Park Building 1, Block A, Bengaluru, Karnataka, India, 560103
