Alpheya

Senior Site Reliability Engineer - Data & Application Reliability

Reposted 16 Days Ago

In-Office

Bengaluru, Bengaluru Urban, Karnataka, IND

Senior level

The Senior SRE will manage and debug data ingestion pipelines, work with ETL/ELT, optimize data monitoring, and automate tasks with scripting.

The summary above was generated by AI

We are a B2B WealthTech startup based in Abu Dhabi and backed by BNY Mellon (America’s oldest bank and the first company to list on the NYSE) and Lunate (a new $50B AUM alternative asset management firm based in Abu Dhabi, UAE). The company has raised $300M to build a state-of-the-art wealth technology platform.

Our mission is to power and grow our clients’ Wealth franchises through differentiated experiences, financial solutions, and insights. Our digital wealth management platform will enable banks and other financial institutions in the Middle East to grow and further penetrate affluent, HNW, and UHNW investor segments.

While still leveraging the capabilities and knowledge of large organizations, our fintech is a startup with truly cross-functional and agile teams.

For more information, please visit www.alpheya.com

Role

We are building an SRE team that owns production reliability end-to-end across data ingestion pipelines + backend services + Kubernetes deployments + observability.

This is not a tickets-only ops role. You will debug complex production issues, ship permanent fixes (code/config), and harden the system so issues don’t repeat. You’ll work across ingestion/ETL (Snowflake and other sources), application services (Go/Node.js), and platform operations (Kubernetes/Helm), with strong emphasis on incident response and reliability engineering.

What you’ll do

Own production reliability for ingestion workflows end-to-end, including SLAs and incident response.
Lead and execute incident response for ingestion + application failures: triage, mitigation, stakeholder comms, and coordination across teams.
Debug and resolve ingestion and data mapping issues (client-specific FinTech files, schema changes, edge cases) and ensure correctness post-fix.
Operate ingestion services/workers on Kubernetes: troubleshoot rollouts, config/secrets, scaling, resource bottlenecks, node/pod issues, and runtime failures.
Handle data recovery safely: replays/backfills, idempotency checks, dedupe strategies, reconciliation queries, and data-quality validation.
Diagnose database issues (PostgreSQL/CNPG): performance bottlenecks, locks, indexing, query tuning, migrations, and operational risks.
Build ingestion + application observability: dashboards and alerts for freshness, throughput, lag, error rate, retries, DLQ volume, processing latency, and per-tenant success metrics.
Drive prevention: improve runbooks/service passports, post-deploy validation, regression testing, and operational standards.
Partner with application/data engineers on schema evolution, data contracts, and reliability patterns (timeouts, retries, backpressure, safe degradation).

Requirements

Strong knowledge of SQL and PostgreSQL.
Ability to debug production issues across data + backend services + infrastructure, not just within one layer.
Working understanding of backend systems in Go (preferred) and Node.js: able to navigate codebases, follow request flows, debug production issues, and contribute small-to-medium fixes (not only scripts).
Working understanding of distributed backend systems and APIs (GraphQL + gRPC/RPC): able to follow request flows across services, identify contract/schema issues, and troubleshoot latency/error patterns end-to-end.
Experience with ETL/ELT pipelines and messaging systems.
Understanding of data formats (CSV, JSON, Parquet).
Familiarity with MySQL and Snowflake.
Exposure to Kubernetes, Docker, and AKS for running data jobs.
Ability to debug ingestion errors and runtime failures.

Good to Have

Experience with Temporal workflows or distributed systems.
Prior exposure to observability stacks (Prometheus, Grafana, Loki, Tempo).
Interest in transitioning towards SRE/Platform engineering.

Top Skills

AKS

BigQuery

Docker

Grafana

Kafka

Kubernetes

MySQL

Postgres

Prometheus

Python

Shell

Snowflake

Spark

SQL
