Arcesium

SRE Lead - Distributed Systems

Posted 2 Days Ago
2 Locations
Senior level
As an SRE Lead at Arcesium, you will lead the deployment and maintenance of a high-availability distributed system, automate infrastructure and application deployment, contribute to the system architecture, and ensure performance. You will also program within the core application and monitor system health.
The summary above was generated by AI

Company Overview

Arcesium is a global financial technology firm that solves complex data-driven challenges faced by some of the world’s most sophisticated financial institutions. We constantly innovate our platform and capabilities to meet tomorrow’s challenges, anticipate the risks our clients encounter, and design advanced solutions to help our clients achieve transformational business outcomes.

Financial technology is a high-growth industry as change and innovation continue to disrupt the status quo and prompt major transformation. Arcesium is at a particularly interesting time in its own growth as we look to leverage our established market position and expand operations in pursuit of strategic new business opportunities. We value intellectual curiosity, proactive ownership, and collaboration with colleagues, and we empower you to contribute meaningfully from day one and accelerate your professional development.

Arcesium seeks a highly skilled Site Reliability Engineer to join our Technology team. You will be working as part of a cross-functional product team to create elegant solutions to highly complex and intricate business challenges.

What you'll do:

  • Working with the rest of the team to deploy, maintain, and run a highly available, multi-tenant distributed system
  • Automating both the infrastructure creation and the application deployment to that environment
  • Contributing to the design/architecture of the system
  • Programming in the core application (e.g., instrumenting code with monitoring metrics, setting up traces, shipping and organizing logs)
  • Ensuring the system performs as intended

What you'll need:

  • At least 6 years of experience in an SRE/Operations/DevOps role running distributed systems in production
  • Experience with automated provisioning and management of AWS infrastructure and services
  • Strong knowledge of Linux systems internals and administration
  • Deep experience with Kubernetes and Docker
  • Experience automating the software dev/test/deployment lifecycle with continuous integration and continuous deployment
  • Experience with scaling, monitoring, and troubleshooting actively running systems
  • Ability to program in Java, C++, or C#
  • Comfort with configuration management tools: Ansible, Chef, Puppet, etc.
  • Other technologies: Fluentd, key-value datastores, API management/service meshes, Git, key management

Arcesium and its affiliates do not discriminate in employment matters on the basis of race, color, religion, gender, gender identity, pregnancy, national origin, age, military service eligibility, veteran status, sexual orientation, marital status, disability, or any other category protected by law. Note that for us, this is more than just legal boilerplate. We are genuinely committed to these principles, which form an important part of our corporate culture, and are eager to hear from extraordinarily well-qualified individuals with a wide range of backgrounds and personal characteristics.

Top Skills

C#
C++
Java
