Lead Engineer, Backend at Givebutter

@ hello@givebutter.com

Lead Engineer, Backend

📅 06/10/2023

Account Executive

💰 $120,000 - $140,000 📅 10/02/2023

Engineering Team Lead, Site Reliability Engineer

💰 $200 - $7 📅 07/26/2024

Job Description

Role Description
Givebutter is hiring a New York City-based Site Reliability Team Lead to
oversee the reliability, scalability, and performance of our systems. As a
Lead SRE, you will be directly responsible for delivering world-class
infrastructure to our users, maturing our operational practices, and leading a
team of skilled engineers. You will report directly to our CTO and carry out
our infrastructure vision while creating a scalable engineering culture that
breeds innovation. You will ensure that we deliver excellent user
experiences in a timely manner while maintaining top-notch security, design,
and performance. You will cultivate a culture of high performance by creating
systems that eliminate roadblocks, establishing processes that incentivize
excellence, and being an expert in site reliability engineering. We have
already built a great foundation, powering hundreds of millions of donations
to more than 10,000 organizations, and you will take this impact much further.

Why join the Givebutter Engineering team?
* Democracy of code - We are a group of engineers that values equal contribution and open discussion of architecture and ideas.
* Not overburdened with meetings - Our engineers manage their own calendars and block off time so they can work uninterrupted.
* Automated CI/CD - Our builds are reproducible and the pipeline is easy to manage. Shipping to production is hands-off, automated, and consistent. Our engineers are focused on solving problems with code.
* Mission-driven, full stop - We work with amazing organizations, non-profits, and charities doing good all over the world.

Responsibilities

* Manage and hire in-house SREs and contractor resources.
* Handle and prioritize incidents, ensuring timely resolution and effective communication.
* Establish and manage key metrics for reliability; set up and maintain alerting systems.
* Automate tasks and manage infrastructure using Infrastructure as Code (IaC) tools and techniques.
* Ensure application scalability and identify performance bottlenecks to optimize system performance.
* Design and implement fault-tolerant and highly available systems to minimize downtime.
* Develop, implement, and regularly test disaster recovery plans to ensure business continuity.
* Conduct capacity planning to anticipate and manage future infrastructure needs.
* Define, measure, and maintain SLOs and SLAs to meet service performance expectations.
* Ensure the security of applications through best practices and conduct regular penetration tests to identify and mitigate vulnerabilities.

Requirements

* 5+ years of experience building and deploying production infrastructure at scale
* 5+ years of experience working with AWS
* Knowledge of PHP
* Aware of trends and best practices in SRE and cloud infrastructure
* 2+ years of experience managing system architecture, ensuring best practices for reliability, performance, and security
* Strong technical leadership, mentorship, and communication skills
* Experience working for a product-led growth company is beneficial
* Experience managing a remote engineering team

Skills:

Best Practices

Capacity Planning

CI/CD

Communication

Disaster Recovery

Security