In this episode of Maintainable, Robby is joined by Brian Scanlan, Principal Systems Engineer at Intercom. Brian shares insights into how Intercom has successfully implemented a volunteer-led on-call support system, emphasizing the importance of meaningful alarms and a consistent software environment.
Summary of Topics
- [00:05:32] Introduction to the Guest's Background: Brian Scanlan discusses his role at Intercom and his experience with on-call practices.
- [00:15:10] Developing a Volunteer-Led On-Call System: Brian explains how Intercom transitioned to a volunteer-led on-call system, ensuring that alarms are meaningful and actionable.
- [00:20:00] The Role of Consistent Software Architecture: The impact of Intercom’s monolithic Ruby on Rails architecture on simplifying on-call duties.
- [00:29:46] Managing Technical Debt as Velocity Risks: Brian describes how Intercom manages technical debt through a velocity risk framework.
- [00:38:45] Improving Deployment Processes: The evolution of Intercom’s deployment processes, reducing the time from merge to production.
- [00:43:32] Treating Internal Tools as a Product: The importance of treating internal tools with the same care as external products, focusing on usability and impact.
- [00:50:56] Encouraging Small Wins in Productivity: How Intercom encourages engineers to address small productivity issues to prevent larger problems.
- [00:51:39] Balancing Innovation with Stability: Intercom’s conservative approach to engineering and how it helps maintain a stable product.
Key Takeaways
- Meaningful Alarms: Ensure that all alarms are actionable and represent real or inevitable customer pain.
- Consistent Architecture: A consistent software environment, like Intercom's Ruby on Rails monolith, simplifies on-call duties and allows for greater flexibility across teams.
- Velocity Risk Framework: Managing technical debt by quantifying its impact on velocity helps prioritize the most impactful work.
- Continuous Improvement: Regular reviews and continuous improvement are essential for maintaining a sustainable on-call system.
- Product-Focused Engineering: At Intercom, the emphasis is on building products, not just writing code, ensuring that engineers are focused on delivering value.
Helpful Links
- Intercom's Engineering Site
- Brian's Twitter
- Brian's LinkedIn
- [Book Recommendation] Choice Theory: A New Psychology of Personal Freedom, by William Glasser
Subscribe to Maintainable by searching "Maintainable" wherever you stream your podcasts.
Keep up to date with the Maintainable Podcast by joining the newsletter.
03/29/21 • 53 min