Tracking Service Infrastructure at Scale

Monday, March 13, 2017 - 2:50pm–3:15pm

John Arthorne, Shopify

Abstract: 

Over the past two years, Shopify built up a strong SRE team focused on the critical systems at the core of the company’s business. At the same time, however, the company was steadily accumulating secondary production services, to the point where we had several hundred running applications, most of which had poorly defined ownership and ad hoc infrastructure. These applications had a patchwork of tooling for build and deployment automation, monitoring, alerting, and load testing, but with very little consistency and many gaps.

Shopify avoided this looming disaster through automation. We built an application that tracked every production service, and linked it to associated owners, source code, and supporting infrastructure. This application, called Services DB, provides a central place where developers can discover and manage their services across various runtime environments. Services DB also provides a platform for measuring progress on infrastructure quality and building out additional tooling to automate manual steps in infrastructure management. Today, Shopify SREs aren’t worried about being woken up in the middle of the night because of failures in poorly maintained applications. Instead, SREs can focus on building automation, and use Services DB to apply it across large groups of services.
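
The abstract does not describe how Services DB is implemented. As a rough illustration only, the idea of one record per service, linked to owners, source code, and supporting infrastructure, might be sketched as below. Every class, field, and method name here is hypothetical and not taken from Shopify's system.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """One tracked production service. All field names are hypothetical."""
    name: str
    owners: list[str] = field(default_factory=list)
    repo_url: str = ""
    # Maps a capability name (e.g. "monitoring") to whether it is in place.
    infrastructure: dict[str, bool] = field(default_factory=dict)


class ServiceRegistry:
    """In-memory stand-in for a central catalog of services (an assumption)."""

    def __init__(self):
        self._services = {}

    def register(self, service):
        # Store the service under its name, replacing any earlier record.
        self._services[service.name] = service

    def unowned(self):
        """Names of services with no recorded owner, sorted alphabetically."""
        return sorted(s.name for s in self._services.values() if not s.owners)

    def coverage(self, capability):
        """Fraction of registered services that have the given capability."""
        if not self._services:
            return 0.0
        have = sum(
            1 for s in self._services.values()
            if s.infrastructure.get(capability, False)
        )
        return have / len(self._services)


# Example data is invented for illustration.
registry = ServiceRegistry()
registry.register(Service(
    name="checkout-api",
    owners=["payments-team"],
    repo_url="https://example.com/checkout-api",
    infrastructure={"monitoring": True, "load_testing": False},
))
registry.register(Service(name="legacy-cron"))
```

With a catalog like this, questions such as "which services have no owner?" (`registry.unowned()`) or "what share of services have monitoring?" (`registry.coverage("monitoring")`) become simple queries, which is the kind of progress measurement the abstract mentions.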

John Arthorne, Shopify

John works on the Shopify Production Engineering team, with a specific focus on creating developer tooling to accelerate application delivery. John is a frequent speaker at technical conferences in both Europe and North America, serves on conference program committees, is a JavaOne Rock Star, and frequently writes blogs and articles on technical topics. His current interests are in tools and practices for infrastructure automation, and in highly scalable cloud architectures. Before joining Shopify, John led a team building cloud-based developer tooling for IBM Bluemix, and was a prominent leader within the Eclipse open source community.

BibTeX
@conference {201797,
author = {John Arthorne},
title = {Tracking Service Infrastructure at Scale},
year = {2017},
address = {San Francisco, CA},
publisher = {USENIX Association},
month = mar
}
