SRE Classroom, or How to Build a Distributed System in 3 Hours

Salim Virji, Laura Nolan, and Phillip Tischler, Google LLC

Abstract: 

This workshop ties together academic and practical aspects of systems engineering, with an emphasis on applying principles of systems design to a production service. We will analyze the service to quantify its performance, and iteratively improve the design.

Participants will work together in small groups to sketch out the design and identify its components and their relationships.

Participants will have a system design and bill of materials at the conclusion of this workshop.

Participants will not need laptops or specific coding experience; participants will need enthusiasm for collaborating in small groups, and for discussion-based problem-solving.

Pre-Reading List:

  • CAP Twelve Years Later
  • Raft
  • Distributed systems in production environments
    • The Google File System
    • The Chubby Lock Service for Loosely-Coupled Distributed Systems
  • Microservices
  • Latency Numbers That Are Good To Know
  • SRE Book Chapters
    • Service Level Objectives
    • Load Balancing at the Frontend
    • Load Balancing in the Datacenter
    • Managing Critical State: Distributed Consensus for Reliability
    • Data Integrity: What You Read Is What You Wrote

Prerequisites, Skills, and Tools:

  • Participants will need enthusiasm for collaborating in small groups, and for discussion-based problem-solving
  • Participants will work together, using pencil and paper
  • Participants ought to be familiar with order-of-magnitude comparisons
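
As a rough illustration of the order-of-magnitude comparisons mentioned above, the following Python sketch sizes one line of a bill of materials. It is not part of the workshop material: the function names and every input figure (peak load, per-server throughput, spare count) are hypothetical and chosen only to show the arithmetic, which can just as easily be done with pencil and paper.

```python
import math


def order_of_magnitude(value: float) -> int:
    """Return the exponent of the largest power of ten not exceeding a positive value."""
    return math.floor(math.log10(value))


def servers_needed(peak_qps: float, qps_per_server: float, spare: int = 1) -> int:
    """Servers required to carry the peak load, plus spare machines for failures."""
    return math.ceil(peak_qps / qps_per_server) + spare


# Hypothetical inputs, assumed purely for illustration.
PEAK_QPS = 50_000        # assumed peak queries per second for the whole service
QPS_PER_SERVER = 1_200   # assumed sustainable queries per second on one server
SPARE_SERVERS = 2        # assumed number of extra machines kept for failures

if __name__ == "__main__":
    gap = order_of_magnitude(PEAK_QPS) - order_of_magnitude(QPS_PER_SERVER)
    print(f"Load exceeds one server's capacity by about {gap} order(s) of magnitude")
    print(f"Servers to provision: {servers_needed(PEAK_QPS, QPS_PER_SERVER, SPARE_SERVERS)}")
```

Comparing the exponents first shows at a glance that the answer is tens of servers rather than hundreds; the exact division then fills in the bill-of-materials line.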

Salim Virji, Google LLC

Salim Virji is a Site Reliability Engineer at Google, where he has built distributed compute, consensus, and storage systems.

BibTeX
@conference {212983,
author = {Salim Virji and Laura Nolan and Phillip Tischler},
title = {{SRE} Classroom, or How to Build a Distributed System in 3 Hours},
year = {2018},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = mar
}