Storage Systems in the LLM Era

Moderator: Keith A. Smith, MongoDB
Panelists: Greg Ganger, Carnegie Mellon University; Dean Hildebrand, Google; Glenn Lockwood, Microsoft; Nisha Talagala, Pyxeda; Zhe Zhang, Anyscale

Abstract: 

This one-hour panel discussion will be centered around the new challenges and opportunities brought by revolutionary AI technologies to the storage community in terms of research, system development, management, and education. Panelists will include experienced researchers and entrepreneurs from multiple storage-related systems fields. The panel will be moderated by Keith Smith from MongoDB.

Greg Ganger, Carnegie Mellon University

Greg Ganger is the Jatras Professor of ECE and CS (by courtesy) at Carnegie Mellon University (CMU). Since 2001, he has also served as the Director of CMU's Parallel Data Laboratory (PDL) research center focused on data storage and processing systems. He has broad research interests in computer systems, including storage/file systems, cloud computing, ML systems, distributed systems, and operating systems. He earned his collegiate degrees from the University of Michigan and did a postdoc at MIT before joining CMU. He still loves playing basketball... he's lost a step but developed a sweet 3-point shot.

Dean Hildebrand, Google

Dean Hildebrand is a Technical Director for storage in the Google Cloud Office of the CTO. His interests span the spectrum of distributed storage, and he has spent an inordinate amount of time on NFS, GPFS, and most recently DAOS. Prior to Google, Dean was a Principal Research Staff Member at IBM Research. He completed his Ph.D. in computer science at the University of Michigan in 2007.

Glenn Lockwood, Microsoft

Glenn K. Lockwood is a Principal Engineer at Microsoft, where he is responsible for supporting Microsoft's largest AI supercomputers through workload-driven systems design. His work has focused on applied research and development in extreme-scale and parallel computing systems for high-performance computing, and he has specific expertise in scalable architectures, performance modeling, and emerging technologies for I/O and storage. Prior to joining Microsoft, Glenn led the design and validation of several large-scale storage systems, including the world's first 30+ PB all-NVMe Lustre file system for the Perlmutter supercomputer at NERSC. He has also authored numerous peer-reviewed papers and reports on HPC storage topics and contributed to various reviews and advisory roles in U.S. and European research programs. He holds a Ph.D. in Materials Science from Rutgers University.

Nisha Talagala, Pyxeda

Nisha Talagala is the CEO and founder of AIClub.World, which is bringing AI Literacy to K-12 students and individuals worldwide. Nisha has significant experience in introducing technologies like Artificial Intelligence to new learners from students to professionals. Previously, Nisha co-founded ParallelM (acquired by DataRobot), which pioneered the MLOps practice of managing Machine Learning in production for enterprises. Nisha is a recognized leader in the operational machine learning space, having also driven the USENIX Operational ML Conference, the first industry/academic conference on production AI/ML. Nisha was previously a Fellow at SanDisk and a Fellow/Lead Architect at Fusion-io, where she worked on innovation in non-volatile memory technologies and applications. Nisha has over 20 years of expertise in enterprise software development, distributed systems, technical strategy, and product leadership. She has worked as technology lead for server flash at Intel—where she led server platform non-volatile memory technology development, storage-memory convergence, and partnerships. Prior to Intel, Nisha was the CTO of Gear6, where she designed and built clustered computing caches for high-performance I/O environments. Nisha earned her Ph.D. at UC Berkeley where she did research on clusters and distributed systems. Nisha holds 75 patents in distributed systems and software, over 25 refereed research publications, is a frequent speaker at industry and academic events, and is a contributing writer to Forbes and other publications.

Zhe Zhang, Anyscale

Zhe Zhang is currently the Head of Open Source Engineering (Ray.io project) at Anyscale. Before Anyscale, Zhe spent 4.5 years at LinkedIn, where he managed the Hadoop/Spark infra team. He has been working on open source for about 10 years; he's a committer and PMC member of the Apache Hadoop project, and a member of the Apache Software Foundation.

BibTeX
@conference {294824,
author = {Greg Ganger and Dean Hildebrand and Glenn Lockwood and Nisha Talagala and Zhe Zhang},
title = {Storage Systems in the {LLM} Era},
year = {2024},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = feb
}
