Seraph: Towards Scalable and Efficient Fully-external Graph Computation via On-demand Processing

Authors: 

Tsun-Yu Yang, Yizou Chen, Yuhong Liang, and Ming-Chang Yang, The Chinese University of Hong Kong

Abstract: 

Fully-external graph computation systems exhibit optimal scalability by computing the ever-growing, large-scale graph with constant amount of memory on a single machine. In particular, they keep the entire massive graph data in storage and iteratively load parts of them into memory for computation. Nevertheless, despite the merit of optimal scalability, their unreasonably-low efficiency often makes them uncompetitive, and even unpractical, to the other types of graph computation systems. The key rationale is that most existing fully-external graph computation systems over-emphasize retrieving graph data from storage through sequential access. Although this principle achieves high storage bandwidth, it often causes reading excessive and irrelevant data, which can severely degrade their overall efficiency.

Therefore, this work presents Seraph, a fully-external graph computation system that achieves optimal Scalability while toward satisfactory Efficiency improvement. Particularly, inspired by the modern storage offering comparable sequential and random access speeds, Seraph adopts the principle of on-demand processing to access the necessary graph data for saving I/O while enjoying the decent speed in random access. On the basis of this principle, Seraph further devises three practical designs to bring excellent performance leap to fully-external graph computation: 1) the hybrid format to represent the graph data for striking a good balance between I/O amount and access locality, 2) the vertex passing to enable efficient vertex updates on top of hybrid format, and 3) the selective pre-computation to re-use the loaded data for I/O reduction. Our evaluations reveal that Seraph notably outperforms other state-of-the-art fully-external systems under all the evaluated billion-scale graphs and representative graph algorithms by up to two orders of magnitude.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {294817,
author = {Tsun-Yu Yang and Yizou Chen and Yuhong Liang and Ming-Chang Yang},
title = {Seraph: Towards Scalable and Efficient Fully-external Graph Computation via On-demand Processing},
booktitle = {22nd USENIX Conference on File and Storage Technologies (FAST 24)},
year = {2024},
isbn = {978-1-939133-38-0},
address = {Santa Clara, CA},
pages = {373--387},
url = {https://www.usenix.org/conference/fast24/presentation/yang-tsun-yu},
publisher = {USENIX Association},
month = feb
}

Presentation Video