Difference between revisions of "CATS-Dec-5-2014"

From Theory
Line 6: Line 6:
  
 
== Abstract ==
 
== Abstract ==
In the first part of the talk, we discuss Ligra, a shared-memory graph processing system that we recently developed. The framework has two very simple routines, one for mapping over edges and one for mapping over vertices. The routines can be applied to any subset of the vertices, which makes the framework useful for many graph traversal algorithms that operate on subsets of the vertices. Based on recent ideas used in a very fast algorithm for breadth-first search (BFS), the routines automatically adapt to the density of vertex sets.  We implement several algorithms in this framework, including BFS, betweenness centrality, graph radii estimation, graph connectivity, PageRank and single-source shortest paths.  Our algorithms expressed using this framework are very simple and concise, and perform almost as well as highly optimized code. Furthermore, they get good speedups on a 40-core machine and are sometimes significantly more efficient than previously reported results using graph frameworks on machines with many more cores.
+
In the first part of the talk, we will discuss our recent work on developing a practical linear-work parallel algorithm for graph connectivity. Sequentially, connectivity can be done in linear work easily using breadth-first search or depth-first search. There have been many parallel algorithms for connectivity, however the simpler parallel algorithms require super-linear work, and the linear-work polylogarithmic-depth parallel algorithms are very complicated and not amenable to implementation. We address this gap by describing a simple and practical expected linear-work, polylogarithmic-depth parallel algorithm for graph connectivity. Our algorithm is based on a recent parallel algorithm for generating low-diameter graph decompositions by Miller et al., which uses parallel breadth-first searches. We experimentally compare our implementation of the algorithm against the fastest existing parallel connectivity implementations (which are not theoretically linear-work and polylogarithmic-depth) and show that our implementation is competitive for various inputs, and achieves good speedup relative to sequential algorithms. Our algorithm is the first parallel connectivity algorithm that is both theoretically and practically efficient.
  
In the second part of the talk, we discuss Ligra+, which integrates graph compression techniques into Ligra. On average, Ligra+ is able to represent graphs using about half of the space used by Ligra (which stores the graphs uncompressed). On a 40-core machine, the performance of Ligra+ is usually competitive with or faster than Ligra due to its smaller memory footprint.  Our extensive experimental study shows that Ligra+ is able to support graphs using less memory, or much larger graphs using the same amount of memory, while performing as well as or faster than Ligra. As far as we know, Ligra+ is the first graph processing system to support a reasonably broad set of parallel graph algorithms on compressed graphs.
+
In the second part of the talk, we will discuss Ligra, a shared-memory graph processing system that we recently developed. The framework has two very simple routines, one for mapping over edges and one for mapping over vertices. The routines can be applied to any subset of the vertices, which makes the framework useful for many graph traversal algorithms that operate on subsets of the vertices. Based on recent ideas used in a very fast algorithm for breadth-first search (BFS), the routines automatically adapt to the density of vertex sets.  We implement several algorithms in this framework, including BFS, betweenness centrality, graph radii estimation, graph connectivity, PageRank and single-source shortest paths.  Our algorithms expressed using this framework are very simple and concise, and perform almost as well as highly-optimized code. Furthermore, they get good speedups on multicore machines and are sometimes significantly more efficient than previously reported results using graph frameworks on machines with many more cores.

Revision as of 19:51, 2 December 2014

Title[edit]

Large-Scale Parallel Graph Algorithms

Speaker[edit]

Julian Shun

Abstract[edit]

In the first part of the talk, we will discuss our recent work on developing a practical linear-work parallel algorithm for graph connectivity. Sequentially, connectivity can be done in linear work easily using breadth-first search or depth-first search. There have been many parallel algorithms for connectivity, however the simpler parallel algorithms require super-linear work, and the linear-work polylogarithmic-depth parallel algorithms are very complicated and not amenable to implementation. We address this gap by describing a simple and practical expected linear-work, polylogarithmic-depth parallel algorithm for graph connectivity. Our algorithm is based on a recent parallel algorithm for generating low-diameter graph decompositions by Miller et al., which uses parallel breadth-first searches. We experimentally compare our implementation of the algorithm against the fastest existing parallel connectivity implementations (which are not theoretically linear-work and polylogarithmic-depth) and show that our implementation is competitive for various inputs, and achieves good speedup relative to sequential algorithms. Our algorithm is the first parallel connectivity algorithm that is both theoretically and practically efficient.

In the second part of the talk, we will discuss Ligra, a shared-memory graph processing system that we recently developed. The framework has two very simple routines, one for mapping over edges and one for mapping over vertices. The routines can be applied to any subset of the vertices, which makes the framework useful for many graph traversal algorithms that operate on subsets of the vertices. Based on recent ideas used in a very fast algorithm for breadth-first search (BFS), the routines automatically adapt to the density of vertex sets. We implement several algorithms in this framework, including BFS, betweenness centrality, graph radii estimation, graph connectivity, PageRank and single-source shortest paths. Our algorithms expressed using this framework are very simple and concise, and perform almost as well as highly-optimized code. Furthermore, they get good speedups on multicore machines and are sometimes significantly more efficient than previously reported results using graph frameworks on machines with many more cores.