Machine-aware Atomic Broadcast Trees for Multicores

Authors

Stefan Kaestle, Reto Achermann, Roni Haecki, Moritz Hoffmann, Sabela Ramos and Timothy Roscoe

Venue

Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI'16)

Abstract

The performance of parallel programs on multicore machines often critically depends on group communication operations like barriers and reductions being highly tuned to hardware, a task requiring considerable developer skill.

Smelt is a library that automatically builds efficient inter-core broadcast trees tuned to individual machines, using a machine model derived from hardware registers plus micro-benchmarks capturing the low-level machine characteristics missing from vendor specifications.

Experiments on a wide variety of multicore machines show that near-optimal tree topologies and communication patterns are highly machine-dependent, but can nevertheless be derived by Smelt and often further improve performance over well-known static topologies.

Furthermore, we show that the broadcast trees built by Smelt can be the basis for complex group operations like global barriers or state machine replication, and that the hardware tuning provided by the underlying tree is sufficient to deliver performance as good as or better than state-of-the-art approaches: the higher-level operations require no further hardware optimization.

Bibtex

@inproceedings{Kaestle:2016:MAB,
 author = {Kaestle, Stefan and Achermann, Reto and Haecki, Roni and Hoffmann, Moritz and Ramos, Sabela and Roscoe, Timothy},
 booktitle = {Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation},
 id = {Kaestle:2016:MAB},
 isbn = {978-1-931971-33-1},
 location = {Savannah, GA, USA},
 pages = {33--48},
 publisher = {USENIX Association},
 series = {OSDI'16},
 title = {Machine-aware Atomic Broadcast Trees for Multicores},
 url = {http://dl.acm.org/citation.cfm?id=3026877.3026881},
 year = {2016}
}

Contact

Prof. Reto Achermann
I01: Chair of Distributed Systems and Operating Systems (aka Systems Research Group)
1st Floor, 7th Finger
School of Computation, Information, and Technology (CIT)
Technical University of Munich (TUM)
Boltzmannstr. 3
85748 Garching bei München
Germany

firstname.lastname [at] cit.tum.de