Alfredo Buttari et al 2007 J. Phys.: Conf. Ser. 78 012028 doi:10.1088/1742-6596/78/1/012028
Alfredo Buttari, Jack Dongarra, Parry Husbands, Jakub Kurzak and Katherine Yelick
Physical constraints such as power, leakage and pin bandwidth are currently driving the HPC industry to produce systems with unprecedented levels of concurrency. In these parallel systems, synchronization and memory operations are becoming considerably more expensive than before. In this work, we study parallel matrix factorization codes and conclude that they need to be re-engineered to avoid unnecessary (and expensive) synchronization. We propose the use of multithreading combined with intelligent schedulers and implement representative algorithms in this style. Our results indicate that this strategy can significantly outperform traditional codes.
07.05.Bx Computer systems: hardware, operating systems, computer languages, and utilities
Issue 1 (2007)