
PERI - auto-tuning memory-intensive kernels for multicore

S Williams1,2, K Datta2, J Carter1, L Oliker1,2, J Shalf1, K Yelick1,2 and D Bailey1



We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to sparse matrix-vector multiplication (SpMV), the explicit heat equation PDE on a regular grid (Stencil), and a lattice Boltzmann application (LBMHD). We explore one of the broadest sets of multicore architectures in the high-performance computing literature, including the Intel Xeon Clovertown, AMD Opteron Barcelona, Sun Victoria Falls, and the Sony-Toshiba-IBM (STI) Cell. Rather than hand-tuning each kernel for each system, we develop a code generator for each kernel that allows us to identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our auto-tuned kernel applications often achieve a better than 4× improvement compared with the original code. Additionally, we analyze a Roofline performance model for each platform to reveal hardware bottlenecks and software challenges for future multicore systems and applications.
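The search-based idea described above can be illustrated with a toy sketch. It is not the authors' code generator: the kernel is a one-dimensional explicit heat-equation step written in plain Python, the only tuning parameter is a loop block size, and the names `heat_step_blocked` and `autotune` and the candidate block sizes are hypothetical choices made for illustration. A real auto-tuner would generate and time compiled kernel variants, where the timing differences are far more meaningful than in interpreted Python.

```python
import time


def heat_step_blocked(u, alpha, block):
    """One explicit heat-equation step on a 1D grid, swept in blocks of `block` points.

    The block size changes only the loop structure, not the result.
    """
    n = len(u)
    out = u[:]  # boundary values are carried over unchanged
    for start in range(1, n - 1, block):
        end = min(start + block, n - 1)
        for i in range(start, end):
            out[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return out


def autotune(kernel, candidates, args, repeats=3):
    """Time kernel(*args, param) for every candidate; return (best_param, timings)."""
    timings = {}
    for param in candidates:
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            kernel(*args, param)
            best = min(best, time.perf_counter() - t0)
        timings[param] = best  # keep the fastest of the repeated runs
    best_param = min(timings, key=timings.get)
    return best_param, timings


# Hypothetical problem size and candidate set, chosen only for illustration.
grid = [0.0] * 2048
grid[1024] = 1.0
best_block, timings = autotune(heat_step_blocked, [16, 64, 256, 1024], (grid, 0.1))
print("fastest block size on this machine:", best_block)
```

The selected block size depends on the machine and on timing noise, which is the point of the approach: the search is rerun on each platform instead of hand-tuning a fixed variant.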


PACS

07.05.Kf Data analysis: algorithms and implementation; data management

02.30.Jr Partial differential equations

07.05.Rm Data presentation and visualization: algorithms and implementation

02.10.Yn Matrix theory

Subjects

Mathematical physics

Instrumentation and measurement

Dates

Issue 1 (2008)



  1. PERI - auto-tuning memory-intensive kernels for multicore

    S Williams et al 2008 J. Phys.: Conf. Ser. 125 012038
