This site uses cookies. By continuing to use this site you agree to our use of cookies. To find out more, see our Privacy and Cookies policy.
The following article is Open access

Optimization of Large Scale HEP Data Analysis in LHCb

, , , , , and

Published under licence by IOP Publishing Ltd
, , Citation Daniela Remenska et al 2011 J. Phys.: Conf. Ser. 331 072060 DOI 10.1088/1742-6596/331/7/072060

1742-6596/331/7/072060

Abstract

Observation has lead to a conclusion that the physics analysis jobs run by LHCb physicists on a local computing farm (i.e. non-grid) require more efficient access to the data which resides on the Grid. Our experiments have shown that the I/O bound nature of the analysis jobs in combination with the latency due to the remote access protocols (e.g. rfio, dcap) cause a low CPU efficiency of these jobs. In addition to causing a low CPU efficiency, the remote access protocols give rise to high overhead (in terms of amount of data transferred). This paper gives an overview of the concept of pre-fetching and caching of input files in the proximity of the processing resources, which is exploited to cope with the I/O bound analysis jobs. The files are copied from Grid storage elements (using GridFTP), while concurrently performing computations, inspired from a similar idea used in the ATLAS experiment. The results illustrate that this file staging approach is relatively insensitive to the original location of the data, and a significant improvement can be achieved in terms of the CPU efficiency of an analysis job. Dealing with scalability of such a solution on the Grid environment is discussed briefly.

Export citation and abstract BibTeX RIS

Please wait… references are loading.
10.1088/1742-6596/331/7/072060