Table of contents

Volume 119

2008

SOFTWARE COMPONENTS, TOOLS AND DATABASES

Accepted papers received: 23 June 2008
Published online: 31 July 2008

PAPERS

042001
The following article is Open access

Understanding modern particle accelerators requires simulating charged particle transport through the machine elements. These simulations can be very time consuming due to the large number of particles and the need to consider many turns of a circular machine. Stream computing offers an attractive way to dramatically improve the performance of such simulations by calculating the simultaneous transport of many particles using dedicated hardware. Modern Graphics Processing Units (GPUs) are powerful and affordable stream computing devices. The results of simulations of particle transport through the booster-to-storage-ring transfer line of the DIAMOND synchrotron light source using an NVidia GeForce 7900 GPU are compared to the standard transport code MAD. It is found that particle transport calculations are suitable for stream processing and large performance increases are possible. The accuracy and potential speed gains are compared and the prospects for future work in the area are discussed.
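
The computational core of such a simulation is the repeated application of an element's transfer map to a large set of independent particle state vectors, which is what makes it a good match for stream processing. Below is a minimal CPU-side sketch of the first-order (linear) case; the data layout and element list are illustrative, not the MAD or GPU implementation used in the paper.

#include <array>
#include <vector>

// Illustrative 6D phase-space state and a linear (first-order) transfer map.
using State  = std::array<double, 6>;
using Matrix = std::array<std::array<double, 6>, 6>;

// Apply one element's transfer matrix to every particle. Each particle is
// independent, so this loop maps directly onto a per-particle GPU kernel
// (one stream element per particle).
void transport(std::vector<State>& particles, const Matrix& m)
{
    for (State& p : particles) {
        State out{};
        for (int i = 0; i < 6; ++i)
            for (int j = 0; j < 6; ++j)
                out[i] += m[i][j] * p[j];
        p = out;
    }
}

// A transfer line is simply a sequence of such maps applied in order.
void trackLine(std::vector<State>& particles, const std::vector<Matrix>& line)
{
    for (const Matrix& element : line)
        transport(particles, element);
}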

042002
The following article is Open access

The measurement of the muon energy deposition in the calorimeters is an integral part of muon identification, track isolation and the correction for catastrophic muon energy losses, which are the prerequisites to the ultimate goal of refitting the muon track using calorimeter information as well. To this end, an accurate method for measuring the energy loss in the calorimeters has been developed which uses only Event Data Model tools and is used by the muon isolation tool in the official ATLAS software, in order to provide isolation-related variables at the Event Summary Data level. The strategy for measuring the energy deposited by the track in the calorimeters is described. Inner Detector or Muon Spectrometer tracks are extrapolated to each calorimeter compartment using existing tools, which take into account multiple scattering and bending due to the magnetic field. The energy deposited in each compartment is measured by summing up cells, corrected for noise, inside a cone of the desired size around the track. The results of the energy loss measured in the calorimeters with this method are validated with Monte Carlo single-muon samples.
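
A simplified sketch of the cone-based summation described above is given below; the track-impact and cell structures are hypothetical placeholders rather than the ATLAS Event Data Model classes, and the noise treatment is reduced to a simple significance cut.

#include <cmath>
#include <vector>

// Hypothetical stand-ins for the extrapolated track impact point and a
// calorimeter cell; the real implementation uses the ATLAS EDM classes.
struct Impact { double eta, phi; };
struct Cell   { double eta, phi, energy, noise; };

// Wrap an azimuthal difference into (-pi, pi].
double deltaPhi(double a, double b)
{
    double d = a - b;
    while (d >  M_PI) d -= 2.0 * M_PI;
    while (d <= -M_PI) d += 2.0 * M_PI;
    return d;
}

// Sum the noise-corrected energy of all cells of one calorimeter compartment
// inside a cone of size dRmax around the extrapolated track position.
double energyInCone(const Impact& trk, const std::vector<Cell>& cells,
                    double dRmax, double noiseCut = 2.0)
{
    double sum = 0.0;
    for (const Cell& c : cells) {
        const double dEta = c.eta - trk.eta;
        const double dPhi = deltaPhi(c.phi, trk.phi);
        const double dR   = std::sqrt(dEta * dEta + dPhi * dPhi);
        if (dR < dRmax && std::fabs(c.energy) > noiseCut * c.noise)
            sum += c.energy;  // simple |E| > n*sigma noise suppression
    }
    return sum;
}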

042003
The following article is Open access

The size and complexity of the LHC experiments raise unprecedented challenges not only in terms of detector design, construction and operation, but also in terms of software models and data persistency. One of the more challenging tasks is the calibration of the 375,000 Monitored Drift Tubes (MDTs) that will be used as precision tracking detectors in the Muon Spectrometer of the ATLAS experiment. A high rate of muon tracks is needed to reach the design average resolution of 80 microns. In this context, data suitable for MDT calibration will be extracted from the second-level trigger and then streamed to three remote Tier-2 Calibration Centres. The calibration sites will also need the ATLAS conditions data that are relevant for the calculation of MDT calibrations: either the appropriate tables of the Conditions Database will be replicated at the remote sites via Oracle Streams, or the remote sites will directly access these tables from the nearest Tier-1. At each centre, the computation of the actual calibration constants will be performed in several steps, including strict validation and data-quality checks. All information produced at every stage of the calibration procedure will be stored in local Oracle calibration databases that will be replicated to a central database located at CERN using Oracle Streams: this will allow each calibration site to access the data produced by the others and, eventually, to provide back-up should one site become unavailable for any reason. The validated calibration constants will be extracted from the CERN calibration database and stored in the ATLAS Conditions Database for subsequent use in reconstruction and data analysis. This paper reviews the complex chain of databases envisaged to support the MDT calibration and describes the current status of the implementation and the tests being performed to ensure smooth operation at the LHC start-up at the end of this year.

042004
The following article is Open access

Developers of modular projects need to improve the software build process by planning the correct execution order and detecting circular dependencies. The lack of suitable tools may cause delays in the development, deployment and maintenance of the software.

Experience in such projects has shown that version control and build systems alone cannot support the development of the software efficiently, because of the large number of errors, each of which breaks the build process. Common causes of errors are, for example, the adoption of new libraries, library incompatibilities, and the extension of the current project to support new software modules.

In this paper, we describe a possible solution implemented in ETICS, an integrated infrastructure for the automated configuration, build and test of Grid and distributed software. ETICS defines meta-data software abstractions from which it is possible to download, build and test software projects, setting for instance dependencies, environment variables and properties. Furthermore, the meta-data information is managed by ETICS following the version control system philosophy, with a meta-data repository and a set of operations such as check-out and commit. All the information related to a specific piece of software is stored in the repository only when it is considered to be correct.

With this solution we introduce flexibility into the ETICS system, allowing users to work according to their needs. Moreover, with this functionality ETICS behaves like a version control system for the management of the meta-data.
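
As an illustration of the build-order planning and circular-dependency detection mentioned above, the sketch below shows a plain depth-first topological sort over a module dependency graph; the graph representation and module handling are illustrative and not part of the ETICS implementation.

#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Modules and their build-time dependencies (names are illustrative).
using Graph = std::map<std::string, std::vector<std::string>>;

// Depth-first topological sort: produces a valid build order and throws
// if a circular dependency is found.
void visit(const Graph& g, const std::string& mod,
           std::set<std::string>& done, std::set<std::string>& inProgress,
           std::vector<std::string>& order)
{
    if (done.count(mod)) return;
    if (!inProgress.insert(mod).second)
        throw std::runtime_error("circular dependency involving " + mod);
    auto it = g.find(mod);
    if (it != g.end())
        for (const std::string& dep : it->second)
            visit(g, dep, done, inProgress, order);
    inProgress.erase(mod);
    done.insert(mod);
    order.push_back(mod);  // all dependencies are already placed before mod
}

std::vector<std::string> buildOrder(const Graph& g)
{
    std::vector<std::string> order;
    std::set<std::string> done, inProgress;
    for (const auto& node : g)
        visit(g, node.first, done, inProgress, order);
    return order;
}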

042005
The following article is Open access

Database replication is a key topic in the framework of the LHC Computing Grid to allow processing of data in a distributed environment. In particular, the LHCb computing model relies on the LHC File Catalog (LFC), a database which stores information about files spread across the Grid, their logical names and the physical locations of all the replicas. The LHCb computing model requires the LFC to be replicated at Tier-1s. The LCG 3D project deals with the database replication issue and provides a replication service based on Oracle Streams technology. This paper describes the deployment of the LHC File Catalog replication to the INFN National Center for Telematics and Informatics (CNAF) and to other LHCb Tier-1 sites. We performed stress tests designed to evaluate any delay in the propagation of the streams and the scalability of the system. The tests show the robustness of the replica implementation, with performance going well beyond the LHCb requirements.

042006
The following article is Open access

For the last several months the main focus of development in the ROOT I/O package has been code consolidation and performance improvement. We introduced a new pre-fetch mechanism to minimize the number of transactions between client and server, hence reducing the effect of latency on the time it takes to read a file both locally and over a wide area network. We review the implementation and how well it works in different conditions (a gain of an order of magnitude for remote file access). We also briefly describe new utilities, including a faster implementation of TTree cloning (a gain of an order of magnitude), a generic mechanism for object references, and a new entry-list mechanism tuned for both small and large numbers of selections. In addition to reducing the coupling with the core module and becoming its own library (libRIO), as part of the general restructuring of the ROOT libraries, the I/O package has been enhanced in the areas of XML and SQL support, thread safety, schema evolution, tree queries, and many others.
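
A minimal sketch of enabling the TTree read cache when reading a remote file is shown below; the file name, tree name and cache size are illustrative.

#include "TFile.h"
#include "TTree.h"

// Enable the ROOT TTree read cache (pre-fetch) while reading a remote file.
void readWithCache()
{
    TFile* f = TFile::Open("root://server.example.org//data/events.root");
    if (!f || f->IsZombie()) return;

    TTree* tree = nullptr;
    f->GetObject("Events", tree);
    if (!tree) return;

    // With the cache enabled, baskets of the used branches are pre-fetched in
    // large blocks, reducing the number of client-server transactions and
    // hence the impact of network latency.
    tree->SetCacheSize(30 * 1024 * 1024);  // 30 MB read cache

    const Long64_t n = tree->GetEntries();
    for (Long64_t i = 0; i < n; ++i)
        tree->GetEntry(i);   // reads go through the cache

    f->Close();
}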

042007
The following article is Open access

The ROOT [1] graphical framework provides support for many different functions including basic graphics, high-level visualization techniques, output to files, 3D viewing, etc. These functions use well-established standards to render graphics on screen, to produce high-quality output files, and to generate images for Web publishing. Many techniques allow visualization of all the basic ROOT data types, but the graphical framework was still rather weak in the visualization of multi-variable data sets. This paper presents the latest developments in the ROOT framework to visualize data sets with many (>4) variables.
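
A minimal sketch of this kind of multi-variable visualization, assuming the parallel-coordinates ("para") draw option of TTree::Draw, is shown below with toy data; the variable names and numbers are illustrative.

#include "TCanvas.h"
#include "TNtuple.h"
#include "TRandom3.h"

// Visualize a five-variable toy data set with a parallel-coordinates plot:
// each event becomes a polyline crossing one vertical axis per variable.
void parallelCoordsDemo()
{
    TNtuple nt("nt", "toy data", "a:b:c:d:e");
    TRandom3 rng(0);
    for (int i = 0; i < 10000; ++i)
        nt.Fill(rng.Gaus(), rng.Gaus(1, 2), rng.Uniform(),
                rng.Exp(1.0), rng.Gaus(-1, 0.5));

    TCanvas c("c", "Parallel coordinates", 800, 600);
    nt.Draw("a:b:c:d:e", "", "para");   // "para" selects the parallel-coordinates view
    c.SaveAs("parallel_coords.png");
}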

042008
The following article is Open access

The ATLAS Tag Database is an event-level metadata system, designed to allow efficient identification and selection of interesting events for user analysis. By making first-level cuts using queries on a relational database, the size of an analysis input sample, and thus the time taken for the analysis, can be greatly reduced. Deployment of such a Tag database is under way, but to be most useful it needs to be integrated with the distributed data management (DDM) and distributed analysis (DA) components. This means addressing the issue that the ATLAS DDM system groups files into datasets for scalability and usability, whereas the Tag Database points to events in files. It also means setting up a system which can prepare a list of input events and use both the DDM and DA systems to run a set of jobs. The ATLAS Tag Navigator Tool (TNT) has been developed to address these issues in an integrated way and to provide a tool that the average physicist can use. Here, the current status of this work is presented and areas of future work are highlighted.
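
The sketch below illustrates the underlying idea of an event-level pre-selection on tag metadata using a ROOT event list; the file, tree and attribute names are hypothetical placeholders rather than the actual ATLAS TAG schema or the TNT tool itself.

#include <cstdio>
#include "TEventList.h"
#include "TFile.h"
#include "TTree.h"

// First-level cut on event-level metadata stored in a tag tree; a tool like
// TNT would turn the resulting event list into job inputs for distributed
// analysis. "tags.root", "TagTree", "NMuon" and "MissingET" are hypothetical.
void selectFromTags()
{
    TFile* f = TFile::Open("tags.root");
    if (!f || f->IsZombie()) return;

    TTree* tags = nullptr;
    f->GetObject("TagTree", tags);
    if (!tags) return;

    // Build an event list of all events passing the metadata cut.
    tags->Draw(">>selected", "NMuon >= 2 && MissingET > 50000.");
    TEventList* selected = (TEventList*)gDirectory->Get("selected");
    if (selected)
        printf("selected %lld of %lld events\n",
               (long long)selected->GetN(), (long long)tags->GetEntries());
    f->Close();
}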

042009
The following article is Open access

Database demand resulting from offline analysis and production of data at the STAR experiment at Brookhaven National Laboratory's Relativistic Heavy-Ion Collider has steadily increased over the last six years of data taking activities. Each year STAR more than doubles the number of events recorded, and anticipates reaching a billion-event capability as early as next year. The challenges of producing and analyzing this magnitude of events in parallel have raised issues with regard to the distribution of calibrations and geometry data, via databases, to STAR's growing global collaboration. Rapid distribution, availability, ensured synchronization and load balancing have become paramount considerations. Both conventional technology and novel approaches are used in parallel to realize these goals. This paper discusses how STAR uses load balancing to optimize database usage. It discusses distribution methods via MySQL master-slave replication, the synchronization issues that arise from this type of distribution, and the solutions, mostly homegrown, put forth to overcome these issues. A novel approach to load balancing between slave nodes that helps maintain a high availability rate for a voracious community is discussed in detail. This load balancing addresses both pools of nodes internal to a given location and the balancing of load for remote users between different available locations. Challenges, trade-offs, rationale for decisions and paths forward are discussed in all cases, presenting a solid production environment with a vision for scalable growth.

042010
The following article is Open access

The ETICS system is a distributed software configuration, build and test system designed to fulfil the needs of improving the quality, reliability and interoperability of distributed software in general and grid software in particular. The ETICS project is a consortium of five partners (CERN, INFN, Engineering Ingegneria Informatica, 4D Soft and the University of Wisconsin-Madison). The ETICS service consists of a build and test job execution system based on the Metronome software and an integrated set of web services and software engineering tools to design, maintain and control build and test scenarios. The ETICS system allows taking into account complex dependencies among applications and middleware components and provides a rich environment to perform static and dynamic analysis of the software and execute deployment, system and interoperability tests. This paper gives an overview of the system architecture and functionality set and then describes how the EC-funded EGEE, DILIGENT and OMII-Europe projects are using the software engineering services to build, validate and distribute their software. Finally a number of significant use and test cases will be described to show how ETICS can be used in particular to perform interoperability tests of grid middleware using the grid itself.

042011
The following article is Open access

We discuss the rapid development of a large-scale data discovery service for the CMS experiment using modern AJAX techniques and the Python language. To implement a flexible interface capable of accommodating several different versions of the DBS database, we used a 'stack' approach. Asynchronous JavaScript and XML (AJAX) together with an SQL abstraction layer, template engine, code generation tool and dynamic queries provide powerful tools for constructing interactive interfaces to large amounts of data. We show how the use of these tools, with rapid development in a modern scripting language, improved the scalability and usability of the search interface for different user communities.

042012
The following article is Open access

As we near the collection of the first data from the Large Hadron Collider, the ATLAS collaboration is preparing the software and computing infrastructure to allow quick analysis of the first data and support of the long-term steady-state ATLAS physics program. As part of this effort considerable attention has been paid to the 'Analysis Model', a vision of the interplay of the software design, computing constraints, and various physics requirements. An important input to this activity has been the experience of Tevatron and B-Factory experiments, one topic which was explored and discussed in the ATLAS October 2006 Analysis Model workshop. Recently, much of the Analysis Model has focused on ensuring the ATLAS software framework supports the required manipulations of event data; the event data design and content is consistent with foreseen calibration and physics analysis tasks; the event data is optimized in size, access speed, and is accessible both inside and outside the software framework; and that the analysis software may be developed collaboratively.

042013
The following article is Open access

The ATLAS Tile Calorimeter detector is presently involved in an intense phase of subsystems integration and commissioning with muons of cosmic origin. Various monitoring programs have been developed at different levels of the data flow to tune the set-up of the detector running conditions and to provide a fast and reliable assessment of the data quality already during data taking. This paper focuses on the monitoring system integrated in the highest level of the ATLAS trigger system, the Event Filter, and its deployment during the Tile Calorimeter commissioning with cosmic ray muons. The key feature of Event Filter monitoring is the capability of performing detector and data quality control on complete physics events at the trigger level, hence before events are stored on disk. In ATLAS' online data flow, this is the only monitoring system capable of giving a comprehensive event quality feedback.

042014
The following article is Open access

The ROOT geometry modeller (TGeo) offers powerful tools for detector geometry description. The package provides several functionalities, such as navigation, geometry checking, enhanced visualization, a geometry editing GUI and many others, using ROOT I/O for persistency. A new interface module, g4root, was recently developed to take advantage of ROOT geometry navigation optimizations in the context of GEANT4 simulation. The interface can be used either by native GEANT4-based simulation applications or in the more general context of the Virtual Monte Carlo (VMC) framework developed by the ALICE offline and ROOT teams. The latter allows running GEANT3, GEANT4 and FLUKA simulations without changing either the geometry description or the user code. The interface was tested and stressed in the context of the ALICE simulation framework. A description of the interface and its usage, as well as recent results in terms of reliability and performance, will be presented. Benchmarks comparing ROOT/TGeo-based and GEANT4-based navigation will also be shown.
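
A minimal sketch of using the TGeo navigator from user code is shown below; the geometry file name and coordinates are illustrative, and this is the kind of navigation that g4root makes available to GEANT4- and VMC-based simulations.

#include <cstdio>
#include "TGeoManager.h"
#include "TGeoNode.h"
#include "TGeoVolume.h"

// Load a ROOT/TGeo geometry and perform basic navigation queries.
void navigateGeometry()
{
    // Import a geometry previously written with ROOT I/O.
    TGeoManager::Import("detector_geometry.root");
    if (!gGeoManager) return;

    // Locate the volume containing a given point (coordinates in cm).
    TGeoNode* node = gGeoManager->FindNode(10., 0., 250.);
    if (node)
        printf("point is inside volume %s\n", node->GetVolume()->GetName());

    // Step along a direction to the next volume boundary.
    gGeoManager->SetCurrentDirection(0., 0., 1.);
    gGeoManager->FindNextBoundaryAndStep();
    printf("step to next boundary: %g cm\n", gGeoManager->GetStep());
}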

042015
The following article is Open access

The Database and Engineering Services Group of CERN's Information Technology Department supplies the Oracle Central Database services used in many activities at CERN. In order to provide high availability and ease of management for those services, a NAS (Network Attached Storage) based infrastructure has been set up. It runs several instances of Oracle RAC (Real Application Clusters) using NFS (Network File System) as shared disk space for RAC purposes and data hosting. It is composed of two private LANs (Local Area Networks), one providing access to the NAS filers and a second implementing the Oracle RAC private interconnect, both using network bonding. The NAS filers are configured in partnerships to avoid single points of failure and to provide automatic filer fail-over.

042016
The following article is Open access

The Virtual Geometry Model (VGM) was introduced at CHEP 2004 [1], where its concept, based on abstract interfaces to geometry objects, was presented. Since then it has evolved to a design based on pure abstract interfaces and has been consolidated and completed with more advanced features. Currently it is used in Geant4 VMC to support TGeo geometry definition with Geant4 native geometry navigation, and it has recently been used in the validation of the G4Root tool.

The implementation of the VGM for a concrete geometry model represents a small layer between the VGM and the particular native geometry. In addition to the implementations for the Geant4 and Root TGeo geometry models, a third one, for AGDD, has now been added; together with the existing XML exporter, this makes the VGM the most advanced tool for exchanging geometry formats, providing nine conversion paths between the Geant4, TGeo, AGDD and GDML models.

In this presentation we give an overview and the present status of the tool, review the supported features and point out possible limitations in converting geometry models.

042017
The following article is Open access

This paper describes the software component, perfmon2, that is about to be added to the Linux kernel as the standard interface to the Performance Monitoring Unit (PMU) on common processors, including x86 (AMD and Intel), Sun SPARC, MIPS, IBM Power and Intel Itanium. It also describes a set of tools for doing performance monitoring in practice and details how the CERN openlab team has participated in the testing and development of these tools.

042018
The following article is Open access

The C++ reconstruction framework JANA has been written to support the next generation of Nuclear Physics experiments at Jefferson Lab in anticipation of the 12 GeV upgrade. The JANA framework was designed to allow multi-threaded event processing with a minimal impact on developers of reconstruction software. As we enter the multi-core (and soon many-core) era, thread-enabled code will become essential to exploiting the full processor power available without incurring the logistical overhead of managing many individual processes. Event-based reconstruction lends itself naturally to multi-threaded processing. Emphasis is placed on the multi-threading features of the framework. Test results of the scaling of event-processing rates with the number of threads are presented.
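
A generic sketch of the event-parallel pattern described above (not the actual JANA API) is shown below: worker threads claim events from a shared counter and run the full per-event reconstruction independently, so the user's reconstruction code stays single-threaded.

#include <atomic>
#include <thread>
#include <vector>

// Placeholder event and per-event reconstruction; in a real framework the
// event would hold raw data and the reconstructed objects derived from it.
struct Event { long number; };

void reconstruct(Event& evt) { /* user reconstruction code, one event at a time */ }

// Process all events with nThreads workers; each worker atomically claims
// the next unprocessed event index, so no two threads share an event.
void processAll(std::vector<Event>& events, unsigned nThreads)
{
    std::atomic<size_t> next{0};
    auto worker = [&]() {
        for (size_t i = next.fetch_add(1); i < events.size();
             i = next.fetch_add(1))
            reconstruct(events[i]);   // no locking needed: one event per thread
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t)
        pool.emplace_back(worker);
    for (auto& th : pool)
        th.join();
}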

042019
The following article is Open access

Goodness-of-fit statistics measure the compatibility of random samples against some theoretical or reference probability distribution function. The classical one-dimensional Kolmogorov-Smirnov test is a non-parametric statistic for comparing two empirical distributions, which defines the largest absolute difference between the two cumulative distribution functions as a measure of disagreement. Adapting this test to more than one dimension is a challenge because there are 2^d - 1 independent ways of ordering a cumulative distribution function in d dimensions. We discuss Peacock's version of the Kolmogorov-Smirnov test for two-dimensional data sets, which computes the differences between cumulative distribution functions in 4n^2 quadrants. We also examine Fasano and Franceschini's variation of Peacock's test, Cooke's algorithm for Peacock's test, and ROOT's version of the two-dimensional Kolmogorov-Smirnov test. We establish a lower bound of Ω(n^2 lg n) on the work required to compute Peacock's test, introduce optimal algorithms for both this test and Fasano and Franceschini's, and show that Cooke's algorithm is not a faithful implementation of Peacock's test. We also discuss and evaluate parallel algorithms for Peacock's test.
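
For reference, the classical one-dimensional two-sample statistic described above can be computed with a single merge-style pass over the sorted samples, as in the sketch below; the d-dimensional generalizations discussed in the paper are expensive precisely because all 2^d - 1 orderings must be considered.

#include <algorithm>
#include <cmath>
#include <vector>

// Two-sample Kolmogorov-Smirnov statistic: the largest absolute difference
// between the two empirical cumulative distribution functions.
double ksStatistic(std::vector<double> a, std::vector<double> b)
{
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    const size_t na = a.size(), nb = b.size();
    size_t i = 0, j = 0;
    double fa = 0.0, fb = 0.0, d = 0.0;
    while (i < na && j < nb) {
        const double va = a[i], vb = b[j];
        if (va <= vb) fa = double(++i) / na;   // step the CDF of sample a
        if (vb <= va) fb = double(++j) / nb;   // step the CDF of sample b (ties: both)
        d = std::max(d, std::fabs(fa - fb));
    }
    return d;
}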

042020
The following article is Open access

During the construction and commissioning phases of the ATLAS detector, data related to the installation, placement, testing and performance of the equipment are stored in relational databases. Each group acquires and saves information in different servers, using diverse technologies, data modeling and terminologies. Installation and maintenance during the experiment's construction and operation depend on access to this information, and also imply updating it. Developing separate retrieval and update systems for each data set would require too much effort and have a high maintenance cost. The Glance system retrieves and inserts/updates data independently of the modeling and technology used for the storage, recognizes the repositories' internal structure and guides the user through the creation of search and insertion interfaces. Distinct and spread-out data sets can be transparently integrated in one interface. Data can be exported to and imported from various formats. The system handles many independent interfaces, which can be accessed by users or other applications at any time. This paper describes the Glance conception, its development and its features. The system usage is illustrated with examples. Current status and future work are also discussed.

042021
The following article is Open access

This article describes the set of computer systems that support data analysis and quality control during the Tile Calorimeter commissioning phase. The Tile Commissioning Web System (TCWS) encapsulates the steps needed to retrieve information, execute programs, access the outcomes, register statements and verify the equipment status. TCWS integrates different applications, each one presenting a particular view of the commissioning process. The TileComm Analysis application stores plots and analysis results, provides equipment-oriented visualization, collects information regarding the equipment performance, and outlines its status in each test. The Timeline application presents the equipment status history in chronological order. The Web Interface for Shifters supports monitoring tasks by managing test parameters, graphical views of the detector's performance, and status information for all equipment used in each test. The DCS Web System provides a standard way to verify the behaviour of power sources and the cooling system.

042022
The following article is Open access

In the ATLAS event store, files are sometimes 'an inconvenient truth.' From the point of view of the ATLAS distributed data management system, files are too small—datasets are the units of interest. From the point of view of the ATLAS event store architecture, files are simply a physical clustering optimization: the units of interest are event collections—sets of events that satisfy common conditions or selection predicates—and such collections may or may not have been accumulated into files that contain those events and no others. It is nonetheless important to maintain file-level metadata, and to cache metadata in event data files. When such metadata may or may not be present in files, or when values may have been updated after files are written and replicated, a clear and transparent model for metadata retrieval from the file itself or from remote databases is required. In this paper we describe how ATLAS reconciles its file and non-file paradigms, the machinery for associating metadata with files and event collections, and the infrastructure for metadata propagation from input to output for provenance record management and related purposes.

042023
The following article is Open access

Advanced mathematical and statistical computational methods are required by the LHC experiments to analyze their data. These methods are provided by the Math work package of the ROOT project. An overview of the recent developments in this work package is presented, describing the restructuring of the core mathematical library into a coherent set of C++ classes and interfaces. The improvements achieved in the performance and quality of the numerical methods present in ROOT are shown as well. New developments in the fitting and minimization packages are reviewed. A new graphics interface has been developed to drive the fitting process, and new classes are being introduced to extend the fitting functionality. Furthermore, recent and planned developments for integrating into the ROOT environment new advanced statistical tools required for the analysis of LHC data are presented.
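
As a minimal illustration of the kind of fitting functionality provided by the ROOT Math and fitting packages, the sketch below fills a histogram with toy data and fits it with a Gaussian signal plus a linear background; all function shapes and numbers are illustrative.

#include <cstdio>
#include "TF1.h"
#include "TH1D.h"

void fitExample()
{
    // Generating function: Gaussian peak at x = 5 on a linear background.
    TF1 gen("gen", "[0]*exp(-0.5*((x-[1])/[2])^2) + [3] + [4]*x", 0., 10.);
    gen.SetParameters(100., 5., 0.5, 10., -0.5);

    TH1D h("h", "toy spectrum;x;entries", 100, 0., 10.);
    h.FillRandom("gen", 20000);

    // Fit model with the same shape but free parameters.
    TF1 model("model", "gaus(0) + pol1(3)", 0., 10.);
    model.SetParameters(80., 4.8, 0.4, 5., 0.);
    h.Fit(&model, "Q");   // "Q": quiet fit with the default minimizer

    printf("fitted mean  = %.3f +- %.3f\n",
           model.GetParameter(1), model.GetParError(1));
    printf("fitted sigma = %.3f +- %.3f\n",
           model.GetParameter(2), model.GetParError(2));
}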

042024
The following article is Open access

In anticipation of data taking, ATLAS has undertaken a program of work to develop an explicit state representation of the experiment's complex transient event data model. This effort has provided both an opportunity to consider explicitly the structure, organization, and content of the ATLAS persistent event store before writing tens of petabytes of data (replacing simple streaming, which uses the persistent store as a core dump of transient memory), and a locus for support of event data model evolution, including significant refactoring, beyond the automatic schema evolution capabilities of underlying persistence technologies. ATLAS has encountered the need for such non-trivial schema evolution on several occasions already.

This paper describes the state representation strategy (transient/persistent separation) and its implementation, including both the payoffs that ATLAS has seen (significant and sometimes surprising space and performance improvements, the extra layer notwithstanding, and extremely general schema evolution support) and the costs (relatively pervasive additional infrastructure development and maintenance). The paper further discusses how these costs are mitigated, and how ATLAS is able to implement this strategy without losing the ability to take advantage of the (improving!) automatic schema evolution capabilities of underlying technology layers when appropriate.

Implications of state representations for direct ROOT browsability, and current strategies for associating physics analysis views with such state representations, are also described.
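
As a generic illustration of the transient/persistent separation described above (not the actual ATLAS classes or converter interfaces), the sketch below shows a transient class, a compact versioned persistent state class, and the converter that isolates one from the other.

#include <cmath>

// Transient class: whatever representation is convenient in memory
// (rich interface, cached derived quantities, pointers, ...).
struct Track {
    double px, py, pz;
    double pt() const { return std::sqrt(px * px + py * py); }
};

// Persistent state, version 1: a deliberately minimal, stable layout;
// this is what is actually written to the event store.
struct Track_p1 {
    float px, py, pz;   // single precision to save space on disk
};

// Converter: the only place that knows both representations, so the transient
// model can be refactored without touching files already written, and old
// persistent versions can still be read (explicit schema evolution).
struct TrackCnv_p1 {
    Track_p1 transToPers(const Track& t) const
    {
        return Track_p1{float(t.px), float(t.py), float(t.pz)};
    }
    Track persToTrans(const Track_p1& p) const
    {
        return Track{p.px, p.py, p.pz};
    }
};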

042025
The following article is Open access

CERN has used Platform's Load Sharing Facility (LSF) [1] since 1998 to manage its large batch system installations. Since that time the farm has grown significantly, and commodity hardware running GNU/Linux has replaced other Unix flavors on specialized hardware. In this paper we present how the system is set up today. We briefly report on issues seen in the past and the actions taken to resolve them. In this context the status of the evaluation of the most recent version of this product, LSF 7.0, is presented, and the planned migration scenario is described.

042026
The following article is Open access

The ATLAS conditions databases will be used to manage information of quite diverse nature and levels of complexity. The use of a relational database manager like Oracle, together with the object managers POOL and OKS developed in-house, poses special difficulties in browsing the available data while understanding its structure in a general way. This is particularly relevant for database browser projects, where it is difficult to link against the class-defining libraries generated by general frameworks such as Athena. A modular approach to tackle these problems is presented here.

The database infrastructure is under development using the LCG COOL infrastructure, and provides a powerful information-sharing gateway across many different systems. The nature of the stored information ranges from temporal series of simple values up to very complex objects describing the configuration of systems like the ATLAS TDAQ infrastructure, including associations to large objects managed outside the database infrastructure.

An important example of this architecture is the Online Objects Extended Database BrowsEr (NODE), which is designed to access and display all data available in the ATLAS Monitoring Data Archive (MDA), including histograms and data tables. To deal with the special nature of the monitoring objects, a plugin from the MDA framework to the Time managed science Instrument Databases (TIDB2) is used. The database browser is extended, in particular, to include operations on histograms such as display, overlap and comparison, as well as commenting and local storage.

042027
The following article is Open access

In our previous work, the PHEASANT project, we proposed a Domain Specific Language (DSL) to provide the HEP community with a tool that could increase users' productivity. This tool tackles the problem of producing query code for HEP physics data analysis. The first step of this project was to design and implement a solution that would serve as a proof of concept. We are now concentrating on implementation issues in order to deploy a final tool (i.e. one that is consistent, robust, etc.).

The concept of domain-specific languages has always been implicit in software engineering projects, although the development of such languages was never done in a very systematic way.

The main goal of having DSLs is to raise the level of abstraction. The idea is to provide the final user (stakeholder) with tools to reason about and model the solution using concepts of the problem domain, instead of having to reason with concepts of the solution domain (that is, implementation details such as programming concepts and hardware restrictions). Once the model is specified, we can use Model Driven Development and Software Product Line techniques to deploy artifacts in an automatic way (meaning software products, code, documentation, etc.). The Software Engineering community has lately been focusing its attention on methodologies and tool deployment in order to help DSL developers in their effort to improve productivity and efficiency in several application domains such as HEP. A comparative study of these tools should be done to determine their capability to answer the specific requirements of HEP physics analysis.

In this communication we present the several DSL meta-modeling technologies studied in order to implement the DSL proposed by the PHEASANT project.

042028
The following article is Open access

OpenGL has been promoted to become the main 3D rendering engine of the ROOT framework. This required a major re-modularization of OpenGL support on all levels, from basic window-system specific interface to medium-level object-representation and top-level scene management. This new architecture allows seamless integration of external scene-graph libraries into the ROOT OpenGL viewer as well as inclusion of ROOT 3D scenes into external GUI and OpenGL-based 3D-rendering frameworks.

Scene representation was removed from inside of the viewer, allowing scene-data to be shared among several viewers and providing for a natural implementation of multi-view canvas layouts. The object-graph traversal infrastructure allows free mixing of 3D and 2D-pad graphics and makes implementation of ROOT canvas in pure OpenGL possible. Scene-elements representing ROOT objects trigger automatic instantiation of user-provided rendering-objects based on the dictionary information and class-naming convention. Additionally, a finer, per-object control over scene-updates is available to the user, allowing overhead-free maintenance of dynamic 3D scenes and creation of complex real-time animations. User-input handling was modularized as well, making it easy to support application-specific scene navigation, selection handling and tool management.

042029
The following article is Open access

CERN, the European Laboratory for Particle Physics, located in Geneva, Switzerland, is currently building the LHC (Large Hadron Collider), a 27 km particle accelerator. The equipment life-cycle management of this project is provided by the Engineering and Equipment Data Management System (EDMS) [1] [2] Service. Using an Oracle database, it supports the management and follow-up of different kinds of documentation through the whole life cycle of the LHC project: design, manufacturing, installation, commissioning data, etc. The equipment data collection phase is now slowing down and the project is getting closer to the 'As-Built' phase: the phase of the project that consumes and explores the large volumes of data stored since 1996. Searching through millions of items of information (documents, equipment parts, operations, ...) multiplied by dozens of points of view (operators, maintainers, ...) requires an efficient and flexible search engine. This paper describes the process followed by the team to implement the search engine for the LHC As-Built project in the EDMS Service. The emphasis is put on the design decision to decouple the search engine from any user interface, potentially enabling other systems to use it as well. Projections, algorithms, and the planned implementation are described in this paper. The implementation of the first version started in early 2007.

042030
The following article is Open access

The CMS experiment at the LHC has a very large body of software of its own and makes extensive use of software from outside the experiment. Understanding the performance of such a complex system is a very challenging task, not least because there are extremely few developer tools capable of profiling software systems of this scale or producing useful reports.

CMS has mainly used IgProf, valgrind, callgrind and OProfile for analysing the performance and memory-usage patterns of our software. We describe the challenges, at times rather extreme ones, faced as we have analysed the performance of our software, and how we have developed an understanding of its performance features. We outline the key lessons learnt so far and the actions taken to make improvements. We describe why an in-house general profiling tool still ends up besting a number of renowned open-source tools, and the improvements we have made to it over the past year.

042031
The following article is Open access

The size and complexity of LHC experiments raise unprecedented challenges not only in terms of detector design, construction and operation, but also in terms of software models and data access and storage. The nominal interaction rate of about 1 GHz at the design luminosity of 10^34 cm^-2 s^-1 must be reduced online by about seven orders of magnitude to an event rate of O(100) Hz going to mass storage, and consisting of several different streams, one of them entirely dedicated to the calibration. One of the most challenging tasks will be the storage of non-event data produced by calibration and alignment stream processes into the Conditions Database at the Tier0 (located at CERN). In this work, the ATLAS Calibration Streams and the Conditions Database will be described.
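
For orientation, the quoted reduction factor is simply the ratio of the nominal interaction rate to the rate written to mass storage:

\[
  \frac{R_{\text{interaction}}}{R_{\text{storage}}}
  \approx \frac{10^{9}\,\text{Hz}}{\mathcal{O}(100)\,\text{Hz}}
  \sim 10^{7},
\]

i.e. roughly seven orders of magnitude.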

042032
The following article is Open access

The ATLAS experiment at the LHC will make extensive use of relational databases in both online and offline contexts, running to O(TBytes) per year. Two of the most challenging applications in terms of data volume and access patterns are conditions data, making use of the LHC conditions database, COOL, and the TAG database, which stores summary event quantities allowing a rapid selection of interesting events. Both of these databases are being replicated to regional computing centres using Oracle Streams technology, in collaboration with the LCG 3D project. Database optimisation, performance tests and first user experience with these applications will be described, together with plans for first LHC data-taking and future prospects.

042033
The following article is Open access

A tool is presented that is capable of reading from, writing to and converting between various data sources. Currently supported file formats are ROOT, HBOOK, HDF, XML, SQLITE and a few text file formats. A plugin mechanism decouples the file-format-specific 'backends' from the main library. All data are internally represented as 'heterogeneous hierarchic tuples'; no other data structure exists in the DataHarvester.

042034
The following article is Open access

Assessing the quality of data recorded with the ATLAS detector is crucial for commissioning and operating the detector to achieve sound physics measurements. In particular, the fast assessment of complex quantities obtained during event reconstruction and the ability to easily track them over time are especially important given the large data throughput and the distributed nature of the analysis environment. The data are processed once on a computer farm comprising O(1000) nodes before being distributed on the Grid, and reliable, centralized methods must be used to organize, merge, present, and archive data-quality metrics for performance experts and analysts. A review of the tools and approaches employed by the detector and physics groups in this environment and a summary of their performance during commissioning are presented.