Table of contents

Volume 219

2010


Online Computing

Accepted papers received: 24 March 2010
Published online: 07 May 2010

Contributed

022001
The following article is Open access

The Silicon Vertex Trigger (SVT) is a processor developed at the CDF experiment to perform fast and precise online track reconstruction. SVT is made of two pipelined processors: the Associative Memory, which finds low precision tracks, and the Track Fitter, which refines the track quality with high precision fits. We describe the architecture and the performance of a next generation track fitter, the GigaFitter, developed to reduce the degradation of the SVT efficiency due to the increasing instantaneous luminosity. The GigaFitter reduces the track parameter reconstruction to a few clock cycles and can perform many fits in parallel, thus allowing high resolution tracking at very high rate. The core of the GigaFitter is implemented in a modern Xilinx Virtex-5 FPGA chip, rich in powerful DSP arrays. The FPGA is housed on a mezzanine board which receives the data from a subset of the tracking detector and transfers the fitted tracks to a Pulsar motherboard for the final corrections. Instead of the current 12 boards, one per sector of the detector, the final system will be much more compact, consisting of a single GigaFitter Pulsar board equipped with four mezzanine cards receiving the data from the entire tracking detector. Moreover, the GigaFitter's modular structure can scale to much higher performance and is general enough to be easily adapted to future High Energy Physics (HEP) experiments and applications outside HEP.
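
The abstract does not spell out the fit itself; purely as an illustration of the kind of arithmetic that maps well onto DSP arrays, the sketch below assumes a linearized track fit in which each parameter is a scalar product of the hit coordinates with precomputed constants. The dimensions, constants and names are illustrative, not the actual SVT/GigaFitter configuration.

```python
import numpy as np

# Hypothetical linearized fit: parameters p = C @ x + q, where x are the hit
# coordinates of one candidate and C, q are constants precomputed per road.
N_HITS = 6      # illustrative number of hit coordinates per candidate
N_PARAMS = 3    # e.g. curvature, impact parameter, phi (illustrative)

rng = np.random.default_rng(0)
C = rng.normal(size=(N_PARAMS, N_HITS))   # stand-in for precomputed constants
q = rng.normal(size=N_PARAMS)

def fit_candidates(hits):
    """Fit many candidates at once: each row of `hits` is one set of coordinates.

    On an FPGA the same multiply-accumulate pattern is unrolled over DSP blocks;
    here it is a single matrix product.
    """
    return hits @ C.T + q

candidates = rng.normal(size=(1000, N_HITS))  # fake hit combinations
params = fit_candidates(candidates)
print(params.shape)   # (1000, 3): one parameter vector per candidate
```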

022002
The following article is Open access

The CMS online cluster consists of more than 2000 computers running about 10000 application instances. These applications implement the control of the experiment, the event building, the high level trigger, the online database and the control of the buffering and transferring of data to the Central Data Recording at CERN. In this paper the IT solutions employed to fulfil the requirements of such a large cluster are reviewed. Details are given on the chosen network structure, the configuration management system, the monitoring infrastructure and the implementation of high availability for the services and infrastructure.

022003
The following article is Open access

The CMS Data Acquisition cluster, which runs around 10000 applications, is configured dynamically at run time. XML configuration documents determine what applications are executed on each node and over what networks these applications communicate. Through this mechanism the DAQ System may be adapted to the required performance, partitioned in order to perform (test-) runs in parallel, or re-structured in case of hardware faults. This paper presents the configuration procedure and the CMS DAQ Configurator tool, which is used to generate comprehensive configurations of the CMS DAQ system based on a high-level description given by the user. Using a database of configuration templates and a database containing a detailed model of hardware modules, data and control links, nodes and the network topology, the tool automatically determines which applications are needed, on which nodes they should run, and over which networks the event traffic will flow. The tool computes application parameters and generates the XML configuration documents and the configuration of the run-control system. The performance of the configuration procedure and the tool as well as operational experience during CMS commissioning and the first LHC runs are discussed.
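
As a schematic illustration of this kind of tool (the real CMS DAQ Configurator schema is not reproduced here), the sketch below turns a small high-level description into a per-node XML configuration document; all element, attribute and host names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical high-level description: which application classes run on which
# nodes and which network each application uses. Names are illustrative only.
description = {
    "bu-01": [("EventBuilder", "ib0"), ("Monitor", "eth0")],
    "fu-17": [("FilterUnit", "ib0")],
}

def build_configuration(desc):
    """Generate an XML configuration document from the high-level description."""
    root = ET.Element("Configuration")
    for node, apps in desc.items():
        ctx = ET.SubElement(root, "Context", host=node)
        for app_class, network in apps:
            ET.SubElement(ctx, "Application",
                          attrib={"class": app_class, "network": network})
    return ET.ElementTree(root)

tree = build_configuration(description)
ET.indent(tree)  # pretty-printing, available in Python 3.9+
print(ET.tostring(tree.getroot(), encoding="unicode"))
```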

022004
The following article is Open access

ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). Specific calibration tasks are performed regularly for each of the 18 ALICE sub-detectors in order to achieve the most accurate physics measurements. These procedures involve event analysis in a wide range of experimental conditions, involving various trigger types, data throughputs, electronics settings and algorithms, both during short sub-detector standalone runs and long global physics runs. A framework was designed to collect statistics and compute some of the calibration parameters directly online, using resources of the Data Acquisition System (DAQ) and benefiting from its inherent parallel architecture to process events. This system has been used at the experimental area for one year and includes more than 30 calibration routines in production. This paper describes the framework architecture and the synchronization mechanisms involved at the level of the Experiment Control System (ECS) of ALICE. The software libraries interfacing detector algorithms (DA) to the online data flow, configuration database, experiment logbook and offline system are reviewed. The test protocols followed to integrate and validate each sub-detector component are also discussed, including the automatic build system and validation procedures used to ensure a smooth deployment. The offline post-processing and archiving of the DA results is covered in a separate paper.

022005
The following article is Open access

ATLAS is one of the four experiments at the Large Hadron Collider (LHC) at CERN, which was put into operation this year. The challenging experimental environment and the extreme detector complexity required the development of a highly scalable distributed monitoring framework, which is currently being used to monitor the quality of the data being taken as well as the operational conditions of the hardware and software elements of the detector, trigger and data acquisition systems. At the moment the ATLAS Trigger/DAQ system is distributed over more than 1000 computers, about one third of the final ATLAS size. Every minute of an ATLAS data-taking session the monitoring framework serves several thousand physics events to monitoring data analysis applications, handles more than 4 million histogram updates coming from more than 4 thousand applications, executes 10 thousand advanced data quality checks for a subset of those histograms, and displays histograms and the results of these checks on several dozen monitors installed in the main and satellite ATLAS control rooms. This note presents an overview of the online monitoring software framework and describes the experience gained during an extensive commissioning period as well as in the first phase of LHC beam in September 2008. Performance results obtained on the current ATLAS DAQ system will also be presented, showing that the performance of the framework is adequate for the final ATLAS system.
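
The abstract does not detail the individual check algorithms; the sketch below shows one generic form such a data quality check can take, a bin-by-bin chi-square comparison of a monitored histogram against a reference that returns a traffic-light flag. The thresholds and the flag names are assumptions for illustration only.

```python
def dq_check(hist, reference, warn=2.0, error=5.0):
    """Compare two equally binned histograms and return a data-quality flag.

    `hist` and `reference` are lists of bin contents. The chi-square per degree
    of freedom is compared against illustrative warning/error thresholds.
    """
    if len(hist) != len(reference):
        raise ValueError("histograms must have the same binning")
    chi2, ndf = 0.0, 0
    for observed, expected in zip(hist, reference):
        chi2 += (observed - expected) ** 2 / max(expected, 1e-9)
        ndf += 1
    chi2_ndf = chi2 / max(ndf, 1)
    if chi2_ndf < warn:
        return "GREEN", chi2_ndf
    return ("YELLOW", chi2_ndf) if chi2_ndf < error else ("RED", chi2_ndf)

print(dq_check([100, 205, 98], [100, 200, 100]))
```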

022006
The following article is Open access

ATLAS is one of the two general-purpose detectors at the Large Hadron Collider (LHC). The trigger system is responsible for making the online selection of interesting collision events. At the LHC design luminosity of 10³⁴ cm⁻²s⁻¹ it will need to achieve a rejection factor of the order of 10⁻⁷ against random proton-proton interactions, while selecting with high efficiency events that are needed for physics analyses. After a first processing level using custom electronics based on FPGAs and ASICs, the trigger selection is made by software running on two processor farms, containing a total of around two thousand multi-core machines. This system is known as the High Level Trigger (HLT). To reduce the network data traffic and the processing time to manageable levels, the HLT uses seeded, step-wise reconstruction, aiming at the earliest possible rejection of background events. The recent LHC startup and short single-beam run provided a "stress test" of the system and some initial calibration data. Following this period, ATLAS continued to collect cosmic-ray events for detector alignment and calibration purposes. After giving an overview of the trigger design and its innovative features, this paper focuses on the experience gained from operating the ATLAS trigger with single LHC beams and cosmic-rays.

022007
The following article is Open access

The Data Acquisition Backbone Core (DABC) is a general purpose software framework designed for the implementation of a wide range of data acquisition systems – from small detector test beds to high performance systems. DABC consists of a compact data-flow kernel and a number of plug-ins for various functional components like data inputs, device drivers, user functional modules and applications. DABC provides configurable components for implementing event building over fast networks like InfiniBand or Gigabit Ethernet. A generic Java GUI provides dynamic control and visualization of the control parameters and commands, which are provided by DIM servers. A first set of application plug-ins has been implemented to use DABC as event builder for the front-end components of the GSI standard DAQ system MBS (Multi Branch System). Another application covers the connection of DAQ readout chains from detector front-end boards (N-XYTER) linked to read-out controller boards (ROC) over UDP into DABC for event building, archiving and data serving. This was used for data taking in the September 2008 test beam time for the CBM experiment at GSI. DABC version 1.0 is released and available from the website.

022008
The following article is Open access

The LHCb High Level Trigger (HLT) and Data Acquisition (DAQ) system selects about 2 kHz of events out of the 40 MHz of beam crossings. The selected events are consolidated into files on onsite storage and then sent to permanent storage for subsequent analysis on the Grid. For local and full-chain tests, a method is needed to exercise the data flow through the High Level Trigger when no actual data are available. In order to test the system under conditions as close as possible to data-taking, the solution is to inject data at the input of the HLT at a minimum rate of 2 kHz. This is done via a software implementation of the trigger system which sends data to the HLT. The application has to simulate that the data it sends come from real LHCb readout boards. Both simulated data and previously recorded real data can be replayed through the system in this manner. As the data rate is high (100 MB/s), care has been taken to optimise the emulator for throughput from the Storage Area Network (SAN). The emulator can be run in stand-alone mode or as a pseudo-subdetector of LHCb, allowing the use of all the standard run-control tools. The architecture, implementation and performance of the emulator will be presented.
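
A central point of such an emulator is pacing the injected events to at least the target rate; the sketch below is a minimal rate-throttled injector, assuming a placeholder event source and transport rather than the LHCb implementation.

```python
import time

def inject_events(events, rate_hz=2000.0, send=lambda event: None):
    """Replay `events` at approximately `rate_hz`, calling `send` for each one.

    A real emulator would read raw event fragments from the SAN and push them
    into the HLT transport; here `send` is a no-op placeholder.
    """
    period = 1.0 / rate_hz
    next_deadline = time.monotonic()
    for event in events:
        send(event)
        next_deadline += period
        delay = next_deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)

start = time.monotonic()
inject_events(range(2000))   # one second's worth of fake events at 2 kHz
print(f"elapsed: {time.monotonic() - start:.2f} s")
```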

022009
The following article is Open access

LHCb has designed and implemented an integrated Experiment Control System. The Control System uses the same concepts and the same tools to control and monitor all parts of the experiment: the Data Acquisition System, the Timing and the Trigger Systems, the High Level Trigger Farm, the Detector Control System, the Experiment's Infrastructure and the interaction with the CERN Technical Services and the Accelerator. LHCb's Run Control, the main interface used by the experiment's operator, provides access in a hierarchical, coherent and homogeneous manner to all areas of the experiment and to all its sub-detectors. It allows for automated (or manual) configuration and control, including error recovery, of the full experiment in its different running modes. Different instances of the same Run Control interface are used by the various sub-detectors for their stand-alone activities: test runs, calibration runs, etc. The architecture and the tools used to build the control system, the guidelines and components provided to the developers, as well as the first experience with the usage of the Run Control will be presented.

022010
The following article is Open access

A Large Ion Collider Experiment (ALICE) is the dedicated heavy-ion experiment at the CERN LHC and will take data with a bandwidth of up to 1.25 GB/s. It consists of 18 subdetectors that interact with five online systems (CTP, DAQ, DCS, ECS, and HLT). Data recorded is read out by DAQ in a raw data stream produced by the subdetectors. In addition the subdetectors produce conditions data derived from the raw data, i.e. calibration and alignment information, which have to be available from the beginning of the reconstruction and therefore cannot be included in the raw data. The extraction of the conditions data is steered by a system called Shuttle. It provides the link between data produced by the subdetectors in the online systems and a dedicated procedure per subdetector, called preprocessor, that runs in the Shuttle system. The preprocessor performs merging, consolidation, and reformatting of the data. Finally, it stores the data in the Grid Offline Conditions DataBase (OCDB) so that they are available for the Offline reconstruction. The reconstruction of a given run is initiated automatically once the raw data is successfully exported to the Grid storage and the run has been processed in the Shuttle framework. These proceedings introduce the Shuttle system. The performance of the system during the ALICE cosmics commissioning and LHC startup is described.

022011
The following article is Open access

The CMS data acquisition system is made of two major subsystems: event building and event filter. This paper describes the architecture and design of the software that processes the data flow in the currently operating experiment. The central DAQ system relies on industry standard networks and processing equipment. Adopting a single software infrastructure in all subsystems of the experiment imposes, however, a number of different requirements. High efficiency and configuration flexibility are among the most important ones. The XDAQ software infrastructure has matured over an eight-year development and testing period and has been shown to cope well with the requirements of the CMS experiment.

022012
The following article is Open access

Real time data analysis at next generation experiments is a challenge because of their enormous data rate and size. The SuperKEKB experiment, the upgrade of the Belle experiment, requires the processing of a data volume 100 times larger than that of the current experiment. Offline-level data analysis in the HLT farm is necessary for efficient data reduction. The real time processing of huge data volumes is also key to the planned dark energy survey using the Subaru telescope. The main camera for the survey, called Hyper Suprime-Cam, consists of 100 CCDs with 8 megapixels each, and the total data size is expected to become comparable with that of SuperKEKB. Online tuning of measurement parameters, which was done empirically in the past, is planned using this real time processing. We started a joint development of a real time framework to be shared by both SuperKEKB and Hyper Suprime-Cam. Parallel processing techniques are widely adopted in the framework design to utilize a huge number of network-connected PCs with multi-core CPUs. The parallel processing is performed not only in the trivial event-by-event manner, but also in a pipeline of software modules which are dynamically placed over the distributed computing nodes. The object data flow in the framework is realized by object serialization with object persistency. On-the-fly collection of histograms and N-tuples is supported for run-time monitoring. The detailed design and the development status of the framework are presented.
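
As an illustration of the trivial event-by-event parallelism mentioned above (not the actual SuperKEKB/Hyper Suprime-Cam framework), the sketch below fans events out to a pool of worker processes and collects per-event results, with object transfer handled by Python's built-in serialization; the event structure and the processing step are invented.

```python
from multiprocessing import Pool

def process_event(event):
    """Stand-in for a reconstruction module chain applied to one event."""
    energy = sum(event["hits"])
    return {"id": event["id"], "energy": energy}

def main():
    events = [{"id": i, "hits": [i % 7, i % 11, i % 13]} for i in range(10_000)]
    # Worker processes receive events and return results as pickled objects,
    # loosely analogous to the object-serialization data flow described above.
    with Pool(processes=4) as pool:
        results = pool.map(process_event, events, chunksize=256)
    print(len(results), results[0])

if __name__ == "__main__":
    main()
```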

022013
The following article is Open access

The CMS detector at LHC is equipped with a high precision lead tungstate crystal electromagnetic calorimeter (ECAL). The front-end boards and the photodetectors are monitored using a network of DCU (Detector Control Unit) chips located on the detector electronics. The DCU data are accessible through token rings controlled by an XDAQ-based software component. Relevant parameters are transferred to DCS (Detector Control System) and stored into the Condition DataBase. The operational experience from the ECAL commissioning at the CMS experimental cavern is discussed and summarized.

022014
The following article is Open access

The alignment of the CMS muon system is performed using different techniques: photogrammetry measurements, optical alignment and alignment with tracks. For track-based alignment, several methods are employed, ranging from a hit and impact point (HIP) algorithm and a procedure exploiting chamber overlaps to a global fit method based on the Millepede approach. For start-up alignment, as long as the available integrated luminosity still significantly limits the size of the muon sample from collisions, cosmic-muon and beam-halo signatures play a very strong role. During the last commissioning runs in 2008 the first aligned geometries were produced and validated with data. The CMS offline computing infrastructure has been used to perform improved reconstructions. We present the computational aspects related to the calculation of alignment constants at the CERN Analysis Facility (CAF), the production and population of databases, and the validation and performance in the official reconstruction. The integration of track-based alignment with the other sources of alignment information is also discussed.

022015
The following article is Open access

The ATLAS hadronic Tile Calorimeter is ready for data taking with proton-proton collisions provided by the Large Hadron Collider (LHC). The Tile Calorimeter is a sampling calorimeter with iron absorbers and scintillators as active medium. The scintillators are read out by wavelength-shifting (WLS) fibers and photomultipliers (PMTs). The LHC provides collisions every 25 ns, putting very stringent requirements on the synchronization of the ATLAS triggering systems and the read out of the on-detector electronics. More than 99% of the read out channels of the Tile Calorimeter have been time calibrated using laser pulses sent directly to the PMTs. Timing calibration constants can be calculated after corrections for differences in laser light paths to the different parts of the calorimeter. The calibration consists of two parts: programmable corrections implemented in the on-detector electronics, and residual deviations from perfect timing stored in a database used during the offline reconstruction of the Tile Calorimeter data. Data taken during long ATLAS cosmic runs and during LHC beam time in September 2008 have confirmed a timing uniformity of 2 ns in each of the four calorimeter sections. The remaining offsets between the four calorimeter sections have been measured in two ways: first by using laser pulses interleaved with cosmic triggers inside a global ATLAS run, and second by using the real LHC events acquired during the 2008 beam time. Both methods give consistent results. The main limitations on the precision of the time calibration are presented.

022016
The following article is Open access

In this paper we give a description of the database services for the control and monitoring of the electromagnetic calorimeter of the CMS experiment at the LHC. After a general description of the software infrastructure, we present the organization of the tables in the database, which has been designed to simplify the development of software interfaces. This is achieved by including in the database a description of each relevant table. We also give estimates of the final size and performance of the system.

022017
The following article is Open access

The ATLAS Level-1 Central Trigger (L1CT) system is a central part of ATLAS data-taking. It receives the 40 MHz bunch clock from the LHC machine and distributes it to all sub-detectors. It initiates the detector read-out by forming the Level-1 Accept decision, which is based on information from the calorimeter and muon trigger processors, plus a variety of additional trigger inputs from detectors in the forward regions. The L1CT also provides trigger-summary information to the data acquisition and the Level-2 trigger systems for use in higher levels of the selection process, in offline analysis, and for monitoring.

In this paper we give an overview of the operational framework of the L1CT, with particular emphasis on cross-system aspects. The software framework allows a consistent configuration with respect to the LHC machine, the upstream and downstream trigger processors, and the data acquisition. Trigger and dead-time rates are monitored coherently at all stages of processing and are logged by the online computing system for physics analysis, data quality assurance and operational debugging. In addition, the synchronisation of trigger inputs is monitored based on bunch-by-bunch trigger information. Several software tools allow the relevant information to be displayed efficiently in the control room in a way useful for shifters and experts. We present the overall performance during cosmic-ray data taking with the full ATLAS detector and the experience with first beam in the LHC.

022018
The following article is Open access

The ATLAS experiment uses a complex trigger strategy to achieve the necessary Event Filter output rate, making it possible to optimize the storage and processing needs for these data. These needs are described in the ATLAS Computing Model, which embraces Grid concepts. The output from the Event Filter will consist of three main streams: a primary stream, the express stream and the calibration stream. The calibration stream will be transferred to the Tier-0 facilities, which will allow prompt reconstruction of this stream with an admissible latency of 8 hours, producing calibration constants of sufficient quality to permit a first-pass processing. An independent calibration stream has been developed and tested, which selects tracks at the level-2 trigger (LVL2) after the reconstruction. The stream is composed of raw data, in byte-stream format, and contains only information from the relevant parts of the detector, in particular the hit information of the selected tracks. This leads to significantly improved bandwidth usage and storage capability. The stream will be used to derive and update the calibration and alignment constants, if necessary every 24 hours. Processing is done using specialized algorithms running in the Athena framework on dedicated Tier-0 resources, and the alignment constants will be stored and distributed using the COOL conditions database infrastructure. The work addresses in particular the alignment requirements, the needs for track and hit selection, and timing and bandwidth issues.

022019
The following article is Open access

The Resistive Plate Chamber system of CMS is composed of 912 double-gap chambers equipped with about 10⁴ front-end boards. The correct and safe operation of the RPC system requires a sophisticated and complex online Detector Control System, able to monitor and control 2·10⁴ hardware devices distributed over an area of about 5000 m². The RPC DCS acquires, monitors and stores about 10⁵ parameters coming from the detector, the electronics, the power system, and the gas and cooling systems. The DCS system and its first results, obtained during the 2007 and 2008 CMS cosmic runs, are described in this paper.

022020
The following article is Open access

The LHCb experiment at the LHC accelerator at CERN will collide particle bunches at 40 MHz. After a first level of hardware trigger with an output rate of 1 MHz, the physically interesting collisions will be selected by running dedicated trigger algorithms, the High Level Trigger (HLT), in the Online computing farm. This farm consists of 16000 CPU cores and 40 TB of storage space. Although limited by environmental constraints, its computing power is equivalent to that provided to LHCb by all Tier-1s. The HLT duty cycle follows the LHC collisions, so it has several months of winter shutdown as well as several shorter machine and experiment downtime periods. This work describes the strategy for using these idle resources for event reconstruction. Due to the specific features of the Online farm, typical Tier-1-style processing (one file per core) is not feasible. A radically different approach has been chosen, based on parallel processing of the data in farm slices of O(1000) cores. Single events are read from the input files, distributed to the cluster and merged back into files once they have been processed. This architectural solution, the performance obtained and the connection to the LHCb production system will be described in detail.
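
The essential pattern is event-level scatter followed by a merge back into files, rather than one file per core; the sketch below is a much simplified, single-host stand-in for that idea, with a placeholder "event" format (one line per event) and a placeholder reconstruction step.

```python
from multiprocessing import Pool

def reconstruct(event_line):
    """Placeholder for running the reconstruction on one serialized event."""
    return event_line.strip().upper()

def reprocess(input_path, output_path, workers=8):
    """Read single events from a file, distribute them to a worker pool and
    merge the results back into one output file, preserving the input order."""
    with open(input_path) as fin, open(output_path, "w") as fout, \
            Pool(processes=workers) as pool:
        for processed in pool.imap(reconstruct, fin, chunksize=100):
            fout.write(processed + "\n")

if __name__ == "__main__":
    with open("events.txt", "w") as f:          # fake input: one event per line
        f.writelines(f"event {i}\n" for i in range(1000))
    reprocess("events.txt", "events_reco.txt")
```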

022021
The following article is Open access

The ATLAS experiment [1] is one of two general-purpose experiments at the Large Hadron Collider (LHC). It has a three-level trigger, designed to reduce the 40 MHz bunch-crossing rate to about 200 Hz for recording. Online track reconstruction, an essential ingredient for achieving this design goal, is performed at the software-based second (Level 2) and third (Event Filter, EF) levels, running on farms of commercial PCs. The Level 2 trigger, designed to provide about a 50-fold reduction in the event rate with an average execution time of about 40 ms, uses custom fast tracking algorithms, performing complementary pattern recognition on data either from the silicon detectors or from the transition-radiation tracker. The EF uses offline software components and has been designed to give about a further 10-fold rate reduction with an average execution time of about 4 s. We report on the commissioning of the tracking algorithms and their performance with cosmic-ray data collected recently in the first combined running with the whole detector fully assembled. We describe customizations of the algorithms, which are normally tuned for tracks originating from around the beampipe, made to achieve close to 100% efficiency for the cosmic-ray tracks used for tracker alignment.

022022
The following article is Open access

ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). A large bandwidth and flexible Data Acquisition System (DAQ) has been designed and deployed to collect sufficient statistics in the short running time available per year for heavy ions and to accommodate the very different requirements originating from the 18 sub-detectors. This paper will present the large scale tests conducted to assess the standalone DAQ performance, the interfaces with the other online systems and the extensive commissioning performed in order to be fully prepared for physics data taking. It will review the experience accumulated since May 2007 during the standalone commissioning of the main detectors and the global cosmic runs, and the lessons learned from this exposure on the "battle field". It will also discuss the test protocol followed to integrate and validate each sub-detector with the online systems, and it will conclude with the first results of the LHC injection tests and startup in September 2008. Several papers at the same conference present in more detail some elements of the ALICE DAQ system.

022023
The following article is Open access

ALICE is one of the four experiments installed at the CERN Large Hadron Collider (LHC), especially designed for the study of heavy-ion collisions. The online Data Quality Monitoring (DQM) is an important part of the data acquisition (DAQ) software. It involves the online gathering of monitored data, their analysis by user-defined algorithms and their visualization. This paper presents the final design, as well as the latest and upcoming features, of ALICE's dedicated DQM software called AMORE (Automatic MonitoRing Environment). It describes the challenges we faced during its implementation, including performance issues, and how we tested and handled them, in particular by using a scalable and robust publish-subscribe architecture. We also review the ongoing and increasing adoption of this tool within the ALICE collaboration and the measures taken to develop, in synergy with their respective teams, efficient monitoring modules for the sub-detectors. The related packaging and release procedure needed by such a distributed framework is also described. We finally give an overview of the wide range of uses people make of this framework, and we review our own experience, before and during the LHC start-up, of monitoring the data quality on both the sub-detector and the DAQ side in a real-world and challenging environment.

022024
The following article is Open access

Event selection in the ATLAS High Level Trigger is accomplished to a large extent by reusing software components and event selection algorithms developed and tested in an offline environment. Many of these offline software modules are not specifically designed to run in a heavily multi-threaded online data flow environment. The ATLAS High Level Trigger (HLT) framework, based on the GAUDI and ATLAS ATHENA frameworks, forms the interface layer which allows the execution of the HLT selection and monitoring code within the online run control and data flow software. While such an approach provides a unified environment for trigger event selection across all of ATLAS, it also poses strict requirements on the reused software components in terms of performance, memory usage and stability. Experience of running the HLT selection software in the different environments, and especially on large multi-node trigger farms, has been gained in several commissioning periods using preloaded Monte Carlo events, in data taking periods with cosmic events and in a short period with proton beams from the LHC. This contribution discusses the architectural aspects of the HLT framework, its performance and its software environment within the ATLAS computing, trigger and data flow projects. Emphasis is also put on the architectural implications for the software of the use of multi-core processors in the computing farms and the experience gained with multi-threading and multi-process technologies.

022025
The following article is Open access

DAQ-Middleware is a software framework for network-distributed DAQ systems based on Robot Technology Middleware, an international standard of the Object Management Group (OMG) in robotics whose implementation was developed by AIST. A DAQ-Component is the software unit of DAQ-Middleware. The basic components have already been developed: for example, Gatherer is a readout component, Logger is a data-logging component, Monitor is an analysis component, and Dispatcher is connected to Gatherer as the input of the data path and to Logger/Monitor as the output of the data path. The DAQ operator is a special component which controls these components using the control/status path. The control/status path and data path, as well as the XML-based system configuration and the XML/HTTP-based system interface, are well defined in the DAQ-Middleware framework. DAQ-Middleware has been adopted by experiments at J-PARC, and commissioning with the first beam has been successfully carried out. The functionality of DAQ-Middleware and its status at J-PARC are presented.

Poster

022026
The following article is Open access

Increasing utilization of the Internet and convenient web technologies has made the web-portal a major application interface for remote participation and control of scientific instruments. While web-portals have provided a centralized gateway for multiple computational services, the amount of visual output often is overwhelming due to the high volume of data generated by complex scientific instruments and experiments. Since each scientist may have different priorities and areas of interest in the experiment, filtering and organizing information based on the individual user's need can increase the usability and efficiency of a web-portal.

DIII-D is the largest magnetic nuclear fusion device in the US. A web-portal has been designed to support the experimental activities of DIII-D researchers worldwide. It offers a customizable interface with personalized page layouts and a list of services for users to select from. Each individual user can create a unique working environment to fit their own needs and interests. Customizable services include real-time experiment status monitoring, diagnostic data access, and interactive data analysis and visualization. The web-portal also supports interactive collaborations by providing a collaborative logbook and online instant announcement services.

The DIII-D web-portal development utilizes multi-tier software architecture, and Web 2.0 technologies and tools, such as AJAX and Django, to develop a highly-interactive and customizable user interface.

022027
The following article is Open access

All major experiments need tools that provide a way to keep a record of events and activities, both during commissioning and operations. In ALICE (A Large Ion Collider Experiment) at CERN, this task is performed by the ALICE Electronic Logbook (eLogbook), a custom-made application developed and maintained by the Data Acquisition (DAQ) group. Started as a statistics repository, the eLogbook has evolved to become not only a fully functional electronic logbook, but also a massive information repository used to store the conditions and statistics of the several online systems. It is currently used by more than 600 users in 30 different countries and plays an important role in the daily ALICE collaboration activities. This paper describes the LAMP (Linux, Apache, MySQL and PHP) based architecture of the eLogbook, the database schema and the relevance of the information stored in the eLogbook to the different ALICE actors, not only for near real time procedures but also for long term data mining and analysis. It also presents the web interface, including the technologies used, the implemented security measures and the current main features. Finally, it presents the roadmap for the future, including a migration to the Web 2.0 paradigm, the handling of the database's ever-increasing data volume and the deployment of data mining tools.

022028
The following article is Open access

The precision chambers of the ATLAS Muon Spectrometer are built with Monitored Drift Tubes (MDT). The requirement of high accuracy and low systematic error, needed to achieve a transverse momentum resolution of 10% at 1 TeV, can only be accomplished if the calibrations are known with an accuracy of 20 μm. The relation between the drift path and the measured time (the so-called r-t relation) depends on many parameters (temperature T, hit rate, gas composition, thresholds, ...) subject to time variations. The r-t relation has to be measured from the data without the use of an external detector, using the autocalibration technique. This method relies on an iterative procedure applied to the same data sample, starting from a preliminary set of constants. The required precision can be achieved using a large number (a few thousand) of non-parallel tracks crossing a calibration region, i.e. a region of the MDT chamber sharing the same r-t relation.
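
As a much simplified illustration of such an iterative autocalibration (not the actual MDT procedure), the sketch below refines an r-t table by shifting each drift-time bin by the mean track residual observed in that bin and repeating until the corrections become small; the bin granularity, tolerance and toy residual source are assumptions.

```python
def autocalibrate(rt_table, residuals_for, tolerance=0.001):
    """Iteratively correct an r-t table.

    `rt_table` maps a drift-time bin (ns) to a radius (mm). `residuals_for` is a
    callable that, given the current table, returns the mean track residual
    (track distance minus r(t)) per bin, e.g. obtained by refitting the tracks.
    """
    table = dict(rt_table)
    for iteration in range(50):
        residuals = residuals_for(table)
        table = {t: r + residuals.get(t, 0.0) for t, r in table.items()}
        if max(abs(d) for d in residuals.values()) < tolerance:
            break
    return table, iteration + 1

# Toy example: the "true" r-t relation differs from the starting one by a
# constant offset, and the "refit" simply reports that offset as the residual.
true_rt = {t: 0.015 * t for t in range(0, 700, 25)}
start_rt = {t: r - 0.2 for t, r in true_rt.items()}
calibrated, n_iter = autocalibrate(
    start_rt, lambda table: {t: true_rt[t] - table[t] for t in table})
print(n_iter, round(calibrated[500] - true_rt[500], 6))
```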

022029
The following article is Open access

The web system described here provides features to monitor data acquired by the ATLAS Detector Control System (DCS). The DCS is responsible for overseeing the coherent and safe operation of the ATLAS experiment hardware. In the context of the hadronic Tile Calorimeter (TileCal), it controls the power supplies of the readout electronics, acquiring voltage, current, temperature and coolant pressure measurements. Physics data taking requires stable operation of the power sources. The TileDCS Web System automatically retrieves data and extracts statistics for given periods of time. The mean and standard deviation outcomes are stored as XML files and compared to preset thresholds.

Further, a graphical representation of the TileCal cylinders indicates the state of the supply system of each detector drawer. Colors are designated for each kind of state, so problems are easier to find and collaboration members can focus on them. When the user selects a module, the system presents detailed information. It is possible to verify the statistics and generate charts of the parameters over time. The TileDCS Web System also presents information about the latest status of the power supplies: a wedge is colored green whenever the corresponding system is on and red otherwise. Furthermore, it is possible to perform customized analyses: the system provides search interfaces where the user can set the module, the parameters and the time period of interest, and produces the retrieved data as charts, XML files, CSV or ROOT files according to the user's choice.
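
A minimal sketch of the statistics step described above, assuming invented parameter names and threshold values: compute the mean and standard deviation over a period and emit an XML summary that flags parameters whose mean falls outside the preset limits.

```python
import statistics
import xml.etree.ElementTree as ET

# Illustrative thresholds: (minimum mean, maximum mean) per monitored parameter.
THRESHOLDS = {"lv_voltage": (6.5, 7.5), "temperature": (15.0, 35.0)}

def summarize(module, samples):
    """Build an XML summary of mean/std per parameter, with a pass/fail flag."""
    root = ET.Element("module", name=module)
    for parameter, values in samples.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        low, high = THRESHOLDS[parameter]
        ET.SubElement(root, "parameter", name=parameter,
                      mean=f"{mean:.3f}", std=f"{std:.3f}",
                      status="OK" if low <= mean <= high else "ALARM")
    return ET.tostring(root, encoding="unicode")

print(summarize("LBA42", {"lv_voltage": [6.9, 7.0, 7.1],
                          "temperature": [38.0, 39.5]}))
```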

022030
The following article is Open access

The Tile Calorimeter (TileCal) is the barrel hadronic calorimeter of the ATLAS experiment presently in an advanced state of commissioning with cosmic and single beam data at the LHC collider.

The complexity of the detector, the number of electronics channels and the high rate of acquired events require a systematic strategy for preparing the system for data taking. This is done through a precise calibration of the detector, prompt updates of the database reconstruction constants, validation of the data processing and assessment of the quality of the data, using calibration signals as well as data obtained with cosmic muons and the first LHC beam.

This article will present the strategies and tools developed to calibrate the calorimeter and to monitor the variations of the extracted calibration constants as a function of time; the present plan and future upgrades to deploy and update the detector constants used in reconstruction; the techniques employed to validate the reconstruction software; and the set of tools of the present TileCal data quality system and its integration in the ATLAS online and offline frameworks.

022031
The following article is Open access

In the SMI++ framework, the real world is viewed as a collection of objects behaving as finite state machines. These objects can represent real entities, such as hardware devices or software tasks, or they can represent abstract subsystems. A special language, SML (State Manager Language), is provided for the object description. The SML description is then interpreted by a Logic Engine (coded in C++) to drive the control system. This allows rule-based automation and error recovery. SMI++ objects can run distributed over a variety of platforms, with all communication handled transparently by an underlying communication system – DIM (Distributed Information Management). This framework was first used by the DELPHI experiment at CERN, starting in 1990, and subsequently by the BaBar experiment at SLAC since 1999 for the design and implementation of their experiment control. SMI++ has been adopted at CERN by all LHC experiments in their detector control systems, as recommended by the Joint Controls Project. Since then it has undergone many upgrades to provide for varying needs. The main features of the framework, and in particular of the SML language, as well as recent and near-future upgrades will be discussed. SMI++ has so far been used only by large particle physics experiments; it is, however, equally suitable for any other control applications.
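
SML itself is not reproduced here; the sketch below only mimics the basic idea in Python: a named object behaving as a finite state machine whose state is changed by actions and whose state changes are propagated to interested observers (all class, state and action names are illustrative, not SMI++ API).

```python
class FSMObject:
    """Toy finite-state-machine object loosely inspired by the SMI++ concept."""

    def __init__(self, name, initial_state, transitions):
        # transitions: {(current_state, action): new_state}
        self.name = name
        self.state = initial_state
        self.transitions = transitions
        self.observers = []          # e.g. parent "domain" objects or loggers

    def do(self, action):
        new_state = self.transitions.get((self.state, action))
        if new_state is None:
            raise ValueError(f"{self.name}: {action!r} not allowed in {self.state}")
        self.state = new_state
        for callback in self.observers:
            callback(self.name, new_state)

hv = FSMObject("HV_CHANNEL", "OFF",
               {("OFF", "SWITCH_ON"): "RAMPING",
                ("RAMPING", "READY"): "ON",
                ("ON", "SWITCH_OFF"): "OFF"})
hv.observers.append(lambda name, state: print(f"{name} is now {state}"))
hv.do("SWITCH_ON")
hv.do("READY")
```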

022032
The following article is Open access

The limitations in pT range of presently available data are discussed and planned future upgrades are outlined. Special attention is given to the FAIR-CBM experiment as a unique high luminosity facility for the future continuation of the measurements at very high pT, with emphasis on the so-called mosaic trigger system using a highly parallel online algorithm.

022033
The following article is Open access

LHCb is one of the four major experiments under completion at the Large Hadron Collider (LHC). Monitoring the quality of the acquired data is important because it allows verification of the detector performance. Anomalies, such as missing values or unexpected distributions, can be indicators of a malfunctioning detector, resulting in poor data quality.

Spotting faulty or ageing components can be done either visually, using instruments such as the LHCb Histogram Presenter, or with the help of automated tools. In order to assist detector experts in handling the vast monitoring information resulting from the sheer size of the detector, we propose a graph-based clustering tool combined with a machine learning algorithm and demonstrate its use by processing histograms representing 2D hitmap events. We prove the concept by detecting ion feedback events in the LHCb experiment's RICH subdetector.

022034
The following article is Open access

The ATLAS detector at the Large Hadron Collider is expected to collect an unprecedented wealth of new data at a completely new energy scale. In particular, its Liquid Argon electromagnetic and hadronic calorimeters will play an essential role in measuring final states with electrons and photons and in contributing to the measurement of jets and missing transverse energy. Efficient monitoring of the data will be crucial from the earliest data taking onward and is implemented at multiple levels of the read-out and triggering systems. By providing essential information about the performance of each partition and its impact on physics quantities, the monitoring will be crucial in guaranteeing that the data are ready for physics analysis. The tools and criteria for monitoring the Liquid Argon data in cosmic-ray data taking will be discussed. The software developed for the monitoring of collision data will be described and the results of the monitoring performance obtained from cosmic-ray data will be presented.

022035
The following article is Open access

The start of collisions at the LHC brings a new era of particle physics and a much improved potential to observe signatures of new physics. Some of these may be evident already from the very beginning of collisions. It is essential at this point in the experiment to be prepared to quickly and efficiently determine the quality of the incoming data. Easy visualization of data for the shift crew and experts is one of the key factors in the data quality assessment process. This paper describes the design and implementation of the Data Quality Monitoring Display and discusses experience from its usage and performance during ATLAS commissioning with cosmic-ray and single-beam data.

022036
The following article is Open access

DAQ-Middleware is a framework for building DAQ systems based on RT-Middleware (Robot Technology Middleware). In recent years it has come into use as one of the DAQ frameworks for next-generation particle physics experiments at KEK. DAQ-Middleware comprises DAQ-Components providing all the necessary basic DAQ functions and is easily extensible, so a DAQ system can easily be constructed by combining these components. As an example, we have developed a DAQ system for the CC/NET [1] using DAQ-Middleware by adding a GUI part and a CAMAC readout part. The CC/NET CAMAC controller was developed to accomplish high-speed read-out of CAMAC data. Its basic design concept is to realize data taking through networks, so it is consistent with the DAQ-Middleware concept. We show how convenient it is to use DAQ-Middleware.

022037
The following article is Open access

At the ATLAS experiment, the Detector Control System (DCS) is used to oversee detector conditions and supervise the running of equipment. It is essential that information from the DCS about the status of individual sub-detectors be extracted and taken into account when determining the quality of data taken and its suitability for different analyses. DCS information is written to the ATLAS conditions database and then summarised to provide a status flag for each sub-detector and displayed on the web. We discuss how this DCS information should be used, and the technicalities of making this summary.

022038
The following article is Open access

The CMS event builder assembles events accepted by the first level trigger and makes them available to the high-level trigger. The event builder needs to handle a maximum input rate of 100 kHz and an aggregated throughput of 100 GB/s originating from approximately 500 sources. This paper presents the chosen hardware and software architecture. The system consists of 2 stages: an initial pre-assembly reducing the number of fragments by one order of magnitude and a final assembly by several independent readout builder (RU-builder) slices. The RU-builder is based on 3 separate services: the buffering of event fragments during the assembly, the event assembly, and the data flow manager. A further component is responsible for handling events accepted by the high-level trigger: the storage manager (SM) temporarily stores the events on disk at a peak rate of 2 GB/s until they are permanently archived offline. In addition, events and data-quality histograms are served by the SM to online monitoring clients. We discuss the operational experience from the first months of reading out cosmic ray data with the complete CMS detector.

022039
The following article is Open access

The CMS Data Acquisition System consists of O(20000) interdependent services. A system providing exception and application-specific monitoring data is essential for the operation of such a cluster. Due to the number of services involved, the amount of monitoring data is higher than a human operator can handle efficiently. Thus moving the expert knowledge for error analysis from the operator to a dedicated system is a natural choice. This reduces the number of notifications to the operator for simpler visualization and provides meaningful error cause descriptions and suggestions for possible countermeasures. This paper discusses an architecture for a workflow-based hierarchical error analysis system based on Guardians for the CMS Data Acquisition System. Guardians provide a common interface for error analysis of a specific service or subsystem. To provide effective and complete error analysis, the requirements regarding information sources, monitoring and configuration are analyzed. Formats for common notification types are defined, and a generic Guardian based on Event-Condition-Action rules is presented as a proof of concept.
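
As an illustration of Event-Condition-Action rules in this spirit (the rule content and notification format below are invented, not the CMS Guardian rule set), the sketch matches incoming notifications against conditions and triggers the corresponding actions.

```python
# Each rule: (event type, condition on the notification, action to perform).
def notify_operator(notification):
    print("OPERATOR:", notification["message"])

def restart_service(notification):
    print("ACTION: restarting", notification["source"])

RULES = [
    ("heartbeat_missing", lambda n: n["missed"] >= 3, restart_service),
    ("error", lambda n: "out of memory" in n["message"].lower(), notify_operator),
]

def guardian(notification):
    """Apply Event-Condition-Action rules to one monitoring notification."""
    for event_type, condition, action in RULES:
        if notification["type"] == event_type and condition(notification):
            action(notification)

guardian({"type": "heartbeat_missing", "source": "ru-builder-07",
          "missed": 4, "message": ""})
guardian({"type": "error", "source": "storage-manager", "missed": 0,
          "message": "Out of memory while allocating fragment buffer"})
```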

022040
The following article is Open access

The ATLAS BPTX stations are composed of electrostatic button pick-up detectors, located 175 m from ATLAS along the beam pipe on both sides of the experiment. The pick-ups are installed as a part of the LHC beam instrumentation and used by ATLAS for timing purposes. The signals from the ATLAS BPTX detectors are used both in the trigger system and in a stand-alone monitoring system for the LHC beams and timing signals. The monitoring software measures the phase between collisions and clock with high accuracy in order to guarantee a stable phase relationship for optimal signal sampling in the sub-detector front-end electronics. It also measures the properties of the individual bunches and the structure of the beams. In this paper, the BPTX monitoring software is described, its algorithms explained and a few example monitoring displays shown. In addition, results from the monitoring system during the first period of single-beam running in September 2008 are presented.

022041
The following article is Open access

The ATLAS detector is undergoing an intense commissioning effort with cosmic rays in preparation for the first LHC collisions in late 2009. Combined runs with all of the ATLAS subsystems are being taken in order to evaluate the detector performance. This is also a unique opportunity for the trigger system to be studied with different detector operation modes, such as different event rates and detector configurations. The ATLAS trigger starts with a hardware-based system which tries to identify detector regions where interesting physics objects may be found (e.g. large transverse energy depositions in the calorimeter system). An accepted event is further processed by more complex software algorithms at the second level, where detailed features are extracted (full-granularity data for small portions of the detector are available). Events accepted at this level are further processed at the so-called event filter level, where full detector data at full granularity are available for offline-like processing with complete calibration to achieve the final decision. This year we could extend our experience by including more algorithms in the second-level and event filter calorimeter triggers. Clustering algorithms for electrons, photons, taus, jets and missing transverse energy are being commissioned during this combined run period. We report the latest results for these algorithms. Issues such as the identification of hot calorimeter regions, the processing time of the algorithms and data access (especially at the second level) are being evaluated. Intense usage of online and quasi-online (during offline reconstruction of runs) monitoring helps to trace and fix problems.

022042
The following article is Open access

The CMS data acquisition system comprises O(20000) interdependent services that need to be monitored in near real-time. The ability to monitor a large number of distributed applications accurately and effectively is of paramount importance for robust operations. Application monitoring entails the collection of a large number of simple and composed values made available by the software components and hardware devices. A key aspect is that detection of deviations from a specified behaviour is supported in a timely manner, which is a prerequisite in order to take corrective actions efficiently. Given the size and time constraints of the CMS data acquisition system, efficient application monitoring is an interesting research problem. We propose an approach that uses the emerging paradigm of Web-service based eventing systems in combination with hierarchical data collection and load balancing. Scalability and efficiency are achieved by a decentralized architecture, splitting up data collections into regions of collections. An implementation following this scheme is deployed as the monitoring infrastructure of the CMS experiment at the Large Hadron Collider. All services in this distributed data acquisition system provide standard web service interfaces via XML, SOAP and HTTP [15,22]. Continuing on this path, we adopted WS-* standards, implementing a monitoring system layered on top of the W3C standards stack. We designed a load-balanced publisher/subscriber system with the ability to include high-speed protocols [10,12] for efficient data transmission [11,13,14] and serving data in multiple data formats.

022043
The following article is Open access

The ATLAS experiment's trigger system is distributed across large farms of approximately 6000 nodes. Online monitoring and data quality assessment run alongside this system. We have designed a mechanism that integrates the monitoring data from different nodes and makes it available to shift crews. This integration includes adding or averaging of histograms, summing of trigger rates, and combination of more complex data types across multiple monitoring processes. Performance milestones have been achieved which ensure that the needs of early data-taking will be met. We present a detailed description of the architectural features and performance.
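
A minimal sketch of the kind of gathering described above, assuming each node reports its histograms as lists of bin contents and its trigger rates as plain numbers: histograms with the same name are added bin by bin and rates are summed across nodes (the report layout is an assumption for illustration).

```python
from collections import defaultdict

def aggregate(per_node_reports):
    """Combine monitoring data from many nodes.

    Each report is {"histograms": {name: [bin contents]}, "rates": {name: Hz}}.
    Histograms with the same name are added bin by bin; rates are summed.
    """
    histograms = defaultdict(list)
    rates = defaultdict(float)
    for report in per_node_reports:
        for name, bins in report["histograms"].items():
            if not histograms[name]:
                histograms[name] = [0.0] * len(bins)
            histograms[name] = [a + b for a, b in zip(histograms[name], bins)]
        for name, hz in report["rates"].items():
            rates[name] += hz
    return dict(histograms), dict(rates)

node_a = {"histograms": {"mu_pt": [3, 5, 1]}, "rates": {"L2_mu20": 12.0}}
node_b = {"histograms": {"mu_pt": [2, 4, 6]}, "rates": {"L2_mu20": 9.5}}
print(aggregate([node_a, node_b]))
```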

022044
The following article is Open access

For the ALICE heavy-ion experiment a large computing cluster will be used to perform the last triggering stages in the High Level Trigger (HLT). For the first year of operation the cluster consisted of about 100 multi-processor nodes with 4 or 8 CPU cores each, to be increased to more than 1000 nodes in the coming years of operation. During the commissioning phases of the detector, the preparations for first LHC beam, as well as the periods of first LHC beam, the HLT has already been used extensively to reconstruct, compress, and display data from the different detectors. For example, the HLT has been used to compress Silicon Drift Detector (SDD) data losslessly by a factor of 15, on the fly, at a rate of more than 800 Hz. For ALICE's Time Projection Chamber (TPC) the HLT has been used to reconstruct tracks online and show the reconstructed tracks in an online event display. The event display can also show online reconstructed data from the Dimuon and Photon Spectrometer (PHOS) detectors. For the latter, a first selection mechanism has also been put into place to forward to the online display only events in which data passed through the PHOS detector. In this contribution we present experiences and results from these commissioning phases.

022045
The following article is Open access

The CMS detector at the LHC is equipped with a high precision electromagnetic crystal calorimeter (ECAL). The crystals undergo a transparency change when exposed to radiation during LHC operation, which recovers in the absence of irradiation on a time scale of hours. This change of the crystal response is monitored with a laser system which performs a transparency measurement of each crystal every twenty minutes. The monitoring data are analyzed on a PC farm attached to the central data acquisition system of CMS. After analyzing the raw data, a reduced data set is stored in the Online Master Data Base (OMDS), which is connected to the online computing infrastructure of CMS. The data stored in OMDS, the largest data set stored in OMDS for ECAL, contain all the information necessary to perform detailed crystal response monitoring as well as an analysis of the dynamics of the transparency change. For the CMS physics event data reconstruction, only a reduced set of information from the transparency measurement is required. This data is stored in the Off-line Reconstruction Conditions Database Off-line subset (ORCOFF). To transfer the data from OMDS to ORCOFF, the reduced data is first transferred to the Off-line Reconstruction Conditions DB On-line subset (ORCON) in a procedure known as Online-to-Offline transfer, which includes various checks for data consistency. In this talk we describe the laser monitoring workflow and the specifics of the database usage for the ECAL laser monitoring system. The strategy implemented to optimize the data transfer and to perform quality checks is presented.

022046
The following article is Open access

Recording and storing the Tile Calorimeter data at a 100 kHz rate is an important task in ATLAS experiment processing. At present, amplitude, time and quality factor (QF) parameters are calculated using the Optimal Filtering reconstruction method. If the QF is considered good enough, only these three parameters are stored; otherwise the data quality is considered bad and it is proposed to store the raw data for further offline analysis. Without any compression, the bandwidth limitation allows up to 9 channels of additional raw data to be sent. Simple considerations show that when the QF is bad due to shape differences between the standard pulse shape and the current signal (e.g. when several signals overlap), all channels are likely to report a bad QF while the contained data may still be valuable. So the possibility of saving just 9 channels is insufficient and the data have to be compressed. Experiments show that standard compression tools such as RAR, ZIP, etc. cannot successfully deal with this problem because they cannot take advantage of the smooth curved shape of the raw data and the correlations between the channels. In the present paper a lossless data compression algorithm is proposed which is likely to better meet the existing challenges. This method has been checked on splash events (run 87851, containing 26 splash events) and proved sufficient to save the data of all channels within the existing bandwidth. Unlike general-purpose compression tools, the proposed method heavily exploits the geometry-dependent correlations between different channels. It is important to note that the method relies only on the assumption that the registered signal shape is smooth enough; it does not require exact information about the standard pulse shape function to compress the data. Thus this method can be applied to recording piled-up or unexpected signals as well.
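
The actual algorithm is not reproduced here; purely to illustrate why smoothness and inter-channel correlation help, the sketch below delta-encodes each channel against a neighbouring channel and then against the previous sample, leaving small residuals that a simple variable-length or entropy coder could pack losslessly. The channel ordering and sample values are invented.

```python
def encode(channels):
    """Lossless transform of a list of equal-length sample lists.

    Channel 0 is kept as-is; every other channel is first expressed relative to
    its predecessor (exploiting channel-to-channel correlation), then as sample-
    to-sample differences (exploiting the smooth pulse shape). All operations
    are integer and exactly invertible.
    """
    residuals = []
    for i, samples in enumerate(channels):
        ref = channels[i - 1] if i > 0 else [0] * len(samples)
        diff = [s - r for s, r in zip(samples, ref)]
        residuals.append([diff[0]] + [diff[k] - diff[k - 1]
                                      for k in range(1, len(diff))])
    return residuals

def decode(residuals):
    channels = []
    for i, res in enumerate(residuals):
        diff = list(res)
        for k in range(1, len(diff)):
            diff[k] += diff[k - 1]
        ref = channels[i - 1] if i > 0 else [0] * len(diff)
        channels.append([d + r for d, r in zip(diff, ref)])
    return channels

raw = [[50, 52, 60, 75, 61, 53], [48, 51, 58, 72, 60, 51]]
assert decode(encode(raw)) == raw
print(encode(raw))   # residuals are much smaller than the raw samples
```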

022047
The following article is Open access

The ATLAS DataFlow infrastructure is responsible for the collection and conveyance of event data from the detector front-end electronics to mass storage. Several optimized and multi-threaded applications fulfill this purpose, operating over a multi-stage Gigabit Ethernet network which is the backbone of the ATLAS Trigger and Data Acquisition System. The system must be able to transport event data efficiently and with high reliability, while providing aggregated bandwidths larger than 5 GByte/s and coping with many thousands of network connections. Routing and streaming capabilities as well as monitoring and data accounting functionalities are also fundamental requirements.

During 2008, a few months of ATLAS cosmic data-taking and the first experience with the LHC beams provided an unprecedented test-bed for the evaluation of the performance of the ATLAS DataFlow, in terms of functionality, robustness and stability. In addition, operating the system far from its design specifications helped exercise its flexibility and contributed to understanding its limitations. Moreover, the integration with the detector and the interfacing with the off-line data processing and management were also able to take advantage of this extended data-taking period.

In this paper we report on the usage of the DataFlow infrastructure during ATLAS data-taking. These results, backed up by complementary performance tests, validate the architecture of the ATLAS DataFlow and prove that the system is robust, flexible and scalable enough to cope with the final requirements of the ATLAS experiment.

022048
The following article is Open access

This contribution gives a thorough overview of the activities of the ATLAS TDAQ SysAdmin group, which administers the TDAQ computing environment supporting the High Level Trigger, Event Filter and other subsystems of the ATLAS detector operating at the LHC collider at CERN. The current installation consists of approximately 1500 netbooted nodes managed by more than 60 dedicated servers, about 40 multi-screen user interface machines installed in the control rooms, and various hardware and service monitoring machines. In the final configuration, the online computer farm will be capable of hosting tens of thousands of applications running simultaneously. The software distribution requirements are met by a two-level NFS-based solution. The hardware and network monitoring systems of ATLAS TDAQ are based on Nagios, with a MySQL cluster behind it for accounting and storing the collected monitoring data, IPMI tools, CERN LANDB and dedicated tools developed by the group, e.g. ConfdbUI. The user management scheme deployed in the TDAQ environment is founded on an LDAP-based authentication and role management system. External access to the ATLAS online computing facilities is provided by means of gateways equipped with an accounting system. Current activities of the group include deployment of a centralized storage system, testing and validating hardware solutions for future use within the ATLAS TDAQ environment including new multi-core blade servers, developing GUI tools for user authentication and role management, testing and validating 64-bit OS, and upgrading the existing TDAQ hardware components, authentication servers and gateways.