Table of contents

Volume 331

2011

Accepted papers received: 20 November 2011
Published online: 23 December 2011

Contributed

022001

PIC, the Port d'Informació Científica in Barcelona, Spain, has provisioned a compact, highly efficient Data Centre Module in order to expand its CPU servers at a minimal energy cost. The design aims are to build an enclosure of 30 square meters or less and to equip it with commodity data centre components (for example, standard gas-expansion air conditioners) which can host 80 kW of CPU servers with a PUE below 1.7 (to be compared with PIC's legacy computer room, which has an estimated PUE of 2.3). Forcing the use of commodity components has led to an interesting design, where for example the raised floor is used as an air duct rather than to route cables, resulting in an "air conditioner which computes". The module is instrumented with many thermometers whose data will be used for comparison with computer-room simulation programs. In addition, each electrical circuit has an electric meter, yielding detailed data on power consumption. The paper presents the first experience from operating the module. Although the module has a slightly different geometry from a "container", the results can be directly applied to containers as well.
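
As a rough illustration of the PUE figures quoted above (a minimal sketch with hypothetical circuit names and readings, not PIC's monitoring code), the Power Usage Effectiveness is simply the ratio of total facility power to the power drawn by the IT equipment:

    # Minimal sketch: computing PUE from per-circuit electric meter readings.
    # Circuit names and numbers are hypothetical.
    circuits_kw = {
        "cpu_servers": 80.0,       # IT load: the CPU servers hosted in the module
        "air_conditioning": 48.0,  # cooling overhead
        "ups_losses": 4.0,         # power-distribution overhead
    }

    it_power = circuits_kw["cpu_servers"]
    total_power = sum(circuits_kw.values())
    pue = total_power / it_power
    print(f"PUE = {pue:.2f}")  # 1.65 for these example numbers, below the 1.7 design target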

022002

The complexity and extreme parameters of the LHC, such as the stored energy, the collision frequency, the high risk of adverse background conditions and potentially damaging beam losses, have demanded an unprecedented level of connectivity between the operation of the accelerator and the experiments, at both the hardware and the software level.

LHCb has been at the forefront of developing a software framework and hardware which connects to all of the LHC communication interfaces for timing, control and monitoring of the machine and beam parameters, in addition to its own local systems for beam and background monitoring. The framework also includes failsafe connectivity with the beam interlock system.

The framework drives the global operation of the detector and is integrated into the readout control. It provides the shifters with the tools needed to take fast and well-guided decisions to run the LHCb experiment safely and efficiently. In particular, it has allowed the detector to be operated with only two shifters already at the LHC pilot run. The requirements include reliability and clarity for the shifters, and the possibility to retrieve the past conditions for offline analysis. All essential parameters are archived and an interactive analysis tool has been developed which provides overviews of the experimental performance and which allows post-analysis of any anomaly in the operation.

This paper describes the architecture of the framework and its main functions, including the basis of the automation of the LHCb operational procedure and detector controls, the information exchange between LHCb and the LHC, and finally the shifter and expert tools for monitoring the experimental conditions.

022003

We report our experience in migrating STAR's Online Services (Run Control System, Data Acquisition System, Slow Control System and Subsystem Monitoring) from direct read/write database accesses to a modern non-blocking, message-oriented infrastructure. Based on the Advanced Message Queuing Protocol (AMQP) standard, this novel approach does not prescribe the message data structure, allowing great flexibility in its use. After careful consideration, we chose Google Protocol Buffers as our primary (de)serialization format for structured data exchange. This migration allows us to reduce the overall system complexity and greatly improve the reliability of the metadata collection and the performance of our online services in general. We present this new framework through an overview of its software architecture, providing details about our staged, non-disruptive migration process as well as the implementation of pluggable components that allow future improvements without compromising the stability and availability of services.
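
As a rough sketch of the message-oriented pattern described above (not the STAR code: the broker address, exchange and routing key are invented, and a plain byte string stands in for a Protocol Buffers SerializeToString() payload), a producer using the Python AMQP client pika could look like this:

    import pika  # AMQP 0-9-1 client library

    # Hypothetical broker and exchange names; in the real system the body would be
    # a Google Protocol Buffers message serialized with SerializeToString().
    payload = b"\x08\x01\x12\x04STAR"  # placeholder bytes for a serialized status record

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.exchange_declare(exchange="star.online", exchange_type="topic")

    # Consumers subscribe to the topics they care about instead of
    # reading from and writing to the database directly.
    channel.basic_publish(exchange="star.online",
                          routing_key="slowcontrols.magnet.current",
                          body=payload)
    connection.close()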

022004

The Compact Muon Solenoid (CMS) experiment has developed an electrical implementation of the S-LINK64 extension (Simple Link Interface 64 bit) operating at 400 MB/s in order to read out the detector. This paper studies a possible replacement of the existing S-LINK64 implementation by an optical link based on 10 Gigabit Ethernet, in order to provide larger throughput, replace aging hardware and simplify the architecture. A prototype transmitter unit has been developed based on the Altera PCI Express FPGA Development Kit with custom firmware. A standard PC acted as the receiving unit. The data transfer has been implemented on a stack of protocols: RDP over IP over Ethernet. This allows the data to be received by standard hardware components such as PCs, network switches and NICs. The first tests proved that the basic exchange of packets between the transmitter and the receiving unit works. The paper summarizes the status of these studies.

022005

Since the start of the LHC physics programme earlier this year, the ATLAS detector has been collecting proton-proton collisions at a center-of-mass energy of 7 TeV. As the LHC luminosity rises, the ATLAS trigger system must become increasingly selective to reduce the event rate from the design bunch-crossing rate of 40 MHz to the 200 Hz recorded for offline analysis. To achieve this goal, the trigger algorithms must meet challenging requirements in terms of speed and selectivity.

The ATLAS trigger is hardware based at level-1 and uses software algorithms running on a farm of commercial processors at the next two trigger levels. The calorimeter-based software algorithms perform the selection of electrons, photons, jets, taus and also events with missing transverse energy. We present the physics performance achieved during 2010 data taking, highlighting the achievements of the different signatures. Event features reconstructed by the Trigger are compared with offline reconstruction and with expectations from MC simulations. Rate stability, processing time and data access performance during different periods of data-taking are also discussed. The results presented demonstrate that the calorimeter based trigger is effective in selecting data for the ATLAS physics programme.

022006

The CBM experiment at the upcoming FAIR accelerator aims to create the highest baryon densities in nucleus-nucleus collisions and to explore the properties of super-dense nuclear matter. Event rates of 10 MHz are needed for high-statistics measurements of rare probes, while event selection requires complex global triggers such as a secondary-vertex search. To meet these demands, the CBM experiment uses self-triggered detector front-ends and a data-push readout architecture.

The First-level Event Selector (FLES) is the central physics selection system in CBM. It receives all hits and performs online event selection on the 1 TByte/s input data stream. The event selection process requires high-throughput event building and full event reconstruction using fast, vectorized track reconstruction algorithms. The current FLES architecture foresees a scalable high-performance computer. To achieve the high throughput and computation efficiency, all available computing devices will have to be used, in particular FPGAs at the first stages of the system and heterogeneous many-core architectures such as CPUs for efficient track reconstruction. A high-throughput network infrastructure and flow control in the system are other key aspects. In this paper, we present the foreseen architecture of the First-level Event Selector.

022007

In this paper we report on the operation and the performance of the ATLAS data-flow system during the 2010 physics run of the Large Hadron Collider (LHC) at 7 TeV. The data-flow system is responsible for reading out, formatting and conveying the event data, eventually saving the selected events into mass storage. By the second quarter of 2010, for the first time, the system became capable of full event building and of an improved data-logging throughput.

We will in particular detail the tools put in place to predict and track the system working point, with the aim of optimizing the bandwidth and the sharing of computing resources, and of anticipating possible limits. Naturally, the LHC duty cycle, the trigger performance and the detector configuration influence the system working point. Therefore, numerical studies of the data-flow system capabilities have been performed considering different scenarios. This is crucial for the first phase of LHC operations, where variable running conditions are anticipated due to the ongoing trigger commissioning and the detector and physics performance studies. The exploitation of these results requires knowing and tracking the system working point, as defined by a set of many different operational parameters, e.g. rates, throughput and event size. Dedicated tools fulfill this mandate, providing integrated storage and visualization of the data-flow and network operational parameters.
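
The working-point bookkeeping reduces to simple relations between the operational parameters named above; a minimal sketch with invented numbers (placeholders, not measured ATLAS values) is:

    # Sketch: the data-flow working point ties together rate, event size and throughput.
    # All numbers below are illustrative placeholders.
    level1_rate_hz = 50_000        # events per second passed by the first trigger level
    mean_event_size_mb = 1.5       # megabytes per event

    event_building_gb_s = level1_rate_hz * mean_event_size_mb / 1000.0
    print(f"Event-building throughput: {event_building_gb_s:.1f} GB/s")

    recording_rate_hz = 200        # events per second written to mass storage
    logging_mb_s = recording_rate_hz * mean_event_size_mb
    print(f"Data-logging throughput: {logging_mb_s:.0f} MB/s")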

022008

The LHCb Data Acquisition system will be upgraded to address the requirements of a 40 MHz readout of the detector. It is not obvious that a simple scale-up of the current system will be able to meet these requirements. In this work we are therefore re-evaluating various architectures and technologies using a uniform test-bed and software framework.

InfiniBand is a rather uncommon technology in the domain of High Energy Physics data acquisition. It is currently used mainly in cluster-based architectures. It has, however, interesting features which justify our interest: large bandwidth with low latency, minimal overhead and a rich protocol suite.

An InfiniBand test-bed has been set up, with a software interface between the core software of the event-builder and the software related to the communication protocol. This allows us to run the same event-builder over different technologies for comparison.
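
A minimal sketch of such an interface layer (hypothetical class and method names, not the LHCb framework) is shown below; the event-builder core only sees the abstract interface, while an InfiniBand or 10 Gigabit Ethernet back-end implements it:

    from abc import ABC, abstractmethod

    class Transport(ABC):
        """Abstract interface separating the event-builder core from the network technology."""

        @abstractmethod
        def send(self, destination: int, fragment: bytes) -> None: ...

        @abstractmethod
        def receive(self) -> bytes: ...

    class LoopbackTransport(Transport):
        """In-memory placeholder; an InfiniBand verbs or 10 GbE back-end would plug in here."""
        def __init__(self):
            self.queue = []
        def send(self, destination: int, fragment: bytes) -> None:
            self.queue.append(fragment)   # stand-in for a real network send
        def receive(self) -> bytes:
            return self.queue.pop(0)

    def build_event(transport: Transport, n_sources: int) -> bytes:
        """The event-builder core is written once, against the abstract interface."""
        return b"".join(transport.receive() for _ in range(n_sources))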

We will present the test-bed architecture and the performance of the different entities of the system, sources and destinations, according to their implementation. These results will be compared with results from a 10 Gigabit Ethernet test-bed.

022009

The Compact Muon Solenoid (CMS) experiment at CERN is a multi-purpose experiment designed to exploit the physics of proton-proton collisions at the Large Hadron Collider collision energy (14 TeV in the centre of mass) over the full range of expected luminosities (up to 10^34 cm^-2 s^-1). The CMS detector control system (DCS) ensures a safe, correct and efficient operation of the detector so that high-quality physics data can be recorded. The system is also required to allow operation of the detector with a small crew of experts, who also take care of the maintenance of its software and hardware infrastructure. The subsystems together sum up to more than a million parameters that need to be supervised by the DCS. A cluster of roughly 100 servers is used to provide the required processing resources. A scalable approach has been chosen, factorizing the DCS system as much as possible. CMS DCS has made a clear division between its computing resources and functionality by creating a computing framework that allows functional components to be plugged in. DCS components are developed by the subsystem expert groups while the computing infrastructure is developed centrally. To ensure the correct operation of the detector, DCS organizes the communication between the accelerator and the experiment systems, making sure that the detector is in a safe state during hazardous situations and is fully operational when stable conditions are present. This paper describes the current status of the CMS DCS, focusing on operational aspects and the role of DCS in this communication.

022010

The supervisory level of the Detector Control System (DCS) of the CMS experiment is implemented using Finite State Machines (FSM), which model the behaviours and control the operations of all the sub-detectors and support services. The FSM tree of the whole CMS experiment consists of more than 30,000 nodes. An analysis of a system of such size is a complex task but is a crucial step towards the improvement of the overall performance of the FSM system. This paper presents the analysis of the CMS FSM system using the micro Common Representation Language 2 (mCRL2) methodology. Individual mCRL2 models are obtained for the FSM systems of the CMS sub-detectors using the ASF+SDF automated translation tool. Different mCRL2 operations are applied to the mCRL2 models. An mCRL2 simulation tool is used to examine the system more closely. Visualization of a system based on the exploration of its state space is enabled with an mCRL2 tool. Requirements such as command and state propagation are expressed using the modal mu-calculus and checked using a model-checking algorithm. For checking local requirements such as freedom from endless loops, the Bounded Model Checking technique is applied. This paper discusses these analysis techniques and presents the results of their application to the CMS FSM system.
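
As an illustration of the kind of requirement mentioned above (a generic example with hypothetical action names, not one of the actual CMS properties), command propagation of an ON command can be written in the regular-formula flavour of the modal mu-calculus used by mCRL2 as

    $[\,\mathrm{true}^{*}\cdot \mathit{send\_ON}\,]\;\mu Y.\,\bigl([\,\overline{\mathit{state\_ON}}\,]\,Y \,\wedge\, \langle\mathrm{true}\rangle\mathrm{true}\bigr)$

i.e. after any trace ending in a send_ON command, every continuation must eventually perform the corresponding state_ON transition and cannot deadlock before doing so.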

022011

The High Level Trigger for the ALICE experiment at the LHC is a powerful, sophisticated tool aimed at compressing the raw data volume and issuing selective triggers for events with desirable physics content. In its current state it integrates information from all major ALICE detectors, i.e. the inner tracking system, the time projection chamber, the electromagnetic calorimeters, the transition radiation detector and the muon spectrometer, performing real-time event reconstruction.

The engine behind the HLT is a high-performance computing cluster of several hundred nodes. It has to reduce the data rate from 25 GB/s to 1.25 GB/s to fit the DAQ mass-storage bandwidth. The cluster is served by a full Gigabit Ethernet network, in addition to an InfiniBand backbone network. To cope with the great challenge of Pb+Pb collisions in autumn 2010, its performance capabilities are being enhanced with the addition of new nodes. Towards the same end, the first GPU co-processors are in place.

During the first period of data taking with p+p collisions, the HLT was extensively used to reconstruct, analyze and display data from the various participating detectors. Among other tasks it contributed to the monitoring of the detector performance, selected events for calibration and efficiency studies, and estimated primary and secondary vertices from p+p collisions, identifying V0 topologies. The experience gained during these first months of online operation will be presented.

022012

The performance of the ATLAS muon trigger has been evaluated using data from proton-proton collisions at a centre-of-mass energy of 7 TeV, with a total integrated luminosity of approximately 94 nb^-1. This paper presents the results of the study for the individual algorithms composing the muon trigger, with respect to efficiency, resolution and rate.

022013

ATLAS is one of the four LHC experiments, which started operating in collision mode in 2010. The ATLAS apparatus itself, as well as the Trigger and DAQ systems, are extremely complex facilities which have been built up by a collaboration of 144 institutes from 33 countries. The effective running of the experiment is supported by a large number of experts distributed all over the world. This paper describes the online remote monitoring system which has been developed in the ATLAS Trigger and DAQ (TDAQ) community in order to support efficient participation of experts from remote institutes in the exploitation of the experiment. The facilities provided by the remote monitoring system range from web-based access to the general status and data quality of the ongoing data-taking session, to a scalable service providing real-time mirroring of the detailed monitoring data from the experimental area to dedicated computers in the CERN public network, where these data are made available to remote users through the same set of software tools as used in the main ATLAS control room. The remote monitoring facilities were put in place in 2009 to support the ATLAS commissioning and have been improved for the first collision runs based on the feedback received from users. The remote monitoring system is now in a mature state and is actively used by the ATLAS collaboration for running the experiment.

022014

"Online monitor framework" is a new general software framework for online data monitoring, which provides a way to collect information from online systems, including data acquisition, and displays them to shifters far from experimental sites. "Monitor Server", a core system in this framework gathers the monitoring information from the online subsystems and the information is handled as collections of histograms named "Histogram Package". Monitor Server broadcasts the histogram packages to "Monitor Viewers", graphical user interfaces in the framework. We developed two types of the viewers with different technologies: Java and web browser. We adapted XML based file for the configuration of GUI components on the windows and graphical objects on the canvases. Monitor Viewer creates its GUIs automatically with the configuration files.This monitoring framework has been developed for the Double Chooz reactor neutrino oscillation experiment in France, but can be extended for general application to be used in other experiments. This document reports the structure of the online monitor framework with some examples from the adaption to the Double Chooz experiment.

022015

The Belle collaboration has been trying for 10 years to reveal the mystery of the current matter-dominated universe. However, much more statistics is required to search for New Physics through quantum loops in decays of B mesons. In order to increase the experimental sensitivity, the next-generation B-factory, SuperKEKB, is planned. The design luminosity of SuperKEKB is 8 x 10^35 cm^-2 s^-1, a factor of 40 above KEKB's peak luminosity. At this high luminosity, the level 1 trigger of the Belle II experiment will stream events of 300 kB size at a 30 kHz rate. To reduce the data flow to a manageable level, a high-level trigger (HLT) is needed, which will be implemented using the full offline reconstruction on a large-scale PC farm. There, physics-level event selection is performed, reducing the event rate by a factor of ∼10 to a few kHz. To execute the reconstruction the HLT uses the offline event-processing framework basf2, which has parallel processing capabilities used for multi-core processing and PC clusters. The event data handling in the HLT is fully object-oriented, utilizing ROOT I/O with a new method of object passing over UNIX socket connections. Also under consideration is the use of the HLT output to reduce the pixel-detector event size by saving only hits associated with a track, resulting in an additional data reduction of ∼100 for the pixel detector. In this contribution, the design and implementation of the Belle II HLT are presented together with a report of preliminary testing results.
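
To illustrate only the object-passing idea (a toy sketch: the real HLT streams ROOT-serialized objects via ROOT I/O inside basf2, whereas this example uses Python's pickle and an invented event dictionary), a length-prefixed object exchange over a UNIX socket pair looks like this:

    import pickle
    import socket

    # Two connected UNIX-domain sockets, standing in for the HLT's inter-process links.
    parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    event = {"evt_no": 1, "tracks": [(0.3, 1.2), (2.1, -0.7)]}  # stand-in for an event object
    payload = pickle.dumps(event)                  # ROOT I/O serialization in the real system
    parent.sendall(len(payload).to_bytes(4, "big") + payload)   # length-prefixed message

    size = int.from_bytes(child.recv(4), "big")
    buf = b""
    while len(buf) < size:
        buf += child.recv(size - len(buf))
    print(pickle.loads(buf)["evt_no"])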

022016

The LHCb experiment is considering an upgrade towards a trigger-free Data Acquisition (DAQ) system. All detector information will be read out at the full collision rate (40 MHz) and sent to a large processing farm which performs event building and filtering. According to physics simulation results, the DAQ system requires a large event-builder network (also called the DAQ network) with an aggregate bandwidth of more than 34 Tb/s.
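
The quoted aggregate bandwidth follows directly from the readout rate and the event size; a back-of-the-envelope sketch (the ~100 kB average event size used here is an illustrative assumption, not a number from this paper) is:

    # Back-of-the-envelope check of the event-builder network bandwidth.
    readout_rate_hz = 40e6         # trigger-free readout at the LHC bunch-crossing rate
    avg_event_size_bytes = 100e3   # assumed average event size of ~100 kB
    aggregate_tbps = readout_rate_hz * avg_event_size_bytes * 8 / 1e12
    print(f"Aggregate event-builder bandwidth: ~{aggregate_tbps:.0f} Tb/s")  # ~32 Tb/s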

In this paper we present a new architecture for the event-builder network, which is a multistage network using low-cost 10-Gigabit Ethernet switches. To evaluate the performance of the full-scale system, several simulations have been performed. This paper describes different DAQ protocols and the simulation developed for the event-builder network based on the OMNeT++ framework. The buffer occupancy of the switches has been studied.

022017

The message logging system provides the infrastructure for all of the distributed processes in the data acquisition (DAQ) to report status messages of various severities in a consistent manner to a central location, as well as providing the tools for displaying and archiving the messages. The message logging system has been developed over a decade and has been run successfully in the CDF and CMS experiments. The most recent work on the message logging system has been to build it as a stand-alone package named MessageFacility, which works with any generic framework or application, with NOνA as the first driving user. The system design and architecture, as well as the effort of making it a generic library, are discussed. We also present new features that have been added.

022018

The Double Chooz experiment searches for the neutrino mixing angle θ13 using two identical detectors and reactor neutrinos. Each detector is composed of three sub-detectors, which have 390, 78, and approximately 3000 channels, respectively. In order to acquire the pulse shapes of the photomultipliers, two kinds of Flash-ADCs are used for two of the sub-detectors. Although the trigger rate from physics processes is not high, the event size of about 1 MB per event due to the FADC readout is not small. The data acquisition has to be controlled from outside the laboratory site, because the detector is located in a nuclear power station and access to the laboratory is restricted. For these reasons, we need deadtime-free DAQ and trigger systems with a low energy threshold, and remote control and monitoring systems for all online processes. This paper presents the online data acquisition and control systems of the Double Chooz experiment.

022019

The ATLAS Trigger and Data Acquisition (TDAQ) infrastructure is responsible for filtering and transferring ATLAS experimental data from the detectors to the mass storage systems. It relies on a large, distributed computing system composed of thousands of software applications running concurrently. In such a complex environment, information sharing is fundamental for controlling application behaviour, error reporting and operational monitoring. During data taking, the streams of messages sent by applications and the data published via information services are constantly monitored by experts to verify the correctness of running operations and to understand problematic situations. To simplify and improve system analysis and error detection, we developed the TDAQ Analytics Dashboard, a web application that aims to collect, correlate and effectively visualize this real-time flow of information. The TDAQ Analytics Dashboard is composed of two main entities that reflect the twofold scope of the application. The first is the engine, a Java service that performs aggregation, processing and filtering of the real-time data stream and computes statistical correlations on sliding windows of time. The results are made available to clients via a simple web interface supporting an SQL-like query syntax. The second is the visualization, provided by an Ajax-based web application that runs in the client's browser. The dashboard approach allows information to be presented in a clear and customizable structure. Several types of interactive graphs are provided as widgets that can be dynamically added to and removed from the visualization panels. Each widget acts as a client for the engine, querying the web interface to retrieve data with the desired criteria. In this paper we present the design, development and evolution of the TDAQ Analytics Dashboard. We also present the statistical analysis computed by the application during this first period of high-energy data-taking operations for the ATLAS experiment.
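
A minimal sketch of the sliding-window correlation idea (with hypothetical metric names; the dashboard's actual engine is a Java service) is shown below:

    from collections import deque
    from math import sqrt

    WINDOW = 60                      # samples kept, e.g. one per second for a one-minute window
    rate = deque(maxlen=WINDOW)      # hypothetical metric: trigger-rate samples
    busy = deque(maxlen=WINDOW)      # hypothetical metric: readout busy-fraction samples

    def pearson(xs, ys):
        """Pearson correlation coefficient of two equally long samples."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sqrt(sum((x - mx) ** 2 for x in xs))
        sy = sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy) if sx and sy else 0.0

    def on_sample(r, b):
        """Called for every new monitoring sample; returns the current windowed correlation."""
        rate.append(r)
        busy.append(b)
        return pearson(rate, busy) if len(rate) > 1 else 0.0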

022020

The architecture of the Compact Muon Solenoid (CMS) Level-1 Trigger Control and Monitoring software system is presented. This system has been installed and commissioned on the trigger online computers and is currently used for data taking. It has been designed to handle the trigger configuration and monitoring during data taking as well as all communications with the main run control of CMS. Furthermore, its design foresees the provision of the software infrastructure for detailed testing of the trigger system during beam downtime. This is a medium-size distributed system that runs over 40 PCs with 200 processes that control about 4000 electronics boards. The architecture of this system is described using the industry-standard Unified Modeling Language (UML). This way the relationships between the different subcomponents of the system become clear and all software upgrades and modifications are simplified. The described architecture has allowed for frequent upgrades that were necessary during the commissioning phase of CMS, when the trigger system evolved constantly. As a secondary objective, the paper provides a UML usage example and tries to encourage the standardization of the software documentation of large projects across the LHC and High Energy Physics community.

022021

The data-acquisition system of the CMS experiment at the LHC performs the read-out and assembly of events accepted by the first level hardware trigger. Assembled events are made available to the high-level trigger which selects interesting events for offline storage and analysis. The system is designed to handle a maximum input rate of 100 kHz and an aggregated throughput of 100 GB/s originating from approximately 500 sources. An overview of the architecture and design of the hardware and software of the DAQ system is given. We discuss the performance and operational experience from the first months of LHC physics data taking.
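
These design figures imply an average event size of about 1 MB and per-source fragments of roughly 2 kB; a quick sketch of the arithmetic, using only the numbers quoted above, is:

    # Arithmetic implied by the quoted CMS DAQ design figures.
    input_rate_hz = 100e3        # maximum Level-1 accept rate
    throughput_bytes_s = 100e9   # aggregated throughput of 100 GB/s
    n_sources = 500              # approximate number of read-out sources

    avg_event_size = throughput_bytes_s / input_rate_hz    # bytes per built event
    avg_fragment_size = avg_event_size / n_sources         # bytes per source fragment
    print(f"Average event size:    {avg_event_size / 1e6:.1f} MB")     # ~1 MB
    print(f"Average fragment size: {avg_fragment_size / 1e3:.1f} kB")  # ~2 kB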

022022

The CMS experiment is one of the large experiments at the LHC at CERN. The CMS online data quality monitoring (DQM) system comprises a number of software components for the distribution, processing and visualization of event data. Already before 2010, the system had been successfully developed, deployed and operated in previous data challenges with cosmic-ray muons and LHC beams. In preparation for the LHC data-taking period of 2010, the performance and efficiency of the infrastructure were evaluated and a number of improvements were implemented, with the goal of improving the robustness and data throughput of the system and of minimizing the operation and maintenance effort. In this report the main considerations and the achieved improvements are described.

022023

We report on the progress and status of DAQ-Middleware, a software framework for distributed data acquisition systems. We made improvements to DAQ-Middleware and released the package (version 1.0.0) in August 2010. We describe here its improvements in performance and the component development method we used. We also report on the current status of DAQ-Middleware for use at the Materials and Life Science Experimental Facility (MLF) of the Japan Proton Accelerator Research Complex (J-PARC).

022024

Muon detection plays a key role at the Large Hadron Collider. The ATLAS Muon Spectrometer includes Monitored Drift Tubes (MDT) and Cathode Strip Chambers (CSC) for precision momentum measurement in the toroidal magnetic field. Resistive Plate Chambers (RPC) in the barrel region, and Thin Gap Chambers (TGC) in the end-caps, provide the level-1 trigger and a second coordinate used for tracking in conjunction with the MDT.

The Detector Control System of each subdetector is required to monitor and safely operate tens of thousands of channels, distributed over several subsystems, including low- and high-voltage power supplies, trigger and front-end electronics, current and threshold monitoring, alignment and environmental sensors, and the gas and electronics infrastructure. The system is also required to provide a level of abstraction for ease of operation, as well as expert-level actions and detailed analysis of the archived data.

The hardware architecture and the software solutions adopted are shown, along with results from the commissioning phase and from routine operation with colliding beams at 3.5 + 3.5 TeV. Design peculiarities of each subsystem and their use to monitor the detector and accelerator performance are discussed, along with the effort towards a simple and coherent operation in a running experiment. The material presented can serve as a basis for future test facilities and projects.

022025

With the growth in size and complexity of High Energy Physics experiments, and the accompanying increase in the number of collaborators spread across the globe, the importance of widely relaying timely monitoring and status information has grown. To this end, we present the online Web Based Monitoring solutions of the CMS experiment at CERN. The web tools developed present data to the user from many underlying heterogeneous sources, from real-time messaging systems to relational databases. They provide the ability to combine and correlate data of interest to the experimentalist, such as beam conditions, luminosity, trigger rates and detector conditions, in both graphical and tabular formats, allowing for flexibility on the user side. We also present some examples of how this system has been used during CMS commissioning and early beam-collision running at the Large Hadron Collider.

022026

In March 2010 ATLAS saw the first proton-proton collisions at a center-of-mass energy of 7 TeV. Within the year, a collision rate of nearly 10 MHz was achieved. At ATLAS, events of potential physics interest are selected by a three-level trigger system, with a final recording rate of about 200 Hz. The first level (LVL1) is implemented in customized hardware; the two levels of the high level trigger (HLT) are software triggers. For the ATLAS physics program more than 500 trigger signatures are defined. The HLT Steering is responsible for testing each signature on each LVL1-accepted event; the test outcome is recorded with the event for later analysis. The steering code also ensures the independence of each signature test and an unbiased trigger decision. To minimize data readout and execution time, cached detector data and once-calculated trigger objects are reused to form the decision. In order to reduce the output rate and further limit the execution time, some signature tests are performed only on an already down-scaled fraction of candidate events. For some of these signatures it is important for physics analysts to know the would-be decision on the event fraction that was not analysed due to the down-scaling. For this the HLT Steering is equipped with a test-after-accept feature. The HLT Steering receives the setup of the signatures from the trigger configuration system. This system dynamically provides the online setup for the LVL1 and the HLT. It also archives the trigger configuration for analysis, which is crucial for understanding trigger efficiencies. We present the performance of the steering and configuration system during the first collisions and the expectations for LHC phase 1.
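
A minimal sketch of the generic down-scaling (prescale) and test-after-accept bookkeeping described above (illustrative logic only, not the ATLAS HLT Steering code) is:

    import random

    def run_signature(event):
        """Stand-in for an expensive signature test on an event."""
        return random.random() < 0.1   # pretend 10% of candidates pass

    def steer(event, prescale, state, accepted_by_other_signature):
        """Only every N-th candidate is fully tested and allowed to accept the event.
        For candidates skipped by the prescale, the would-be decision is still evaluated
        and recorded, but only on events accepted anyway by another signature, so the
        down-scaling keeps its effect on rate and execution time."""
        state["counter"] += 1
        if state["counter"] % prescale == 0:
            passed = run_signature(event)
            return passed, passed              # (accept decision, recorded outcome)
        would_be = run_signature(event) if accepted_by_other_signature else None
        return False, would_be                 # prescaled out: never accepts the event

    state = {"counter": 0}
    accept, recorded = steer({"id": 42}, prescale=10, state=state,
                             accepted_by_other_signature=True)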

022027

This first year of data taking has been of great interest, not only for the physics outcome, but also for operating the system in the environment it was designed for. The online data quality monitoring framework (DQMF) is a highly scalable distributed framework which is used to assess the operational conditions of the detector and the quality of the data. DQMF provides quick feedback to the user about the functioning and performance of the sub-detectors by performing over 75,000 advanced data quality checks, with rates varying depending on the histogram update frequency. The DQM display (DQMD) is the visualisation tool with which histograms and their data quality assessments can be accessed. It allows for great flexibility in displaying histograms, their references when applicable, the configurations used for the automatic checks, data quality flags and much more. The DQM configuration is stored in a database that can be easily created and edited with the DQM Configurator tool (DQMC). This paper describes the design and implementation of the DQMF and its display, as well as the data quality performance achieved during this first year of data taking.

022028

ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the Quark-Gluon Plasma at the CERN Large Hadron Collider (LHC). A large-bandwidth and flexible Data Acquisition System (DAQ) has been designed and deployed to collect sufficient statistics in the short running time available per year for heavy ions and to accommodate the very different requirements originating from the 18 sub-detectors. After several months of data taking with beam, a great deal of experience has been accumulated and some important developments have been initiated in order to evolve towards a more automated and reliable experiment. We will present the experience accumulated so far and the new developments. Several upgrades of existing ALICE detectors or additions of new ones have also been proposed, with a significant impact on the DAQ. We will review these proposals, their implications for the DAQ and the way they will be addressed.

022029

The control system of each of the four major experiments at the CERN Large Hadron Collider (LHC) is distributed over up to 160 computers running either Linux or Microsoft Windows. A quick response to abnormal situations in the computer infrastructure is crucial to maximize the physics usage. For this reason, a tool was developed to supervise such a large system, identify errors and troubleshoot problems. Although the monitoring of the performance of the Linux computers and their processes has been available since the first versions of the tool, it is only recently that the software package has been extended to provide similar functionality for the nodes running Microsoft Windows, as this platform is the most commonly used in the LHC detector control systems. In this paper, the architecture and the functionality of the Windows Management Instrumentation (WMI) client developed to provide centralized monitoring of the nodes running different flavours of the Microsoft platform, as well as the interface to the SCADA software of the control systems, are presented. The tool is currently being commissioned by the experiments and it has already proven to be very efficient in optimizing the running systems and in detecting misbehaving processes or nodes.

022030

ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). The online Data Quality Monitoring (DQM) is a key element of the Data Acquisition software chain. It provides shifters with precise and complete information to quickly identify and overcome problems, and consequently to ensure the acquisition of high-quality data. DQM typically involves the online gathering, the analysis by user-defined algorithms and the visualization of the monitored data.

This paper describes the final design of ALICE's DQM framework, called AMORE (Automatic MOnitoRing Environment), as well as its latest and upcoming features, such as the integration with the offline analysis and reconstruction framework, a better use of multi-core processors through a parallelization effort, and its interface with the eLogBook. The concurrent collection and analysis of data in an online environment require the framework to be highly efficient, robust and scalable. We will describe what has been implemented to achieve these goals and the procedures we follow to ensure appropriate robustness and performance.

Finally, we review the wide range of uses of this framework, from the basic monitoring of a single sub-detector to the most complex ones within the High Level Trigger farm or using the Prompt Reconstruction, and we describe the various ways of accessing the monitoring results. We conclude with our experience, before and after the LHC startup, of monitoring the data quality in a challenging environment.

022031

Feasibility studies into the use of General Purpose GPUs (GPGPUs) have been performed on two key algorithms in the ATLAS High Level Trigger. A GPGPU-based version of the Z-finder vertex-finding algorithm resulted in an over 35 times speed-up over serial CPU execution in the best-case scenario, whilst a speed-up of over 5 times was observed for a GPGPU-based Kalman filter track-finder routine. The approaches taken for converting these algorithms for execution on GPU devices are described.

Poster

022032

The ATLAS experiment at the Large Hadron Collider at CERN relies on a complex and highly distributed Trigger and Data Acquisition (TDAQ) system to gather and select particle collision data at unprecedented energies and rates. The main interaction point between the operator in charge of the data taking and the TDAQ system is the Integrated Graphical User Interface (IGUI). The tasks of the IGUI can be coarsely grouped into three categories: system status monitoring, control and configuration. Status monitoring implies the presentation of the global status of the TDAQ system and of the ATLAS run, as well as the visualization of errors and other messages generated by the system; control includes the functionality to interact with the TDAQ Run Control and Expert System; configuration implies the possibility to view the current status of the TDAQ system configuration and to modify some of its parameters. This paper describes the IGUI design and implementation. Particular emphasis is given to the design choices taken to address the main performance and functionality requirements.

022033

A general-purpose FPGA-based DAQ module has been developed, based on a Virtex-4 FPGA. It is able to acquire up to 1024 different channels distributed over 10 slave cards. The module has an optical interface, an RS-232 port, a USB port and a Gigabit interface. The KLOE-2 experiment is going to use it to collect data from the Inner Tracker and the QCALT. An embedded processor (PowerPC 604) is present on the FPGA, and a telnet server has been developed and installed. A new general-purpose data-taking system based on this module has been developed to acquire data from the Inner Tracker. The system is currently operating at LNF (Laboratori Nazionali di Frascati).

022034

ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN Large Hadron Collider (LHC). It includes 18 different sub-detectors and 5 online systems, each made of many different components and developed by different teams inside the collaboration. The operation of a large experiment over several years, to collect billions of events acquired in well-defined conditions, requires predictability and repeatability of the experiment configuration. The logistics of the operation is also a major issue, and it is mandatory to reduce the size of the shift crew needed to operate the experiment. Appropriate software tools are therefore needed to automate daily operations, minimizing human errors and maximizing the data-taking time. The ALICE Configuration Tool (ACT) is ALICE's first step towards a high level of automation, implementing automatic configuration and calibration of the sub-detectors and online systems. This presentation describes the goals and architecture of the ACT, the web-based Human Interface and the commissioning performed before the start of collisions. It also reports on the first experience with real use in daily operations, and finally it presents the road-map for future developments.

022035

Online luminosity monitoring at the LHC is important for optimizing the luminosity at the experimental insertions and vital for good data taking of the LHC experiments. It is an important diagnostic tool, in particular for the normalization of trigger rates and data samples. The ATLAS experiment has several sub-detectors, which can be used for relative luminosity measurements. In order to collect all data from the variety of different sources and to process them centrally, the software application "Online Luminosity Calculator" has been developed.

022036

The LHCb data-flow starts with the collection of event fragments from more than 300 read-out boards at a rate of 1 MHz. These data are moved through a large switching network consisting of more than 50 routers to an event-filter farm of up to 1500 servers. Accepted events are sent through a dedicated network to storage collection nodes, which concatenate accepted events into files and transfer them to mass storage. At nominal conditions more than 30 million packets enter and leave the network every second. Precise monitoring of this data-flow, down to single packet counters, is essential to trace rare but systematic sources of data loss. We have developed a comprehensive monitoring framework which allows the data-flow to be verified at every level, using a variety of standard tools and protocols such as sFlow and SNMP, and custom software based on the LHCb Experiment Control System framework. This paper starts from an analysis of the data-flow and the hardware and software layers involved. From this analysis it derives the architecture and finally presents the implementation of this monitoring system.
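
A minimal sketch of the counter-based verification idea (stage names and counts are hypothetical; the real framework gathers such counters via sFlow, SNMP and the Experiment Control System) is:

    # Sketch: locate data loss by comparing packet counters along the data path.
    counters = [
        ("readout_boards_sent",      30_000_000),
        ("core_routers_forwarded",   30_000_000),
        ("farm_nodes_received",      29_999_997),
        ("storage_writers_written",  29_999_997),
    ]

    for (prev_name, prev_count), (name, count) in zip(counters, counters[1:]):
        lost = prev_count - count
        if lost:
            print(f"{lost} packets lost between '{prev_name}' and '{name}'")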

022037

A log is a record of a system's activity, aimed at helping a system administrator trace back an attack, find the causes of a malfunction and, more generally, troubleshoot problems. The fact that logs may be the only information an administrator has about an incident makes the logging system a crucial part of an IT infrastructure.

In large-scale infrastructures such as LHCb Online, where quite a few GB of logs are produced daily, it is impossible for a human to review all of these logs. Moreover, a great percentage of them is just "noise". This makes clear that a more automated and sophisticated approach is needed.

In this paper, we present a low-cost centralized logging system which allows us to do in-depth analysis of every log.
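
As a toy illustration of the kind of automated filtering such a system enables (the regex patterns and log lines below are invented; the actual LHCb analysis is considerably richer), suppressing known "noise" before inspection might look like:

    import re

    # Invented examples of noise patterns and log lines, for illustration only.
    NOISE = [re.compile(p) for p in (
        r"heartbeat ok",
        r"session opened for user \w+ by",
    )]

    def interesting(line: str) -> bool:
        """Keep only lines that match none of the known noise patterns."""
        return not any(p.search(line) for p in NOISE)

    logs = [
        "Mar 01 12:00:01 node42 daemon: heartbeat ok",
        "Mar 01 12:00:02 node42 sshd: Failed password for root from 10.0.0.7",
    ]
    for line in filter(interesting, logs):
        print(line)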

022038

A large-area Time-of-Flight (TOF) system based on Multi-gap Resistive Plate Chambers (MRPCs) has recently been installed in the STAR experiment at RHIC. The approximately 23000 detector channels are read out and digitized using custom electronics based on the CERN NINO and HPTDC chips. The data are sent to the experimental data acquisition system (DAQ) using the ALICE fiber optics based Detector Data Link (DDL). The readout system consists of a total of approximately 2100 custom electronics boards mounted directly on 120 TOF trays, as well as four DAQ and trigger interface boards outside the detector that collect data from 30 trays each and send it to DAQ. Control and monitoring of these electronics boards is done using a tiered network of CANbus connections to a control PC. We describe the physical implementation and topology of the CANbus connections and the custom protocol developed for this project. Several command-line tools as well as a Qt4-based graphical tool developed on the host side to facilitate configuration, control, and monitoring of the TOF system are also described.

022039

DAQ-Middleware is a software framework for network-distributed data acquisition systems. DAQ-Middleware was developed based on Robot Technology Middleware (RT-Middleware), an international standard of the Object Management Group (OMG) in robotics. OpenRTM-aist is an implementation of RT-Middleware developed by the National Institute of Advanced Industrial Science and Technology. A new implementation of DAQ-Middleware has been made based on the new OpenRTM-aist released in early 2010, and DAQ-Middleware has been improved in the process. We then measured the performance of the new DAQ-Middleware and compared it with the previous one, measuring the throughput under several conditions. We observed improved performance in the new DAQ-Middleware.

022040

The proposed method is designed for a data acquisition system acquiring data from n independent sources. The data sources are supposed to produce fragments that together constitute a logical whole. These fragments are produced with the same frequency and in the same sequence. The discussed algorithm aims to balance the data dynamically among m logically autonomous processing units (consisting of computing nodes) in case of variations in their processing power, which could be caused by faults such as failing computing nodes or broken network connections.

As a case study we consider the Data Acquisition System of the Compact Muon Solenoid Experiment at CERN's new Large Hadron Collider. The system acquires data from about 500 sources and combines them into full events. Each data source is expected to deliver event fragments of an average size of 2 kB with 100 kHz frequency.

In this paper we present the results of applying the proposed load metric and load communication pattern. Moreover, we discuss their impact on the algorithm's overall efficiency and scalability, as well as on the fault tolerance of the whole system. We also propose a general concept of an algorithm that allows the destination processing unit to be chosen asynchronously in all source nodes and ensures that all fragments of the same logical data always go to the same unit.
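
A minimal sketch of the general concept in the last paragraph (hypothetical weights and function names, not the algorithm proposed in the paper): if every source applies the same deterministic mapping from the event number to a processing unit, weighted by the units' advertised capacity, the choice is made asynchronously in each source and yet all fragments of one event end up at the same unit:

    import bisect
    import itertools

    def build_table(weights):
        """Cumulative weight table; weights reflect each unit's current processing power."""
        return list(itertools.accumulate(weights))

    def destination(event_number, cumulative):
        """Deterministic event -> unit mapping, identical in every source node."""
        slot = event_number % cumulative[-1]
        return bisect.bisect_right(cumulative, slot)

    # Example: three units, the second one degraded (e.g. a failing node).
    table = build_table([4, 1, 4])
    assert destination(12345, table) == destination(12345, table)   # same unit from any source
    print([destination(n, table) for n in range(9)])                 # [0, 0, 0, 0, 1, 2, 2, 2, 2]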

022041

The ATLAS Level-1 trigger system is responsible for reducing the anticipated LHC collision rate from 40 MHz to less than 100 kHz. This Level-1 selection counts jet, tau/hadron, electron/photon and muon candidates, with additional triggers for missing and total energy. The results are used by the Level-1 Central Trigger to form a Level-1 Accept decision. This decision, along with timing signals, is sent to the sub-detectors from the Level-1 Central Trigger, while summary information is passed to the higher levels of the trigger system. The performance of the Central Trigger during the first collisions will be shown. This includes details of how the trigger information, along with dead-time rates, is monitored and logged by the online system for physics analysis, data quality assurance and operational debugging. Also presented are the software tools used to efficiently display the relevant information in the control room in a way useful for shifters and experts.

022042

The complexity of the ATLAS experiment motivated the deployment of an integrated Access Control System in order to guarantee safe and optimal access for a large number of users to the various software and hardware resources. Such an integrated system was foreseen since the design of the infrastructure and is now central to the operations model. In order to cope with the ever-growing need to restrict access to all resources used within the experiment, the Role Based Access Control (RBAC) system previously developed has been extended and improved. The paper starts with a short presentation of the RBAC design and implementation, and of the changes made to the system to allow the management and usage of roles to control access to the vast and diverse set of resources. The RBAC implementation uses a directory service based on the Lightweight Directory Access Protocol (LDAP) to store the users (∼3000), roles (∼320), groups (∼80) and access policies. The information is kept in sync with various other databases and directory services: human resources, central CERN IT, the CERN Active Directory and the Access Control Database used by DCS. The paper concludes with a detailed description of the integration across all areas of the system.
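
As a rough sketch of how roles stored in such a directory can be looked up (the hostname, base DN and schema below are invented, and this is not the ATLAS access-management code), using the Python ldap3 client:

    from ldap3 import Server, Connection

    # Hypothetical LDAP server and directory layout, for illustration only.
    server = Server("ldap://ldap.example.cern.ch")
    conn = Connection(server, auto_bind=True)   # anonymous bind, for the example

    # Find all roles whose member list contains a given user.
    conn.search(search_base="ou=roles,dc=atlas,dc=example",
                search_filter="(member=uid=jdoe,ou=users,dc=atlas,dc=example)",
                attributes=["cn"])
    roles = {entry.cn.value for entry in conn.entries}

    # A resource is accessible only if one of the user's roles is authorized for it.
    allowed_roles = {"DAQ:RunControl": {"TDAQ:expert", "shifter:run-control"}}
    can_operate = bool(allowed_roles["DAQ:RunControl"] & roles)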