Table of contents

Volume 368

2012


14th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2011) 5–9 September 2011, Uxbridge, London, UK

Accepted papers received: 21 May 2012
Published online: 21 June 2012

Preface

011001

This volume of Journal of Physics: Conference Series is dedicated to scientific contributions presented at the 14th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2011), which took place on 5–9 September 2011 at Brunel University, UK.

The workshop series, which began in 1990 in Lyon, France, brings together computer science researchers and practitioners, and researchers from particle physics and related fields in order to explore and confront the boundaries of computing and of automatic data analysis and theoretical calculation techniques. It is a forum for the exchange of ideas among the fields, exploring and promoting cutting-edge computing, data analysis and theoretical calculation techniques in fundamental physics research.

This year's edition of the workshop brought together over 100 participants from all over the world. Fourteen invited speakers presented key topics on computing ecosystems, cloud computing, multivariate data analysis, symbolic and automatic theoretical calculations, as well as computing and data analysis challenges in astrophysics, bioinformatics and musicology. Over 80 other talks and posters presented state-of-the-art developments in the areas of the workshop's three tracks: Computing Technologies, Data Analysis Algorithms and Tools, and Computational Techniques in Theoretical Physics. Panel and round table discussions on data management and multivariate data analysis uncovered new ideas and collaboration opportunities in the respective areas.

This edition of ACAT was generously sponsored by the Science and Technology Facilities Council (STFC), the Institute for Particle Physics Phenomenology (IPPP) at Durham University, Brookhaven National Laboratory in the USA and Dell.

We would like to thank all the participants of the workshop for the high level of their scientific contributions and for their enthusiastic participation in all its activities, which were, ultimately, the key factors in the success of the workshop.

Further information on ACAT 2011 can be found at http://acat2011.cern.ch

Dr Liliana Teodorescu, Brunel University

The PDF also contains details of the workshop's committees and sponsors.

011002

All papers published in this volume of Journal of Physics: Conference Series have been peer reviewed through processes administered by the proceedings Editors. Reviews were conducted by expert referees to the professional and scientific standards expected of a proceedings journal published by IOP Publishing.

Papers

Computing technologies for physics research

012001

This paper describes a new approach to the visualization of information about the operation of the ATLAS Trigger and Data Acquisition system. ATLAS is one of the two general purpose detectors positioned along the Large Hadron Collider at CERN. Its data acquisition system consists of several thousand computers interconnected via multiple gigabit Ethernet networks that are constantly monitored via different tools. Operational parameters ranging from the temperature of the computers to the network utilization are stored in several databases for later analysis. Although the ability to view these data sets individually is already in place, there is currently no way to view them together, in a uniform format, from one location. The ADAM project has been launched to overcome this limitation. It defines a uniform web interface to collect data from multiple providers with different structures. It is capable of aggregating and correlating the data according to user-defined criteria. Finally, it visualizes the collected data using a flexible and interactive front-end web system. Structurally, the project comprises three main levels of the data collection cycle. Level 0 represents the information sources within ATLAS. These providers do not store information in a uniform fashion. The first step of the project was to define a common interface with which to expose stored data. The interface designed for the project originates from the Google Data Protocol API. The idea is to allow read-only access to data providers through HTTP requests similar in format to an SQL query. This provides a standardized way to access the different information sources within ATLAS. Level 1 can be considered the engine of the system. Its primary task is to gather data from multiple data sources via the common interface, to correlate the data together, or over a defined time series, and to expose the combined data as a whole to the Level 2 web interface. Level 2 is designed to present the data in a similar style and aesthetic, despite the different data sources. Pages can be constructed, edited and personalized by users to suit the specific data being shown, and can show a collection of graphs displaying data potentially coming from multiple sources. The project as a whole has considerable scope thanks to the uniform approach chosen for exposing data and the flexibility of Level 2 in presenting results. The paper describes in detail the design and implementation of this new tool; in particular we go through the project architecture, the implementation choices and examples of usage of the system in place within the ATLAS TDAQ infrastructure.
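As an aside, a minimal sketch of the read-only, SQL-like query style described in this abstract, assuming a hypothetical provider endpoint and hypothetical parameter names (this is not the actual ADAM API):

```python
# Minimal sketch of a read-only, SQL-like query over HTTP, in the spirit of the
# Google Data Protocol style interface described above. The endpoint and the
# parameter names (select, from, where, interval) are hypothetical illustrations.
from urllib.parse import urlencode
from urllib.request import urlopen
import json

def query_provider(base_url, table, columns, where, interval):
    """Build and issue a GET request that reads, never writes, provider data."""
    params = urlencode({
        "select": ",".join(columns),   # which attributes to return
        "from": table,                 # data source within the provider
        "where": where,                # SQL-like filter expression
        "interval": interval,          # time window of interest
    })
    with urlopen(f"{base_url}?{params}") as response:
        return json.load(response)     # providers answer in a uniform JSON layout

# Example call against a hypothetical provider URL:
# rows = query_provider("http://adam.example/api/query", "rack_temperature",
#                       ["node", "value", "timestamp"], "value > 40", "PT1H")
```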

012002

This paper describes P-BEAST, a highly scalable, highly available and durable system for archiving monitoring information of the trigger and data acquisition (TDAQ) system of the ATLAS experiment at CERN. Currently the TDAQ system consists of 20,000 applications running on 2,400 interconnected computers, and it is foreseen to grow further in the near future. P-BEAST stores considerable amounts of monitoring information which would otherwise be lost. Making this data accessible facilitates long-term analysis and faster debugging. The novelty of this research consists in using a modern key-value storage technology (Cassandra) to satisfy the massive time-series data rates, flexibility and scalability requirements entailed by the project. The loose schema allows the stored data to evolve seamlessly with the information flowing within the Information Service. An architectural overview of P-BEAST is presented alongside a discussion of the technologies considered as candidates for storing the data. The arguments which ultimately led to choosing Cassandra are explained. Measurements taken during operation in the production environment illustrate the data volume absorbed by the system and techniques for reducing the required Cassandra storage space overhead.
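A toy illustration of the bucketed time-series layout that key-value stores such as Cassandra encourage, using a plain Python dict in place of the real backend; the row-key scheme and names below are assumptions for illustration, not the actual P-BEAST schema:

```python
# Illustrative sketch only: time-series monitoring data laid out for a key-value
# store, with one row per (application, attribute, time bucket) and columns keyed
# by timestamp. A plain dict stands in for the storage backend.
from collections import defaultdict

BUCKET = 3600  # seconds per row; bounds row width and spreads load across nodes

store = defaultdict(dict)  # row key -> {timestamp: value}

def write_point(application, attribute, timestamp, value):
    row_key = (application, attribute, int(timestamp) // BUCKET)
    store[row_key][timestamp] = value          # schema-less: value types may evolve

def read_range(application, attribute, t_start, t_end):
    points = []
    for bucket in range(int(t_start) // BUCKET, int(t_end) // BUCKET + 1):
        row = store.get((application, attribute, bucket), {})
        points += [(t, v) for t, v in sorted(row.items()) if t_start <= t <= t_end]
    return points

write_point("HLT-segment-1", "eventRate", 1315224000, 95.2)
write_point("HLT-segment-1", "eventRate", 1315224010, 97.8)
print(read_range("HLT-segment-1", "eventRate", 1315224000, 1315224060))
```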

012003

We present an online measurement of the LHC beamspot parameters in ATLAS using the High Level Trigger (HLT). When a significant change is detected in the measured beamspot, it is distributed to the HLT. There, trigger algorithms like b-tagging which calculate impact parameters or decay lengths benefit from a precise, up-to-date set of beamspot parameters. Additionally, online feedback is sent to the LHC operators in real time. The measurement is performed by an algorithm running on the Level 2 trigger farm, leveraging the high rate of usable events. Dedicated algorithms perform a full scan of the silicon detector to reconstruct event vertices from registered tracks. The distribution of these vertices is aggregated across the farm and their shape is extracted through fits every 60 seconds to determine the beamspot position, size, and tilt. The reconstructed beamspot values are corrected for detector resolution effects, measured in situ using the separation of vertices whose tracks have been split into two collections. Furthermore, measurements for individual bunch crossings have allowed for studies of single-bunch distributions as well as the behavior of bunch trains. This talk will cover the constraints imposed by the online environment and describe how these measurements are accomplished with the given resources. The algorithm tasks must be completed within the time constraints of the Level 2 trigger, with limited CPU and bandwidth allocations. This places an emphasis on efficient algorithm design and the minimization of data requests.

012004

The Trigger and Data Acquisition (TDAQ) system of the ATLAS experiment at CERN is the infrastructure responsible for collecting and transferring ATLAS experimental data from the detectors to the mass storage system. It relies on a large, distributed computing environment, including thousands of computing nodes with thousands of applications running concurrently. In such a complex environment, information analysis is fundamental for controlling application behaviour, error reporting and operational monitoring. During data-taking runs, streams of messages sent by applications via the message reporting system, together with data published by applications via information services, are the main sources of knowledge about the correctness of running operations. The flow of data produced (with an average rate of O(1–10 kHz)) is constantly monitored by experts to detect problems or misbehaviour. This requires strong competence and experience in understanding and discovering problems and root causes, and often the meaningful information is not in a single message or update but in the aggregated behaviour over a certain time-line. The AAL project aims at reducing the manpower needed and at assuring a constantly high quality of problem detection by automating most of the monitoring tasks and providing real-time correlation of data-taking and system metrics. The project combines technologies from different disciplines; in particular it leverages an event-driven architecture to unify the flow of data from the ATLAS infrastructure, a Complex Event Processing (CEP) engine for the correlation of events, and a message-oriented architecture for component integration. The project is composed of two main components: a core processing engine, responsible for the correlation of events through expert-defined queries, and a web-based front-end to present real-time information and interact with the system. All components work in a loosely coupled, event-based architecture, with a message broker centralizing all communication between modules. The result is an intelligent system able to extract and compute relevant information from the flow of operational data and to provide real-time feedback to human experts, who can promptly react when needed. The paper presents the design and implementation of the AAL project, together with the results of its usage as an automated monitoring assistant for the ATLAS data-taking infrastructure.

012005

ATLAS has recorded almost 5PB of RAW data since the LHC started running at the end of 2009. Many more derived data products and complementary simulation data have also been produced by the collaboration and, in total, 70PB is currently stored in the Worldwide LHC Computing Grid by ATLAS. All of this data is managed by the ATLAS Distributed Data Management system, called Don Quixote 2 (DQ2). DQ2 has evolved rapidly to help ATLAS Computing operations manage these large quantities of data across the many grid sites at which ATLAS runs and to help ATLAS physicists get access to this data. In this paper we describe new and improved DQ2 services: popularity; space monitoring and accounting; exclusion service; cleaning agents; deletion agents. We describe the experience of data management operation in ATLAS computing, showing how these services enable management of petabyte scale computing operations. We illustrate the coupling of data management services to other parts of the ATLAS computing infrastructure, in particular showing how feedback from the distributed analysis system in ATLAS has enabled dynamic placement of the most popular data, helping users and groups to analyse the increasing data volumes on the grid.

012006

For several years the PanDA Workload Management System has been the basis for distributed production and analysis for the ATLAS experiment at the LHC. Since the start of data taking PanDA usage has ramped up steadily, typically exceeding 500k completed jobs/day by June 2011. The associated monitoring data volume has been rising as well, to levels that present a new set of challenges in the areas of database scalability and monitoring system performance and efficiency. These challenges are being met with an R&D effort aimed at implementing a scalable and efficient monitoring data storage based on a noSQL solution (Cassandra). We present our motivations for using this technology, as well as data design and the techniques used for efficient indexing of the data. We also discuss the hardware requirements as they were determined by testing with actual data and realistic loads.

012007

As cloud middleware and cloud providers have become more robust, various experiments with experience in Grid submission have begun to investigate the possibility of taking previously Grid-enabled applications and making them compatible with Cloud Computing. Successful implementation will allow for dynamic scaling of the available hardware resources, providing access to peak-load handling capabilities and possibly resulting in lower costs to the experiment. Here we discuss current work within the CMS collaboration at the LHC to perform computation on EC2, both for production and analysis use-cases. We also discuss break-even points between dedicated and cloud resources using real-world costs derived from a CMS site.

012008

A crucial component of the CMS Software is the reconstruction, which translates the signals coming from the detector's readout electronics into concrete physics objects such as leptons, photons and jets. Given its relevance for all physics analyses, the behaviour and quality of the reconstruction code must be carefully monitored. In particular, the compatibility of its outputs between subsequent releases and the impact of the usage of new algorithms must be carefully assessed. The automated procedure adopted by CMS to accomplish this ambitious task and the innovative tools developed for that purpose are presented. The whole chain of steps is illustrated, starting from the application testing over large ensembles of datasets to emulate Tier-0, Tier-1 and Tier-2 environments, to the collection of the physical quantities in the form of several hundred thousand histograms, to the estimation of their compatibility between releases, to the final production and publication of reports characterised by an efficient representation of the information.

012009

In the Grid world, there are many tools for monitoring both activities and infrastructure. The huge amount of information available needs to be well organized, especially considering the pressing need for prompt reaction in case of problems impacting the activities of a large Virtual Organization. Such activities include data taking, data reconstruction, data reprocessing and user analysis. The monitoring system for LHCb Grid Computing relies on many heterogeneous and independent sources of information. These offer different views for a better understanding of problems, while an operations team follows defined procedures that have been put in place to handle them. This work summarizes the state of the art of LHCb Grid operations, emphasizing the reasons behind the various choices and the tools currently in use to run our daily activities. We highlight the most common problems experienced across years of activity on the WLCG infrastructure, the services with their criticality, the procedures in place, the relevant metrics, the tools available and the ones still missing.

012010

The LHCb computing model was designed to support the LHCb physics program, taking into account LHCb specificities (event sizes, processing times, etc.). Within this model several key activities are defined, the most important of which are real data processing (reconstruction, stripping and streaming, group and user analysis), Monte Carlo simulation and data replication. In this contribution we detail how these activities are managed by the LHCbDIRAC Data Transformation System. The LHCbDIRAC Data Transformation System leverages the workload and data management capabilities provided by DIRAC, a generic community grid solution, to support data-driven workflows (or DAGs). The ability to combine workload and data tasks within a single DAG makes it possible to create highly sophisticated workflows, with the individual steps linked by the availability of data. This approach also provides the advantage of a single point at which all activities can be monitored and controlled. While several interfaces are currently supported (including a Python API and a CLI), we present the ability to create LHCb workflows through a secure web interface, to control their state, and to create and submit jobs. To highlight the versatility of the system we present in more detail experience with real data from the 2010 and 2011 LHC runs.

012011

The Virtual Machine framework was used to assemble the STAR computing environment, validated once, deployed on over 100 8-core VMs at NERSC and Argonne National Lab, and used as a homogeneous Virtual Farm processing events acquired in real time by the STAR detector located at Brookhaven National Lab. To provide time-dependent calibration, a database snapshot scheme was devised. Two high-capacity filesystems, located on opposite coasts of the US and interconnected via the Globus Online protocol, were used in this setup, which resulted in a highly scalable Cloud-based extension of STAR computing resources. The system was in continuous operation for over three months.

012012

With the Job Execution Monitor, a user-centric job monitoring software developed at the University of Wuppertal and integrated into the job brokerage systems of the WLCG, job progress and grid worker node health can be supervised in real time. Imminent error conditions can thus be detected early by the submitter and countermeasures can be taken. Grid site admins can access aggregated data of all monitored jobs to infer the site status and to detect job misbehaviour. To remove the last "blind spot" from this monitoring, a remote debugging technique based on the GNU C compiler suite was developed and integrated into the software; its design concept and architecture is described in this paper and its application discussed.

012013

The Worldwide LHC Computing Grid is a global infrastructure set up to process the experimental data from the experiments at the Large Hadron Collider located at CERN. The UK component is provided by the GridPP project across 19 sites at universities and Rutherford Lab. Ensuring that these large computational resources are available and reliable requires many different monitoring systems, ranging from local site monitoring of individual components, through UK-wide monitoring of Grid functionality, to the worldwide monitoring of resource provision and usage. In this paper we describe the monitoring systems used for the many different aspects of the system, and how some of them are being integrated together.

012014

We have developed an interface within the ALICE analysis framework that allows transparent usage of the experiment's distributed resources. This analysis plug-in makes it possible to configure back-end specific parameters from a single interface and to run with no change the same custom user analysis in many computing environments, from local workstations to PROOF clusters or GRID resources. The tool is used now extensively in the ALICE collaboration for both end-user analysis and large scale productions.

012015

When ultra-high energy cosmic rays enter the atmosphere they interact producing extensive air showers (EAS) which are the objects studied by the Pierre Auger Observatory. The number of particles involved in an EAS at these energies is of the order of billions and the generation of a single simulated EAS requires many hours of computing time with current processors. In addition, the storage space consumed by the output of one simulated EAS is very high. Therefore we have to make use of Grid resources to be able to generate sufficient quantities of showers for our physics studies in reasonable time periods. We have developed a set of highly automated scripts written in common software scripting languages in order to deal with the high number of jobs which we have to submit regularly to the Grid. In spite of the low number of sites supporting our Virtual Organization (VO) we have reached the top spot on CPU consumption among non LHC (Large Hadron Collider) VOs within EGI (European Grid Infrastructure).

012016

Following a previous publication [1], this study aims at investigating the impact of the regional affiliations of centres on the organisation of collaboration within the ALICE Distributed Computing infrastructure, based on social network methods. A self-administered questionnaire was sent to all centre managers about support, email interactions and wished-for collaborations in the infrastructure. Several additional measures stemming from technical observations, such as bandwidth, data transfers and Internet Round Trip Time (RTT), were also included. Information for 50 centres was considered (60% response rate). The empirical analysis shows that, despite the centralisation on CERN, the network is highly organised by regions. The results are discussed in the light of policy and efficiency issues.

012017

Current High Energy and Nuclear Physics (HENP) libraries and frameworks were written before multicore systems became widely deployed and used. From this environment, a 'single-thread' processing model naturally emerged, but the implicit assumptions it encouraged are greatly impairing our ability to scale in a multicore/manycore world. While parallel programming - still in an intensive phase of R&D despite the 30+ years of literature on the subject - is an obvious topic to consider, other issues (build scalability, code clarity, code deployment and ease of coding) are worth investigating when preparing for the manycore era. Moreover, if one wants to use a language other than C++, one better prepared and tailored for expressing concurrency, one also needs to ensure a good and easy reuse of already field-proven libraries. We present the work resulting from such investigations applied to the Go programming language. We first introduce the concurrent programming facilities Go provides and how its module system addresses the build scalability and dependency hell issues. We then describe the process of leveraging the many (wo)man-years put into scientific Fortran/C/C++ libraries and making them available to the Go ecosystem. The ROOT data analysis framework, the C-BLAS library and the Herwig-6 Monte Carlo generator are taken as examples. Finally, the performance of the tools involved in a small analysis written in Go and using the ROOT I/O library is presented.

012018

The shared memory architecture of multicore CPUs provides HEP developers with the opportunity to reduce the memory footprint of their applications by sharing memory pages between the cores in a processor. ATLAS pioneered the multi-process approach to parallelize HEP applications. Using Linux fork() and the Copy On Write mechanism we implemented a simple event task farm, which allowed us to achieve sharing of almost 80% of memory pages among event worker processes for certain types of reconstruction jobs with negligible CPU overhead. By leaving the task of managing shared memory pages to the operating system, we have been able to parallelize large reconstruction and simulation applications originally written to be run in a single thread of execution with little to no change to the application code. The process of validating AthenaMP for production took ten months of concentrated effort and is expected to continue for several more months. Besides validating the software itself, an important and time-consuming aspect of running multicore applications in production was to configure the ATLAS distributed production system to handle multicore jobs. This entailed defining multicore batch queues, where the unit resource is not a core, but a whole computing node; monitoring the output of many event workers; and adapting the job definition layer to handle computing resources with different event throughputs. We will present scalability and memory usage studies, based on data gathered both on dedicated hardware and at the CERN Computer Center.
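A sketch of the fork-after-initialisation pattern the abstract describes, under the assumption that a plain Python translation is acceptable for illustration (CPython's reference counting writes to pages and therefore shares less than a C++ framework such as AthenaMP does; all names here are hypothetical):

```python
# Fork-after-initialisation sketch: the parent performs the expensive setup once,
# then forks event workers that see those memory pages copy-on-write.
# Unix only (os.fork); this only illustrates the pattern, not the ATLAS code.
import os

def initialise():
    # stands in for geometry, conditions data, magnetic field map, ...
    return list(range(5_000_000))

def process_events(worker_id, events, shared_state):
    for _ in events:
        pass  # reconstruct/simulate each event using the shared, read-only state
    print(f"worker {worker_id} processed {len(events)} events")

def main(n_workers=4, n_events=100):
    shared_state = initialise()                  # done once, before forking
    events = list(range(n_events))
    for w in range(n_workers):
        if os.fork() == 0:                       # child: copy-on-write view of parent pages
            process_events(w, events[w::n_workers], shared_state)
            os._exit(0)
    for _ in range(n_workers):
        os.wait()                                # parent collects the workers

if __name__ == "__main__":
    main()
```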

012019

In order to optimize the use and management of computing centres, their conversion to cloud facilities is becoming increasingly popular. In a medium to large cloud facility, many different virtual clusters may compete for the same resources: unused resources can be freed either by turning off idle virtual machines, or by lowering the resources assigned to a virtual machine at runtime. PROOF, a ROOT-based parallel and interactive analysis framework, is officially endorsed in the computing model of the ALICE experiment as complementary to the Grid, and it has become very popular over the last three years. The locality of PROOF-based analysis facilities forces system administrators to scavenge resources, yet the chaotic nature of user analysis tasks makes their usage unstable and inconstant, making PROOF a typical use-case for HPC cloud computing. Currently, PoD dynamically and easily provides a PROOF-enabled cluster by submitting agents to a job scheduler. Unfortunately, a Tier-2 cannot comfortably share the same queue between interactive and batch jobs, due to the very large average time to completion of the latter: an elastic cloud approach would enable interactive virtual machines to temporarily take resources from the batch ones, without a noticeable impact on them. In this work we describe our setup of a dynamic PROOF-based cloud analysis facility based on PoD and OpenNebula, orchestrated by a simple and lightweight control daemon that makes virtualization transparent for the user.

012020

The PROOF benchmark suite is a new utility suite for measuring the performance and scalability of PROOF. Its primary goal is to determine optimal configuration parameters for a set of machines to be used as a PROOF cluster. The suite measures the performance of the cluster for a set of standard tasks as a function of the number of effective processes. Cluster administrators can use the suite to measure the performance of the cluster and find optimal configuration parameters. PROOF developers can also use the suite to measure performance, identify problems and improve their software. In this paper, the new tool is explained in detail and use cases are presented to illustrate it.

012021

Traditional relational databases have not always been well matched to the needs of data-intensive sciences, but efforts are underway within the database community to attempt to address many of the requirements of large-scale scientific data management. One such effort is the open-source project SciDB. Since its earliest incarnations, SciDB has been designed for scalability in parallel and distributed environments, with a particular emphasis upon native support for array constructs and operations. Such scalability is of course a requirement of any strategy for large-scale scientific data handling, and array constructs are certainly useful in many contexts, but these features alone do not suffice to qualify a database product as an appropriate technology for hosting particle physics or cosmology data. In what constitutes its 1.0 release in June 2011, SciDB has extended its feature set to address additional requirements of scientific data, with support for user-defined types and functions, for data versioning, and more. This paper describes an evaluation of the capabilities of SciDB for two very different kinds of physics data: event-level metadata records from proton collisions at the Large Hadron Collider (LHC), and the output of cosmological simulations run on very-large-scale supercomputers. This evaluation exercises the spectrum of SciDB capabilities in a suite of tests that aim to be representative and realistic, including, for example, definition of four-vector data types and natural operations thereon, and computational queries that match the natural use cases for these data.

012022

Massive data processing in a multi-collaboration environment with geographically spread, diverse facilities will hardly be "fair" to users, or use network bandwidth efficiently, unless planning and reasoning about data movement and placement are addressed. Coordinated data resource sharing and efficient plans that solve the data transfer problem in a dynamic way are increasingly required. We present work whose purpose is to design and develop an automated planning system acting as a centralized decision-making component, with emphasis on optimization, coordination and load-balancing.

We describe the most important optimization characteristics and a modeling approach based on "constraints". The constraint-based approach allows for a natural declarative formulation of what must be satisfied, without expressing how. The architecture of the system, the communication between components and the execution of the plan by the underlying data transfer tools are shown. We emphasize the separation of the planner from the "executors" and explain how to keep the proper balance between being deliberative and reactive. An extension of the model covering the full coupling with, and reasoning about, computing resources is also shown.

The system has been deployed within the STAR experiment over several Tier sites and has been used for data movement in support of user analyses and production processing. We present several real use-case scenarios and the performance of the system, with a comparison to the "traditional" methods solved by hand. The benefits in terms of shorter data delivery times, obtained by leveraging available network paths and intermediate caches, are shown. Finally, we outline several possible enhancements and avenues for future work.

012023

We describe parallel implementations of an algorithm used to evaluate the likelihood function used in data analysis. The implementations run, respectively, on CPU, on GPU, and on both devices cooperatively (hybrid). The CPU and GPU implementations are based on OpenMP and OpenCL, respectively. The hybrid implementation allows the application to run also on multi-GPU systems (not necessarily of the same type). The hybrid case uses a scheduler so that the workload needed for the evaluation of the function is split and balanced into corresponding sub-workloads executed in parallel on each device, i.e. CPU and GPU, or multiple GPUs. We present the results of the scalability when running on CPU. Then we show the comparison of the performance of the GPU implementation on different hardware systems from different vendors, and the performance when running in the hybrid case. The tests are based on likelihood functions from real data analyses carried out in the high energy physics community.
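A minimal sketch of the workload-splitting idea for a negative log-likelihood sum, here across CPU processes only and with a toy Gaussian model; the paper's implementation uses OpenMP and OpenCL rather than this Python stand-in:

```python
# Split a negative log-likelihood sum into sub-workloads evaluated in parallel,
# in the spirit of the hybrid scheduler described above (CPU processes only here).
import numpy as np
from multiprocessing import Pool

def partial_nll(args):
    data_chunk, mu, sigma = args
    # Gaussian model purely for illustration; a real analysis uses the full PDF
    return 0.5 * np.sum(((data_chunk - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma**2))

def nll(data, mu, sigma, n_workers=4):
    chunks = np.array_split(data, n_workers)     # balance the workload across devices
    with Pool(n_workers) as pool:
        return sum(pool.map(partial_nll, [(c, mu, sigma) for c in chunks]))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.normal(0.5, 1.2, size=1_000_000)
    print(nll(data, 0.5, 1.2))
```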

012024

A hybrid approach based on the combination of three Tausworthe generators and one linear congruential generator for pseudo-random number generation for GPU programming, as suggested in the NVIDIA CUDA library, has been used for Monte Carlo sampling. On each GPU thread, a random seed is generated on the fly in a simple way using the quick-and-dirty algorithm, where the mod operation is not performed explicitly because unsigned integer overflow provides it implicitly. Using this hybrid generator, multivariate correlated sampling based on the alias technique has been carried out using both the CUDA and OpenCL languages.
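For reference, a CPU-side Python transcription of a combined three-Tausworthe-plus-LCG generator of the kind referred to above; the parameter values follow the commonly published hybrid formulation and are assumptions here, and on a GPU each thread would hold its own four-word state:

```python
# Combined three-Tausworthe + LCG generator (CPU-side sketch with explicit
# 32-bit masking standing in for unsigned integer overflow on the GPU).
MASK = 0xFFFFFFFF

def taus_step(z, s1, s2, s3, m):
    b = (((z << s1) & MASK) ^ z) >> s2
    return (((z & m) << s3) & MASK) ^ b

def lcg_step(z, a=1664525, c=1013904223):
    return (a * z + c) & MASK          # "quick and dirty" LCG, mod 2^32 via overflow

def hybrid_taus(state):
    z1, z2, z3, z4 = state
    z1 = taus_step(z1, 13, 19, 12, 0xFFFFFFFE)
    z2 = taus_step(z2, 2, 25, 4, 0xFFFFFFF8)
    z3 = taus_step(z3, 3, 11, 17, 0xFFFFFFF0)
    z4 = lcg_step(z4)
    state[:] = [z1, z2, z3, z4]
    return ((z1 ^ z2 ^ z3 ^ z4) & MASK) * 2.3283064365386963e-10  # uniform in [0, 1)

state = [129281, 362436069, 123456789, 987654321]   # Tausworthe seeds must exceed 128
print([round(hybrid_taus(state), 6) for _ in range(5)])
```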

012025

In-line holographic imaging is used for small particulates, such as cloud or spray droplets, marine plankton, and alluvial sediments, and enables a true 3D object field to be recorded at high resolution over a considerable depth. To reconstruct a digital hologram a 2D FFT must be calculated for every depth slice desired in the replayed image volume. A typical in-line hologram of ∼100 micrometre-sized particles over a depth of a few hundred millimetres will require O(1000) 2D FFT operations to be performed on a hologram of typically a few million pixels. In previous work we have reported on our experiences with reconstruction on a computational grid. In this paper we discuss the technical challenges in making efficient use of the NVIDIA Tesla and Fermi GPU systems and show how our reconstruction code was optimised for near real-time video slice reconstruction with holograms as large as 4K by 4K pixels. We also consider the implications for grid and cloud computing approaches to hologram replay, and the extent to which a GPU can replace these approaches, when the important step of locating focussed objects within a reconstructed volume is included.
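An illustrative numpy version of the per-slice replay described above, with one forward 2D FFT of the hologram and one inverse 2D FFT per depth using an angular-spectrum kernel; the wavelength, pixel pitch and depths are example values, not those of the paper:

```python
# One forward 2D FFT of the hologram, then one inverse 2D FFT per requested depth,
# using an angular-spectrum propagation kernel (evanescent components dropped).
import numpy as np

def reconstruct_slices(hologram, wavelength, pixel_pitch, depths):
    ny, nx = hologram.shape
    fx = np.fft.fftfreq(nx, d=pixel_pitch)
    fy = np.fft.fftfreq(ny, d=pixel_pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.clip(arg, 0.0, None))
    H = np.fft.fft2(hologram)                              # computed once
    for z in depths:
        yield np.abs(np.fft.ifft2(H * np.exp(1j * kz * z)))  # refocused slice intensity

holo = np.random.rand(512, 512)                            # stand-in for a recorded hologram
slices = list(reconstruct_slices(holo, 532e-9, 7.4e-6, np.arange(0.05, 0.30, 0.05)))
print(len(slices), slices[0].shape)
```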

012026

Data from high-energy physics experiments are collected with significant financial and human effort and are mostly unique. However, until recently no coherent strategy existed for data preservation and re-use, and many important and complex data sets have simply been lost. While the current focus is on the LHC at CERN, in the current period several important and unique experimental programs at other facilities are coming to an end, including those at HERA, b-factories and the Tevatron. To address this issue, an inter-experimental study group on HEP data preservation and long-term analysis (DPHEP) was convened at the end of 2008. The group now aims to publish a full and detailed review of the present status of data preservation in high energy physics. This contribution summarises the results of the DPHEP study group, describing the challenges of data preservation in high energy physics and the group's first conclusions and recommendations. The physics motivation for data preservation, generic computing and preservation models, technological expectations and governance aspects at local and international levels are examined.

012027

Preserving data from past experiments and preserving the ability to perform analysis with old data is of growing importance in many domains of science, including High Energy Physics (HEP). A study group on this issue, DPHEP, has been established in this field to provide guidelines and a structure for international collaboration on data preservation projects in HEP. This contribution presents a framework that allows experimentalists to validate their software against a previously defined set of tests in an automated way. The framework has been designed with a special focus on longevity, as it makes use of open protocols, has a modular design and is based on simple communication mechanisms. On the fabrics side, tests are carried out in a virtual environment using a cloud infrastructure. Within the framework, it is easy to run validation tests on different hardware platforms, or on different major or minor versions of operating systems. Experts from IT or the experiments can automatically detect failures in the test procedure with the help of reporting tools. Hence, appropriate actions can be taken in a timely manner. The design and important implementation aspects of the framework are shown, and first experiences from early-bird users are presented.

Data analysis – algorithms and tools

012028

Multivariate analysis (MVA) methods, especially discrimination techniques such as neural networks, are key ingredients in modern data analysis and play an important role in high energy physics. They are usually trained on simulated Monte Carlo (MC) samples to discriminate so-called "signal" from "background" events and are then applied to data to select real events of signal type. We here address procedures that improve this workflow: first, the enhancement of data/MC agreement by reweighting MC samples on a per-event basis; then, the training of MVAs on real data using the sPlot technique; finally, the construction of MVAs whose discriminator is independent of a certain control variable, i.e. cuts on this variable will not change the discriminator shape.
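A minimal sketch of per-event reweighting from a binned data/MC ratio of a control variable, as a generic illustration of the first procedure mentioned; it is not the specific reweighting scheme of the paper:

```python
# Derive data/MC weights from the binned ratio of a control variable and attach
# them to the MC events that would be used for MVA training.
import numpy as np

def ratio_weights(mc_values, data_values, bins):
    mc_hist, edges = np.histogram(mc_values, bins=bins, density=True)
    data_hist, _ = np.histogram(data_values, bins=edges, density=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(mc_hist > 0, data_hist / mc_hist, 1.0)
    idx = np.clip(np.digitize(mc_values, edges) - 1, 0, len(ratio) - 1)
    return ratio[idx]                     # one weight per MC event

rng = np.random.default_rng(0)
mc = rng.normal(0.0, 1.0, 100_000)        # simulated control variable
data = rng.normal(0.1, 1.1, 100_000)      # observed control variable
w = ratio_weights(mc, data, bins=50)
print(w.mean(), w.min(), w.max())
```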

012029

Tau leptons play an important role in the physics program of the LHC. They are being used in electroweak measurements, in detector related studies and in searches for new phenomena like the Higgs boson or Supersymmetry. In the detector, tau leptons are reconstructed as collimated jets with low track multiplicity. Due to the background from QCD multijet processes, efficient tau identification techniques with large fake rejection are essential. Since single variable criteria are not enough to efficiently separate them from jets and electrons, modern multivariate techniques are used. In ATLAS, several advanced algorithms are applied to identify taus, including a projective likelihood estimator and boosted decision trees. All multivariate methods applied to the ATLAS simulated data perform better than the baseline cut analysis. Their performance is shown using high energy data collected at the ATLAS experiment. The improvement ranges from a factor of 2 to 5 in rejection for the same efficiency, depending on the selected efficiency operating point and the number of prongs in the tau decay. The strengths and weaknesses of each technique are also discussed.

012030

This paper presents the latest results from the Ringer algorithm, which is based on artificial neural networks for the electron identification at the online filtering system of the ATLAS particle detector, in the context of the LHC experiment at CERN. The algorithm performs topological feature extraction using the ATLAS calorimetry information (energy measurements). The extracted information is presented to a neural network classifier. Studies showed that the Ringer algorithm achieves high detection efficiency, while keeping the false alarm rate low. Optimizations, guided by detailed analysis, reduced the algorithm execution time by 59%. Also, the total memory necessary to store the Ringer algorithm information represents less than 6.2 percent of the total filtering system amount.

012031

Background properties in experimental particle physics are typically estimated from control samples corresponding to large numbers of events. This can provide precise knowledge of average background distributions, but typically does not take into account statistical fluctuations in a data set of interest. A novel approach based on mixture model decomposition is presented, as a way to extract additional information about statistical fluctuations from a given data set with a view to improving on knowledge of background distributions obtained from control samples. Events are treated as heterogeneous populations comprising particles originating from different processes, and individual particles are mapped to a process of interest on a probabilistic basis. The proposed approach makes it possible to estimate features of the background distributions from the data, and to extract information about statistical fluctuations that would otherwise be lost using traditional supervised classifiers trained on high-statistics control samples. A feasibility study on Monte Carlo is presented, together with a comparison with existing techniques. Finally, the prospects for the development of tools for intensive offline analysis of individual interesting events at the Large Hadron Collider are discussed.

012032

Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors should the training data be systematically inaccurate, for example due to the assumed MC model. To complement such model-dependent searches, we propose an algorithm based on semi-supervised anomaly detection techniques, which does not require a MC training sample for the signal data. We first model the background using a multivariate Gaussian mixture model. We then search for deviations from this model by fitting to the observations a mixture of the background model and a number of additional Gaussians. This allows us to perform pattern recognition of any anomalous excess over the background. We show by a comparison to neural network classifiers that such an approach is considerably more robust against misspecification of the signal MC than supervised classification. In cases where there is an unexpected signal, a neural network might fail to correctly identify it, while anomaly detection does not suffer from such a limitation. On the other hand, when there are no systematic errors in the training data, both methods perform comparably.
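A simplified sketch of the background-model-plus-extra-component idea using scikit-learn Gaussian mixtures; unlike the method described above, the background parameters are refitted freely here rather than held fixed, and all numbers are toy values:

```python
# Model the background alone with a Gaussian mixture, then refit the observed
# sample with extra components and compare how well each model describes the data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
background = rng.normal(0, 1, size=(20000, 2))           # background-only control sample
signal = rng.normal(3, 0.3, size=(500, 2))               # unexpected localized excess
observed = np.vstack([rng.normal(0, 1, size=(5000, 2)), signal])

bkg_model = GaussianMixture(n_components=3, random_state=0).fit(background)

# Refit with one additional component free to absorb any anomalous excess
combined = GaussianMixture(n_components=4, random_state=0).fit(observed)

print("background-only mean log-likelihood:", bkg_model.score(observed))
print("background+extra mean log-likelihood:", combined.score(observed))
```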

012033

The ATLAS hadronic tau trigger plays an important role in many analyses, among them searches for H0, H±, W', and Z' in the tau decay channel. In order to achieve the needed sensitivity in these measurements it is important to reduce the QCD background while keeping the signal efficiency high. Furthermore it is important to understand the trigger efficiency in real data. This paper summarizes the performance of the tau trigger in data collected by the ATLAS detector in 2011.

012034

A sophisticated trigger system, capable of real-time track reconstruction, is used in the ATLAS experiment to select interesting events in the proton-proton collisions at the Large Hadron Collider at CERN. A set of b-jet triggers was activated in ATLAS for the entire 2011 data-taking campaign and successfully selected events enriched in jets arising from heavy-flavour quarks. Such triggers were demonstrated to be crucial for the selection of events with no lepton signature and a large jet multiplicity. An overview of the track reconstruction and online b-jet selection with performance estimates from data is presented in these proceedings.

012035

In order to reach the track parameter accuracy motivated by the physics goals of the experiment, the ATLAS tracking system needs to determine accurately its almost 700,000 degrees of freedom. The demanded precision for the alignment of the silicon sensors is below 10 μm. The implementation of the track-based alignment within the ATLAS software framework unifies different alignment approaches and allows the alignment of all tracking subsystems together. The alignment software relies on the tracking information (track-hit residuals) but also includes the capability to set constraints on the beam-spot and primary vertex as well as on the momentum measured by the Muon System or on E/p using the calorimetry information. The alignment chain starts at the trigger level, where a stream of high-pT isolated tracks is selected online. A cosmic-ray trigger is also enabled while ATLAS is recording collision data, so that a stream of cosmic-ray tracks is recorded with exactly the same detector operating conditions as the normal collision tracks. We present results of the alignment of the ATLAS tracker using the 2011 collision data. The validation of the alignment is performed using track-hit residuals as well as more advanced physics observables. The results of the alignment with real data reveal that the attained precision for the alignment parameters is approximately 5 μm.

012036

The CMS all-silicon tracker consists of 16 588 modules with 25 684 sensors in total. In 2010 it was successfully aligned using tracks from cosmic rays and proton-proton collisions, following the time-dependent movements of its innermost pixel layers. In 2011, ultimate local precision is achieved by determining sensor curvatures in addition to module shifts and rotations, challenging the alignment procedure to determine about 200 000 parameters. This is achieved in a global fit approach using Millepede II with the General Broken Lines track model. Remaining alignment uncertainties are dominated by systematic effects that bias track parameters by an amount relevant for physics analyses. These effects are controlled by including information about the Z boson mass in the fit.

012037

The Tile Barrel Calorimeter (TileCal) is the central section of the hadronic calorimeter of ATLAS. It is a key detector for the reconstruction of hadrons, jets, taus and missing transverse energy, and it assists the muon measurements due to a low signal-to-noise ratio. The energy deposited in each cell is read out by two electronic channels for redundancy and is estimated by reconstructing the amplitude of the digitized signal pulse sampled every 25 ns. This work presents an alternative approach for TileCal signal detection and amplitude estimation under low signal-to-noise ratio (SNR) conditions, exploring the applicability of a Matched Filter. The proposed method is compared to the Optimal Filter algorithm, which is currently used at TileCal for energy reconstruction. The results for a simulated data set showed that, for conditions where the signal pedestal can be considered stationary, the proposed method achieves better SNR performance than the Optimal Filter technique.
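A small numerical sketch of matched-filter amplitude estimation for a sampled pulse, assuming a known normalised pulse shape and noise covariance and a pedestal-subtracted readout; the shape and noise values below are illustrative, not TileCal constants:

```python
# Matched-filter amplitude estimate for a sampled pulse:
# A = s^T C^{-1} x / (s^T C^{-1} s), which maximises the SNR for stationary noise.
import numpy as np

def matched_filter_amplitude(samples, shape, noise_cov):
    w = np.linalg.solve(noise_cov, shape)         # C^{-1} s
    return float(w @ samples) / float(w @ shape)

shape = np.array([0.0, 0.12, 0.55, 1.00, 0.63, 0.23, 0.08])  # normalised pulse, 25 ns spacing
true_amplitude = 40.0
rng = np.random.default_rng(3)
noise_cov = np.eye(len(shape)) * 1.5**2                      # white noise for simplicity
samples = true_amplitude * shape + rng.multivariate_normal(np.zeros(len(shape)), noise_cov)
print(matched_filter_amplitude(samples, shape, noise_cov))
```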

012038

The concept of "particle flow" has been developed to optimise the jet energy resolution by distinguishing the different jet components. A highly granular calorimeter designed for the particle flow algorithm provides an unprecedented level of detail for the reconstruction of calorimeter showers and enables new approaches to shower analysis. In this paper the measurement and use of the fractal dimension of showers is described. The fractal dimension is a characteristic number that measures the global compactness of the shower. It is highly dependent on the primary particle type and energy. Its application in identifying particles and estimating their energy is described in the context of a calorimeter designed for the International Linear Collider.
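A box-counting sketch of a shower compactness ("fractal dimension") measure: count occupied cells at increasingly coarse regroupings of the hit pattern and fit the slope of log N against log(1/scale). This is a generic illustration, not the paper's exact definition:

```python
# Box-counting estimate of a shower fractal dimension from a boolean 3D hit map.
import numpy as np

def fractal_dimension(hits, scales=(1, 2, 4, 8)):
    """hits: boolean 3D array of fired calorimeter cells."""
    counts = []
    for s in scales:
        nz, ny, nx = (dim // s for dim in hits.shape)
        coarse = hits[:nz*s, :ny*s, :nx*s].reshape(nz, s, ny, s, nx, s).any(axis=(1, 3, 5))
        counts.append(coarse.sum())                 # occupied cells at this granularity
    slope, _ = np.polyfit(np.log(1.0 / np.array(scales)), np.log(counts), 1)
    return slope

rng = np.random.default_rng(4)
shower = rng.random((48, 32, 32)) < 0.02             # sparse stand-in for a shower hit map
print(fractal_dimension(shower))
```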

012039

Visual Physics Analysis (VISPA) is an analysis environment with applications in high energy and astroparticle physics. Based on a data-flow-driven paradigm, it allows users to combine graphical steering with self-written C++ and Python modules. This contribution presents new concepts integrated in VISPA: layers, convenient analysis execution, and web-based physics analysis. While the convenient execution offers full flexibility to vary settings for the execution phase of an analysis, layers allow users to create different views of the analysis already during its design phase. Thus, one application of layers is to define different stages of an analysis (e.g. event selection and statistical analysis). However, there are other use cases, such as independently optimizing settings for different types of input data in order to guide all data through the same analysis flow. The new execution feature makes job submission to local clusters as well as to the LHC Computing Grid possible directly from VISPA. Web-based physics analysis is realized in the VISPA@Web project, which represents a whole new way to design and execute analyses via a standard web browser.

012040

Based on the ROOT TEve/TGeo classes and the standard linear collider data structure, a dedicated linear collider event display has been developed. It supports the latest detector models for both International Linear Collider and Compact Linear Collider as well as the CALICE test beam prototypes. It can be used to visualise event information at the generation, simulation and reconstruction levels. Many options are provided in an intuitive interface. It has been heavily employed in a variety of analyses.

012041

This paper introduces a probability density estimator based on Green's function identities. A density model is constructed under the sole assumption that the probability density is differentiable. The method is implemented as a binary likelihood estimator for classification purposes, so issues such as mis-modeling and overtraining are also discussed. The identity behind the density estimator can be interpreted as a real-valued, non-scalar kernel method which is able to reconstruct differentiable density functions.

012042

We present a new approach to simulate Beyond-Standard-Model (BSM) processes which are defined by multiple parameters. In contrast to the traditional grid-scan method where a large number of events are simulated at each point of a sparse grid in the parameter space, this new approach simulates only a few events at each of a selected number of points distributed randomly over the whole parameter space. In subsequent analysis, we rely on the fitting by the Bayesian Neural Network (BNN) technique to obtain accurate estimation of the acceptance distribution. With this new approach, the signal yield can be estimated continuously, while the required number of simulation events is greatly reduced.

012043

A frequently faced task in experimental physics is to measure the probability distribution of some quantity. Often this quantity is smeared by a non-ideal detector response or by some physical process. The procedure of removing this smearing effect from the measured distribution is called unfolding, and is a delicate problem in signal processing due to the well-known numerical ill-behaviour of this task. Various methods have been invented which, given some assumptions on the initial probability distribution, try to regularize the unfolding problem. Most of these methods introduce a definite bias into the estimate of the initial probability distribution. We propose a linear iterative method (motivated by the Neumann series / Landweber iteration known in functional analysis), which has the advantage that no assumptions on the initial probability distribution are needed, and the only regularization parameter is the stopping order of the iteration, which can be used to choose the best compromise between the introduced bias and the propagated statistical and systematic errors. The method is consistent: "binwise" convergence to the initial probability distribution is proved in the absence of measurement errors under a quite general condition on the response function. This condition holds for practical applications such as convolutions, calorimeter response functions, momentum reconstruction response functions based on tracking in a magnetic field, etc. In the presence of measurement errors, explicit formulae for the propagation of the three important error terms are provided: the bias error (the distance from the unknown, to-be-reconstructed initial distribution at a finite iteration order), the statistical error, and the systematic error. A trade-off between these three error terms can be used to define an optimal iteration stopping criterion, and the errors can be estimated there. We provide a numerical C library implementing the method, which incorporates automatic statistical error propagation as well. The proposed method is also discussed in the context of other known approaches.
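A toy numerical sketch of a linear iterative (Landweber/Neumann-series type) unfolding, with the iteration order as the only regularisation parameter; the response matrix below is an invented example, and the paper's C library is not used:

```python
# Starting from the measured spectrum, repeatedly add the residual of the folded
# estimate; the stopping order controls the bias/variance trade-off.
import numpy as np

def iterative_unfold(measured, response, n_iterations):
    x = measured.copy()                      # first approximation: the measured spectrum
    for _ in range(n_iterations):
        x = x + (measured - response @ x)    # Neumann-series update x_{k+1} = x_k + (y - R x_k)
    return x

n = 20
true = np.exp(-np.arange(n) / 5.0)
# toy smearing response: mild migration to neighbouring bins
response = 0.7 * np.eye(n) + 0.15 * np.eye(n, k=1) + 0.15 * np.eye(n, k=-1)
measured = response @ true
for order in (1, 5, 20):
    print(order, np.abs(iterative_unfold(measured, response, order) - true).sum())
```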

012044

Adaptive Metropolis (AM) is a powerful recent algorithmic tool in numerical Bayesian data analysis. AM builds on a well-known Markov Chain Monte Carlo algorithm but optimizes the rate of convergence to the target distribution by automatically tuning the design parameters of the algorithm on the fly. Label switching is a major problem in inference on mixture models because of the invariance to symmetries. The simplest (non-adaptive) solution is to modify the prior in order to make it select a single permutation of the variables, introducing an identifiability constraint. This solution is known to cause artificial biases by not respecting the topology of the posterior. In this paper we describe an online relabeling procedure which can be incorporated into the AM algorithm. We give elements of convergence of the algorithm and identify the link between its modified target measure and the original posterior distribution of interest. We illustrate the algorithm on a synthetic mixture model inspired by the muonic water Cherenkov signal of the surface detectors in the Pierre Auger Experiment.
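A compact sketch of the Adaptive Metropolis idea (proposal covariance tuned on the fly from the chain history, following the usual Haario-type scaling); the online relabeling step that is the subject of the paper is not reproduced here, and the target below is a toy distribution:

```python
# Random-walk Metropolis whose proposal covariance adapts to the chain history.
import numpy as np

def adaptive_metropolis(log_post, x0, n_steps, eps=1e-6, sd=None):
    rng = np.random.default_rng(5)
    d = len(x0)
    sd = sd if sd is not None else 2.4**2 / d        # Haario et al. scaling
    chain = [np.asarray(x0, dtype=float)]
    cov = np.eye(d)
    for step in range(1, n_steps):
        if step > 100:                               # start adapting after a short burn-in
            cov = np.cov(np.array(chain).T) + eps * np.eye(d)
        prop = rng.multivariate_normal(chain[-1], sd * cov)
        if np.log(rng.random()) < log_post(prop) - log_post(chain[-1]):
            chain.append(prop)                       # accept
        else:
            chain.append(chain[-1].copy())           # reject, repeat current point
    return np.array(chain)

# Toy target: a correlated 2D Gaussian
prec = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))
log_post = lambda x: -0.5 * x @ prec @ x
samples = adaptive_metropolis(log_post, [3.0, -3.0], 5000)
print(samples.mean(axis=0), np.cov(samples.T))
```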

012045

Monte Carlo sampling of two-dimensional correlated variables (with non-zero covariance) has been carried out using an extended alias technique, originally proposed by A. J. Walker to sample from a one-dimensional distribution. Although the method has been applied here to a correlated two-dimensional Gaussian data sample, it is quite general and can easily be extended for sampling from a multidimensional correlated data sample of any arbitrary distribution.
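A sketch of Walker's alias method, the building block extended in this paper: the tables are built once and each draw then costs O(1); a binned, correlated 2D distribution can be handled by flattening its joint probabilities into one table and mapping drawn indices back to (i, j). The probabilities below are toy values:

```python
# Walker alias method: build probability/alias tables, then sample in O(1) per event.
import numpy as np

def build_alias(probs):
    n = len(probs)
    scaled = np.asarray(probs, dtype=float) * n / np.sum(probs)
    prob, alias = np.zeros(n), np.zeros(n, dtype=int)
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:
        prob[i] = 1.0
    return prob, alias

def sample_alias(prob, alias, rng, size):
    idx = rng.integers(len(prob), size=size)
    return np.where(rng.random(size) < prob[idx], idx, alias[idx])

rng = np.random.default_rng(6)
joint = np.array([[0.10, 0.05], [0.25, 0.60]])       # binned correlated 2D probabilities
prob, alias = build_alias(joint.ravel())
flat = sample_alias(prob, alias, rng, 100_000)
i, j = np.unravel_index(flat, joint.shape)           # map flat indices back to 2D bins
print(np.bincount(flat) / flat.size)                 # should approximate the inputs
```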

012046

When monitoring complex experiments, comparison is often made between regularly acquired histograms of data and reference histograms which represent the ideal state of the equipment. With the larger HEP experiments now ramping up, there is a need for automation of this task since the volume of comparisons could overwhelm human operators. However, the two-dimensional histogram comparison tools available in ROOT have been noted in the past to exhibit shortcomings. We discuss a newer comparison test for two-dimensional histograms, based on the Energy Test of Aslan and Zech, which provides more conclusive discrimination between histograms of data coming from different distributions than methods provided in a recent ROOT release.
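A sketch of the Aslan-Zech energy test statistic for two 2D samples (binned histograms can be compared by using weighted bin centres as points); the logarithmic distance weighting is the common choice, and the permutation machinery needed for a p-value is omitted:

```python
# Energy test statistic between two point samples in 2D; larger values indicate
# stronger incompatibility between the underlying distributions.
import numpy as np
from scipy.spatial.distance import cdist

def energy_test(a, b, eps=1e-12):
    def mean_log(x, y, same):
        d = cdist(x, y)
        if same:
            iu = np.triu_indices(len(x), k=1)
            return np.sum(-np.log(d[iu] + eps)) / len(x) ** 2
        return np.sum(-np.log(d + eps)) / (len(x) * len(y))
    return mean_log(a, a, True) + mean_log(b, b, True) - mean_log(a, b, False)

rng = np.random.default_rng(7)
sample_ref = rng.normal(0, 1, size=(2000, 2))
sample_same = rng.normal(0, 1, size=(2000, 2))
sample_shifted = rng.normal(0.2, 1, size=(2000, 2))
print("same distribution:   ", energy_test(sample_ref, sample_same))
print("shifted distribution:", energy_test(sample_ref, sample_shifted))
```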

012047

Gerda is an experiment designed to look for the neutrinoless double beta decay of 76Ge. The experiment uses an array of high-purity germanium detectors (enriched in 76Ge) directly immersed in liquid argon. Gerda is presently operating eight enriched coaxial detectors (approximately 15 kg of 76Ge), and about 25 new custom-made enriched BEGe detectors will be deployed in the next phase (an additional 20 kg of 76Ge). The paper describes the Gerda off-line analysis of the high-purity germanium detector data. Firstly we present the signal processing flow, focusing on the digital filters and on the algorithms used. Secondly we discuss the rejection of non-physical events and the data quality monitoring. The analysis is performed completely within the Gerda software framework (Gelatio), designed to support multi-channel processing and to perform a modular analysis of digital signals.

012048

Over a decade ago, the H1 Collaboration decided to embrace the object-oriented paradigm and completely redesign its data analysis model and data storage format. The event data model, based on the ROOT framework, consists of three layers - tracks and calorimeter clusters, identified particles and finally event summary data - with a singleton class providing unified access. This original solution was then augmented with a fourth layer containing user-defined objects. This contribution will summarise the history of the solutions used, from modifications to the original design, to the evolution of the high-level end-user analysis object framework which is used by H1 today. Several important issues are addressed - the portability of expert knowledge to increase the efficiency of data analysis, the flexibility of the framework to incorporate new analyses, the performance and ease of use, and lessons learned for future projects.

Computations in theoretical physics – techniques and methods

012049
The following article is Open access

We describe three algorithms for computer-aided symbolic multi-loop calculations that facilitated some recent novel results. First, we discuss an algorithm to derive the canonical form of an arbitrary Feynman integral, which facilitates the identification of equivalent integrals. Second, we present a practical solution to the problem of multi-loop analytical tensor reduction. Finally, we discuss the partial fractioning of polynomials whose variables are subject to external linear relations. All algorithms have been tested and used in real calculations.
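As a one-line illustration of the third point (our example, not one taken from the paper): when two denominators differ by a constant because of an external linear relation, the product splits as

$$\frac{1}{D_1 D_2} \;=\; \frac{1}{D_2 - D_1}\left(\frac{1}{D_1} - \frac{1}{D_2}\right), \qquad D_2 - D_1 = \text{const},$$

and repeated application of such identities reduces an expression to terms containing fewer distinct denominators.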

012051
The following article is Open access

, and

The new version of the program SecDec is described, which can be used for the extraction of poles within dimensional regularisation from multi-loop integrals as well as phase space integrals. The numerical evaluation of the resulting finite functions is also done by the program in an automated way, with no restriction on the kinematics in the case of loop integrals.
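The mechanism behind the pole extraction can be illustrated by the textbook subtraction (an illustration, not SecDec's actual output):

$$\int_0^1 \mathrm{d}x\; x^{-1+\epsilon} f(x) \;=\; \frac{f(0)}{\epsilon} \;+\; \int_0^1 \mathrm{d}x\; x^{-1+\epsilon}\bigl[f(x)-f(0)\bigr],$$

where, once sector decomposition has factorised overlapping singularities, the remaining integral is finite as ε → 0 and can be evaluated numerically order by order in ε.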

012052
The following article is Open access

Most calculations of quantum corrections in supersymmetric theories are made with dimensional reduction, a modification of dimensional regularization. However, it is well known that dimensional reduction is not self-consistent. A consistent regularization which does not break supersymmetry is the higher covariant derivative regularization. However, the integrals obtained with this regularization cannot usually be calculated analytically. We discuss the application of this regularization to calculations in supersymmetric theories. In particular, it is demonstrated that the integrals defining the β-function appear to be integrals of total derivatives. This feature makes it possible to explain the origin of the exact NSVZ β-function, which relates the β-function to the anomalous dimensions of the matter superfields. However, the integrals for the anomalous dimension must still be calculated numerically.
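For reference, the exact NSVZ β-function mentioned in the text is usually quoted in the form

$$\beta(\alpha) \;=\; -\,\frac{\alpha^{2}\Bigl[\,3\,C_{2}(G) - \sum_{i} T(R_{i})\bigl(1-\gamma_{i}(\alpha)\bigr)\Bigr]}{2\pi\bigl(1 - C_{2}(G)\,\alpha/2\pi\bigr)},$$

where the γ_i are the anomalous dimensions of the matter superfields; the precise normalisation of α and of the group factors varies between conventions.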

012053
The following article is Open access

A key feature of the minimal supersymmetric extension of the Standard Model (MSSM) is the existence of a light Higgs boson, whose mass is not a free parameter but an observable that can be predicted from the theory. Given that the LHC is able to measure the mass of a light Higgs boson with very good accuracy, a lot of effort has been put into a precise theoretical prediction. We present a calculation of the SUSY-QCD corrections to this observable to three-loop order. We perform multiple asymptotic expansions in order to deal with the multi-scale three-loop diagrams, making heavy use of computer algebra and keeping a keen eye on the numerical error introduced. We provide a computer code in the form of a Mathematica package that combines our three-loop SUSY-QCD calculation with one- and two-loop corrections to the Higgs mass from the literature, providing a state-of-the-art prediction for this important observable.

012054
The following article is Open access

, and

We present additions and improvements in Version 7 of FormCalc, most notably analytic tensor reduction, choice of OPP methods, and MSSM initialization via FeynHiggs, as well as a parallelized Cuba library for numerical integration.

012055
The following article is Open access

, and

We present the publicly available program NGluon allowing the numerical evaluation of primitive amplitudes at one-loop order in massless QCD. The program allows the computation of one-loop amplitudes for an arbitrary number of gluons. The focus of the present article is the extension to one-loop amplitudes including an arbitrary number of massless quark pairs. We discuss in detail the algorithmic differences to the pure gluonic case and present cross checks to validate our implementation. The numerical accuracy is investigated in detail.

012056
The following article is Open access

, , , , , , and

The program package GoSam is presented which aims at the automated calculation of one-loop amplitudes for multi-particle processes. The amplitudes are generated in terms of Feynman diagrams and can be reduced using either D-dimensional integrand-level decomposition or tensor reduction, or a combination of both. GoSam can be used to calculate one-loop corrections to both QCD and electroweak theory, and model files for theories Beyond the Standard Model can be linked as well. A standard interface to programs calculating real radiation is also included. The flexibility of the program is demonstrated by various examples.

012057
The following article is Open access

, and

We present an algebraic approach to one-loop tensor integral reduction. The integrals are expressed in terms of scalar one- to four-point functions. The reduction is worked out explicitly up to five-point functions of rank five. The numerical C++ package PJFry evaluates tensor coefficients in terms of a basis of scalar integrals, which is provided by an external library, e.g. QCDLoop. We briefly describe the installation and use of PJFry. Examples of numerical results are shown, including a special treatment for small or vanishing inverse four-point Gram determinants. An extremely efficient application of the formalism is the immediate evaluation of complete contractions of the tensor integrals with external momenta. This leads to the problem of evaluating sums over products of signed minors with scalar products of chords, where chords are differences of external momenta. These sums may be evaluated analytically in a systematic way. The final expressions for the numerical evaluation are then compact combinations of the contributing basic scalar functions.

012058
The following article is Open access

We review the recent progress towards automation in the computation of next-to-leading order corrections to scattering amplitudes. Such progress allows for the construction of quite general, flexible and fully automated packages that would be of major importance for Higgs boson and beyond-the-Standard-Model physics searches at high-energy particle colliders.

012059
The following article is Open access

I apply commonly used regularization schemes to a multiloop calculation to examine the properties of the schemes at higher orders. I find complete consistency between the conventional dimensional regularization scheme and dimensional reduction, but I find that the four-dimensional helicity scheme produces incorrect results at next-to-next-to-leading order and singular results at next-to-next-to-next-to-leading order. It is not, therefore, a unitary regularization scheme.

012060
The following article is Open access

, and

We report results of a new numerical regularization technique for infrared (IR) divergent loop integrals using dimensional regularization, where a positive regularization parameter ε, related to the space-time dimension by d = 4 + 2ε, is introduced in the integrand to keep the integral from diverging as long as ε > 0. A sequence of integrals is computed for decreasing values of ε in order to carry out a linear extrapolation as ε → 0. Each integral in the sequence is calculated according to the Direct Computation Method (DCM), which handles (threshold) integrand singularities in the interior of the domain. The technique of this paper is applied to one-loop N-point functions. In order to simplify the computation of the integrals for small ε, particularly in the case of a threshold singularity, the N-point function is reduced numerically to a set of 3-point and 4-point integrals, and DCM is applied to the resulting vertex and box integrals.
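A minimal sketch of the extrapolation step only (the integrand, the geometric ε sequence and the choice of basis with a single pole term are assumptions of this illustration, not details taken from the paper):

```python
import numpy as np

def extrapolate_laurent(integral, eps0=0.5, ratio=0.5, n_terms=8, n_coeff=4):
    """Fit I(eps) ~ c_{-1}/eps + c_0 + c_1*eps + ... on a geometric sequence
    of eps values and return the coefficients (a linear extrapolation).

    `integral` and all sequence parameters are placeholders.
    """
    eps = eps0 * ratio ** np.arange(n_terms)
    vals = np.array([integral(e) for e in eps])
    powers = np.arange(-1, n_coeff - 1)              # exponents -1, 0, 1, ...
    design = eps[:, None] ** powers[None, :]         # basis functions eps**k
    coeffs, *_ = np.linalg.lstsq(design, vals, rcond=None)
    return dict(zip(powers, coeffs))                 # {-1: pole, 0: finite, ...}

# toy usage: I(eps) = 2/eps + 3 + eps; the pole and finite part are recovered
print(extrapolate_laurent(lambda e: 2.0 / e + 3.0 + e))
```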

012061
The following article is Open access

New features of the symbolic algebra package FORM 4 are discussed. Most importantly, these features include polynomial factorization and polynomial GCD computation. Examples of their use are shown. One of them is an exact version of Mincer which gives answers in terms of rational polynomials and 5 master integrals.

012062
The following article is Open access

, and

Octave is one of the most widely used open source tools for numerical analysis and linear algebra. Our project aims to improve Octave by introducing support for GPU computing in order to speed up some linear algebra operations. The core of our work is a C library that executes some BLAS operations concerning vector-vector, vector-matrix and matrix-matrix functions on the GPU. OpenCL functions are used to program GPU kernels, which are bound within the GNU Octave framework. We report on the project's implementation design and some preliminary performance results.
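For illustration, the general pattern of driving a BLAS-style GPU kernel through OpenCL can be sketched in Python with pyopencl (this is not the project's C library or its Octave bindings):

```python
import numpy as np
import pyopencl as cl

# saxpy kernel: y <- a*x + y, the simplest BLAS-style vector operation
KERNEL = """
__kernel void saxpy(const float a, __global const float *x, __global float *y)
{
    int i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}
"""

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
prg = cl.Program(ctx, KERNEL).build()    # compile the kernel for the device

n = 1 << 20
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)

mf = cl.mem_flags
x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
y_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=y)

prg.saxpy(queue, (n,), None, np.float32(2.0), x_buf, y_buf)
result = np.empty_like(y)
cl.enqueue_copy(queue, result, y_buf)    # copy the updated vector back to the host
```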

012063
The following article is Open access

and

We present a method developed by the NNPDF Collaboration that allows the inclusion of new experimental data into an existing set of parton distribution functions without the need for a complete refit. A Monte Carlo ensemble of PDFs may be updated by assigning each member of the ensemble a unique weight determined by Bayesian inference. The reweighted ensemble therefore represents the probability density of PDFs conditional on both the old and new data. This method is applied to the inclusion of W-lepton asymmetry data into the NNPDF2.1 fit, producing a new PDF set, NNPDF2.2.
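A minimal sketch of the weight assignment (the χ² inputs are placeholders; the weight formula is the (χ²)^{(n-1)/2} e^{-χ²/2} form used in Bayesian PDF reweighting, quoted here from memory):

```python
import numpy as np

def reweight(chi2, n_data):
    """Bayesian reweighting of a Monte Carlo PDF ensemble.

    chi2   : chi^2 of each replica with respect to the new data (placeholder).
    n_data : number of new data points.
    Returns weights normalised to sum to the number of replicas.
    """
    chi2 = np.asarray(chi2, dtype=float)
    logw = 0.5 * (n_data - 1) * np.log(chi2) - 0.5 * chi2
    logw -= logw.max()                      # avoid overflow before exponentiating
    w = np.exp(logw)
    return w * len(w) / w.sum()

def effective_replicas(w):
    """Entropy-based estimate of the effective number of replicas."""
    n = len(w)
    w = np.asarray(w, dtype=float)
    w = w[w > 0]
    return np.exp(np.sum(w * np.log(n / w)) / n)
```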

012064
The following article is Open access

, and

We describe a new method to extract parton distribution functions from hard scattering processes based on Self-Organizing Maps. The extension to a larger and more complex class of soft matrix elements, including generalized parton distributions, is also discussed.
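A minimal sketch of the Self-Organizing Map training step on which such a method builds (map size, learning-rate and neighbourhood schedules are placeholder choices, not those of the paper):

```python
import numpy as np

def train_som(data, grid=(6, 6), n_iter=2000, lr0=0.5, sigma0=3.0):
    """Train a small 2D Self-Organizing Map on feature vectors `data`.

    Standard Kohonen update: find the best-matching unit (BMU) for a random
    input and pull nearby map cells towards it, with shrinking neighbourhood
    and learning rate.
    """
    data = np.asarray(data, dtype=float)
    h, w = grid
    dim = data.shape[1]
    weights = np.random.rand(h, w, dim)
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5     # decaying neighbourhood width
        x = data[np.random.randint(len(data))]
        dist2 = np.sum((weights - x) ** 2, axis=-1)
        bmu = np.unravel_index(np.argmin(dist2), (h, w))
        grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
        neighbourhood = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * neighbourhood[..., None] * (x - weights)
    return weights
```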