
Table of contents

Volume 898

2017


Track 5: Software Development

Accepted papers received: 11 September 2017
Published online: 23 November 2017

072001
The following article is Open access

This paper reports on the port of the ATLAS software stack onto new prototype ARM64 servers. This included building the "external" packages that the ATLAS software relies on. Patches were needed to introduce this new architecture into the build, as well as patches to correct platform-specific code that caused failures on non-x86 architectures. These patches were applied in such a way that porting to further platforms will require little or no additional adjustment. A few additional modifications were needed to account for the different operating system, Ubuntu instead of Scientific Linux 6 / CentOS 7. Selected results from the validation of the physics outputs on these ARM 64-bit servers are shown. CPU-, memory- and IO-intensive benchmarks using the ATLAS-specific environment and infrastructure have been performed, with particular emphasis on performance versus energy consumption.

072002
The following article is Open access

The Alpha Magnetic Spectrometer (AMS) is a high energy physics experiment installed and operating on board the International Space Station (ISS) since May 2011 and expected to last beyond 2024. The details of porting the AMS software to the IBM Blue Gene/Q Architecture are discussed. The performance of the AMS reconstruction and simulation software in that architecture is evaluated and compared to the performance obtained on Intel based architecture.

072003
The following article is Open access

Big Data technologies have proven to be very useful for storage, processing and visualization of derived metrics associated with ATLAS distributed computing (ADC) services. Log files, database records, and metadata from a diversity of systems have been aggregated and indexed to create an analytics platform for ATLAS ADC operations analysis. Dashboards, wide-area data access cost metrics, user analysis patterns, and resource utilization efficiency charts are produced flexibly through queries against a powerful analytics cluster. Here we explore whether these techniques and the associated analytics ecosystem can be applied to add new modes of open, quick, and pervasive access to ATLAS event data. Such modes would simplify access and broaden the reach of ATLAS public data to new communities of users. An ability to efficiently store, filter, search and deliver ATLAS data at the event and/or sub-event level in a widely supported format would enable or significantly simplify usage of machine learning environments and tools like Spark, Jupyter, R, SciPy, Caffe, TensorFlow, etc. Machine learning challenges such as the Higgs Boson Machine Learning Challenge and the Tracking challenge, event viewers (VP1, ATLANTIS, ATLASrift), and still-to-be-developed educational and outreach tools would be able to access the data through a simple REST API. In this preliminary investigation we focus on derived xAOD data sets. These are much smaller than the primary xAODs, containing only the containers, variables, and events of interest to a particular analysis. Encouraged by the performance of Elasticsearch for the ADC analytics platform, we developed an algorithm for indexing derived xAOD event data. We have defined an appropriate document mapping and have imported a full set of Standard Model W/Z datasets. We compare the disk-space efficiency of this approach to that of standard ROOT files and its performance in simple cut-flow data analysis, and present preliminary results on its scaling characteristics with different numbers of clients, query complexity, and the size of the data retrieved.
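As a rough illustration of the event-indexing idea described in this abstract (not the authors' actual implementation), the sketch below indexes a few flat "event" documents into Elasticsearch and runs a cut-flow style count; the index name, field names, selection thresholds and endpoint are invented for the example, and client API details vary between Elasticsearch client versions.

    # Hedged sketch: index flat event records and count those passing a selection.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")   # assumed local test instance

    events = [
        {"run": 1, "event": 7,  "lep_pt": 31.2, "met": 64.0, "n_jets": 2},
        {"run": 1, "event": 9,  "lep_pt": 18.4, "met": 22.5, "n_jets": 1},
        {"run": 1, "event": 12, "lep_pt": 44.0, "met": 71.3, "n_jets": 3},
    ]
    for i, ev in enumerate(events):
        # newer clients use document=, older ones body=
        es.index(index="xaod_demo", id=i, document=ev)
    es.indices.refresh(index="xaod_demo")

    # "Cut flow" step: count events passing lepton-pT and MET requirements.
    selection = {"bool": {"filter": [
        {"range": {"lep_pt": {"gte": 25.0}}},
        {"range": {"met":    {"gte": 50.0}}},
    ]}}
    n_pass = es.count(index="xaod_demo", query=selection)["count"]
    print("events passing selection:", n_pass)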

072004
The following article is Open access

Binary decision trees with the Bayesian decision technique are used for supervised classification of high-dimensional data. We demonstrate the great potential of adaptive kernel density estimation as the nested separation method of a supervised binary divergence decision tree. We also provide a proof of an alternative computing approach for kernel estimates that utilizes the Fourier transform. Further, we apply our method to a Monte Carlo data set from the DØ experiment at the Tevatron accelerator at Fermilab and present final top-antitop signal separation results. We achieved up to 82% AUC using a restricted feature selection entering the signal separation procedure.
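As a generic illustration of the Fourier-transform approach to kernel density estimation (a sketch of the general technique, not the authors' code), the following NumPy fragment bins the sample and convolves the histogram with a Gaussian kernel in Fourier space; the bandwidth and grid size are arbitrary choices for the example.

    import numpy as np

    def fft_kde(sample, bandwidth, n_grid=1024):
        # Gaussian KDE evaluated on a regular grid via FFT convolution (illustrative sketch).
        lo, hi = sample.min() - 3 * bandwidth, sample.max() + 3 * bandwidth
        grid = np.linspace(lo, hi, n_grid)
        dx = grid[1] - grid[0]
        # Bin the data: the histogram acts as a sum of delta functions.
        counts, _ = np.histogram(sample, bins=n_grid, range=(lo, hi))
        # Fourier transform of a Gaussian kernel: exp(-0.5 * (sigma * k)^2).
        k = 2 * np.pi * np.fft.rfftfreq(n_grid, d=dx)
        kernel_ft = np.exp(-0.5 * (bandwidth * k) ** 2)
        density = np.fft.irfft(np.fft.rfft(counts) * kernel_ft, n_grid)
        density /= density.sum() * dx          # normalise to unit area
        return grid, density

    x = np.random.normal(size=5000)
    grid, dens = fft_kde(x, bandwidth=0.2)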

072005
The following article is Open access

Responding to the European Strategy for Particle Physics update 2013, the Future Circular Collider study explores scenarios of circular frontier colliders for the post-LHC era. One branch of the study assesses industrial approaches to model and simulate the reliability and availability of the entire particle collider complex, based on the continuous monitoring of CERN's accelerator complex operation. The modelling is based on an in-depth study of the CERN injector chain and the LHC, and is carried out as a cooperative effort with the HL-LHC project. The work so far has revealed that a major challenge is obtaining accelerator monitoring and operational data of sufficient quality to automate the data-quality annotation and the calculation of reliability distribution functions for systems, subsystems and components where needed. A flexible data management and analytics environment that permits integrating the heterogeneous data sources, the domain-specific data-quality management algorithms and the reliability modelling and simulation suite is a key enabler for completing this accelerator operation study. This paper describes the Big Data infrastructure and analytics ecosystem that has been put into operation at CERN, serving as the foundation on which reliability and availability analysis and simulations can be built. This contribution focuses on data infrastructure and data management aspects and presents the case studies chosen for its validation.

072006
The following article is Open access

EMMA is a framework designed to create a family of configurable software systems, with emphasis on extensibility and flexibility. It is based on a loosely coupled, event-driven architecture. The EMMA framework has been built upon the premise of composing software systems from independent components. It opens up opportunities for reusing components and their functionality and for composing them together in many different ways. It provides the developer of test and measurement applications with a lightweight alternative to microservices, while sharing their various advantages, including composability, loose coupling, encapsulation, and reuse.
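The loosely coupled, event-driven composition described above can be illustrated with a minimal publish/subscribe sketch (a generic illustration under assumed names, not the EMMA API): independent components only know about named events, not about each other.

    # Minimal event-bus sketch: components communicate only via named events.
    class EventBus:
        def __init__(self):
            self._subscribers = {}

        def subscribe(self, topic, handler):
            self._subscribers.setdefault(topic, []).append(handler)

        def publish(self, topic, payload):
            for handler in self._subscribers.get(topic, []):
                handler(payload)

    bus = EventBus()

    # A "measurement" component publishes readings; other components consume them.
    bus.subscribe("reading", lambda r: print("logged:", r))
    bus.subscribe("reading", lambda r: print("alarm!") if r["value"] > 10 else None)
    bus.publish("reading", {"channel": 3, "value": 12.5})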

072007
The following article is Open access

ALICE (A Large Ion Collider Experiment) is the dedicated heavy-ion detector studying the physics of strongly interacting matter and the quark-gluon plasma at the CERN LHC (Large Hadron Collider). After the second long shut-down of the LHC, the ALICE detector will be upgraded to cope with an interaction rate of 50 kHz in Pb-Pb collisions, producing in the online computing system (O2) a sustained throughput of 3.4 TB/s. This data will be processed on the fly so that the stream to permanent storage does not exceed 90 GB/s peak, the raw data being discarded. In the context of assessing different computing platforms for the O2 system, we have developed a framework for the Intel Xeon Phi processors (MIC). It provides the components to build a processing pipeline streaming the data from the PC memory to a pool of permanent threads running on the MIC, and back to the host after processing. It is based on explicit offloading mechanisms (data transfer, asynchronous tasks) and basic building blocks (FIFOs, memory pools, C++11 threads). The user only needs to implement the processing method to be run on the MIC. We present in this paper the architecture, implementation, and performance of this system.

072008
The following article is Open access

A Large Ion Collider Experiment (ALICE) is one of the four big experiments running at the Large Hadron Collider (LHC), which focuses on the study of the Quark-Gluon Plasma (QGP) produced in heavy-ion collisions. The ALICE Event Visualisation Environment (AliEve) is a tool providing an interactive 3D model of the detector's geometry and a graphical representation of the data. Together with the online reconstruction module, it provides important quality monitoring of the recorded data. As a consequence it has been used in the ALICE Run Control Centre during all stages of Run 2. Static screenshots from the online visualisation are published on the public website - ALICE LIVE. Dedicated converters have been developed to provide geometry and data for external projects. An example of such a project is the Total Event Display (TEV) - a visualisation tool recently developed by the CERN Media Lab based on the Unity game engine. It can be easily deployed on any platform, including web and mobile platforms. Another external project is More Than ALICE - an augmented reality application for visitors, overlaying detector descriptions and event visualisations on the camera's picture. For the future Run 3, both AliEve and TEV will be adapted to fit the ALICE O2 project. Several changes are required due to the new data formats, especially the so-called Compressed Time Frames.

072009
The following article is Open access

The ATLAS software infrastructure facilitates the efforts of more than 1000 developers working on a code base of 2200 packages with 4 million lines of C++ and 1.4 million lines of Python code. The ATLAS offline code management system is a powerful, flexible framework for processing requests for new package versions, probing code changes in the Nightly Build System, migrating to new platforms and compilers, deploying production releases for worldwide access, and supporting physicists with tools and interfaces for efficient software use. It maintains a multi-stream, parallel development environment with about 70 multi-platform branches of nightly releases and provides vast opportunities for testing new packages, for verifying patches to existing software and for migrating to new platforms and compilers. The system evolution is currently aimed at the adoption of modern continuous integration (CI) practices focused on building nightly releases early and often, with rigorous unit and integration testing. This paper describes the CI incorporation program for the ATLAS software infrastructure. It brings modern open source tools such as Jenkins and GitLab into the ATLAS Nightly System, rationalizes hardware resource allocation and administrative operations, and provides developers with improved feedback and the means to fix broken builds promptly. Once adopted, ATLAS CI practices will improve and accelerate innovation cycles and result in increased confidence in new software deployments. The paper reports the status of Jenkins integration with the ATLAS Nightly System as well as short- and long-term plans for the incorporation of CI practices.

072010
The following article is Open access

The offline software of the ATLAS experiment at the Large Hadron Collider (LHC) serves as the platform for detector data reconstruction, simulation and analysis. It is also used in the detector's trigger system to select LHC collision events during data taking. The ATLAS offline software consists of several million lines of C++ and Python code organized in a modular design of more than 2000 specialized packages. Because of different workflows, many stable numbered releases are in parallel production use. To accommodate specific workflow requests, software patches with modified libraries are distributed on top of existing software releases on a daily basis. The different ATLAS software applications also require a flexible build system that strongly supports unit and integration tests. Within the last year this build system was migrated to CMake.

A CMake configuration has been developed that allows one to easily set up and build the above-mentioned software packages. This also makes it possible to develop and test new and modified packages on top of existing releases. The system also allows one to detect and execute partial rebuilds of the release based on single package changes. The build system makes use of CPack for building RPM packages out of the software releases, and CTest for running unit and integration tests.

We report on the migration and integration of the ATLAS software to CMake and show working examples of this large scale project in production.

072011
The following article is Open access

In this paper we explain how the C++ code quality is managed in ATLAS using a range of tools from compile-time through to run time testing and reflect on the substantial progress made in the last two years largely through the use of static analysis tools such as Coverity®, an industry-standard tool which enables quality comparison with general open source C++ code. Other available code analysis tools are also discussed, as is the role of unit testing with an example of how the GoogleTest framework can be applied to our codebase.

072012
The following article is Open access

Experimental Particle Physics has been at the forefront of analyzing the world's largest datasets for decades. The HEP community was the first to develop suitable software and computing tools for this task. In recent times, new toolkits and systems collectively called Big Data technologies have emerged to support the analysis of Petabyte and Exabyte datasets in industry. While the principles of data analysis in HEP have not changed (filtering and transforming experiment-specific data formats), these new technologies use different approaches and promise a fresh look at analysis of very large datasets; they could potentially reduce the time-to-physics with increased interactivity.

In this talk, we present an active LHC Run 2 analysis, searching for dark matter with the CMS detector, as a testbed for Big Data technologies. We directly compare the traditional NTuple-based analysis with an equivalent analysis using Apache Spark on the Hadoop ecosystem and beyond. In both cases, we start the analysis with the official experiment data formats and produce publication-quality physics plots. We will discuss the advantages and disadvantages of each approach and give an outlook on further studies needed.
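To make the Spark-based flow concrete, the following hedged PySpark sketch reads event-level data from a columnar file, applies a selection, and fills a coarse histogram; the file path, column names and cuts are placeholders, not the analysis described above.

    # Hedged PySpark sketch of a Spark-style event selection and histogramming step.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("darkmatter-demo").getOrCreate()

    events = spark.read.parquet("hdfs:///demo/events.parquet")   # assumed columnar input

    selected = events.filter((F.col("met") > 200.0) & (F.col("n_jets") >= 2))

    # Coarse MET histogram: bucket index -> entries (to be plotted downstream).
    hist = (selected
            .withColumn("met_bin", F.floor(F.col("met") / 50.0))
            .groupBy("met_bin").count()
            .orderBy("met_bin"))
    hist.show()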

072013
The following article is Open access

As the ATLAS Experiment prepares to move to a multi-threaded framework (AthenaMT) for Run 3, we are faced with the problem of how to migrate 4 million lines of C++ source code. This code has been written over the past 15 years and has often been adapted, re-written or extended to meet the changing requirements and circumstances of LHC data taking. The code was developed by different authors, many of whom are no longer active, and under the deep assumption that processing ATLAS data would be done in a serial fashion.

In order to understand the scale of the problem faced by the ATLAS software community, and to plan appropriately for the significant effort required by the new AthenaMT framework, ATLAS embarked on a wide-ranging review of our offline code, covering all areas of activity: event generation, simulation, trigger and reconstruction. We discuss the difficulties of even logistically organising such reviews in an already busy community, and of examining each area in sufficient depth to identify the key parts in need of upgrade while still finishing the reviews in a timely fashion.

We show how the reviews were organised and how the outputs were captured in a way that allowed the sub-system communities to tackle the problems uncovered on a realistic timeline. Further, we discuss how the review has influenced the overall planning for the Run 3 ATLAS offline code.

072014
The following article is Open access

In the beginning, HEP experiments made use of photographic images both to record and store experimental data and to illustrate their findings. As the experiments evolved, they needed to find new ways to visualize their data. With the availability of computer graphics, software packages to display event data and the detector geometry started to be developed.

Here, an overview of the usage of event display tools in HEP is presented. Then the case of the ATLAS experiment is considered in more detail and two widely used event display packages are presented, Atlantis and VP1, focusing on the software technologies they employ, as well as their strengths, differences and their usage in the experiment: from physics analysis to detector development, and from online monitoring to outreach and communication. Towards the end, the other ATLAS visualization tools will be briefly presented as well. Future development plans and improvements in the ATLAS event display packages will also be discussed.

072015
The following article is Open access

The complex geometry of the whole detector of the ATLAS experiment at LHC is currently stored only in custom online databases, from which it is built on-the-fly on request. Accessing the online geometry guarantees accessing the latest version of the detector description, but requires the setup of the full ATLAS software framework "Athena", which provides the online services and the tools to retrieve the data from the database. This operation is cumbersome and slows down the applications that need to access the geometry. Moreover, all applications that need to access the detector geometry need to be built and run on the same platform as the ATLAS framework, preventing the usage of the actual detector geometry in stand-alone applications.

Here we propose a new mechanism to persistify (in software development in general, and in HEP computing in particular, "persistifying" means taking an object which lives only in memory - for example because it was built on-the-fly while processing the experimental data - serializing it and storing it on disk as a persistent object) and serve the geometry of HEP experiments. The new mechanism is composed of a new file format and the modules to make use of it. The new file format allows the whole detector description to be stored locally in a file, and it is especially optimized to describe large, complex detectors with the minimum file size, making use of shared instances and storing compressed representations of geometry transformations. The detector description can then be read back in to fully restore the in-memory geometry tree.

Moreover, a dedicated REST API is being designed and developed to serve the geometry in standard exchange formats like JSON, to let users and applications download specific partial geometry information.

With this new geometry persistification a new generation of applications could be developed, which can use the actual detector geometry while being platform-independent and experiment-independent.
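A minimal sketch of the kind of REST endpoint described above, serving a partial geometry tree as JSON, might look as follows; the framework (Flask), routes, volume names and payload structure are illustrative assumptions, not the actual service or data model.

    # Illustrative Flask sketch of a geometry-serving REST endpoint (assumed routes/payloads).
    from flask import Flask, jsonify, abort

    app = Flask(__name__)

    # Toy in-memory "geometry tree": volume name -> children and translation.
    GEOMETRY = {
        "Pixel":  {"children": ["Barrel", "Endcap"], "translation": [0.0, 0.0, 0.0]},
        "Barrel": {"children": [], "translation": [0.0, 0.0, 40.5]},
        "Endcap": {"children": [], "translation": [0.0, 0.0, 650.0]},
    }

    @app.route("/geometry/<volume>")
    def get_volume(volume):
        node = GEOMETRY.get(volume)
        if node is None:
            abort(404)
        return jsonify({"name": volume, **node})

    if __name__ == "__main__":
        app.run(port=8080)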

072016
The following article is Open access

The GooFit Framework is designed to perform maximum-likelihood fits for arbitrary functions on various parallel back ends, for example a GPU. We present an extension to GooFit which adds the functionality to perform time-dependent amplitude analyses of pseudoscalar mesons decaying into four pseudoscalar final states. Benchmarks of this functionality show a significant performance increase when utilizing a GPU compared to a CPU. Furthermore, this extension is employed to study the sensitivity on the ${{\rm{D}}}^{0}-{\bar{{\rm{D}}}}^{0}$ mixing parameters x and y in a time-dependent amplitude analysis of the decay D0 → K+π-π+π-. Studying a sample of 50 000 events and setting the central values to the world average of x = (0.49 ± 0.15)% and y = (0.61 ± 0.08)%, the statistical sensitivities of x and y are determined to be σ(x) = 0.019 % and σ(y) = 0.019 %.

072017
The following article is Open access

The recent progress in parallel hardware architectures with deeper vector pipelines or many-core technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains from propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architectures. Due to the complexity of the geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable for identifying the factors limiting parallel execution. In this report, we present design considerations and the preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.

072018
The following article is Open access

Some data analysis methods typically used in econometric studies and in ecology have been evaluated and applied in physics software environments. They concern the evolution of observables through objective identification of change points and trends, and measurements of inequality, diversity and evenness across a data set. Within each analysis area, various statistical tests and measures have been examined. This conference paper gives a brief overview of some of these methods.

072019
The following article is Open access

A possible solution to the Dark Matter problem postulates that it interacts with Standard Model particles through a new force mediated by a "portal". If the new force has a U(1) gauge structure, the "portal" is a massive photon-like vector particle, called a dark photon or A'. The PADME experiment at the DAΦNE Beam-Test Facility (BTF) in Frascati is designed to detect dark photons produced in positron annihilations on a fixed target and decaying to dark matter (e+e- → γA') by measuring the final-state missing mass. One of the key roles in the experiment will be played by the electromagnetic calorimeter, which will be used to measure the properties of the final-state recoil γ. The calorimeter will be composed of 616 BGO crystals of 21×21×230 mm3, oriented with the long axis parallel to the beam direction and arranged in a roughly circular shape with a central hole to avoid the pile-up due to the large number of low-angle Bremsstrahlung photons. The total energy and position of the electromagnetic shower generated by a photon impacting on the calorimeter can be reconstructed by collecting the energy deposits in the cluster of crystals affected by the shower. In PADME we are testing two different clustering algorithms, PADME-Radius and PADME-Island, based on two complementary strategies. In this paper we describe the two algorithms, with their respective implementations, and report on the results obtained with them at the PADME energy scale (< 1 GeV), both with a GEANT4-based simulation and with an existing 5×5 matrix of BGO crystals tested at the DAΦNE BTF.
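A generic seed-and-grow ("island"-style) clustering over a crystal grid can be sketched as follows; the thresholds, grid layout and growth rule are illustrative assumptions and this is not the actual PADME-Island implementation.

    # Illustrative seed-and-grow clustering on a crystal grid (not the PADME code).
    # deposits maps (row, col) -> energy; thresholds are invented for the example.
    SEED_THRESHOLD = 20.0      # minimum energy to start a cluster
    NEIGHBOUR_THRESHOLD = 1.0  # minimum energy to be added to a cluster

    def island_clusters(deposits):
        used, clusters = set(), []
        for seed, e in sorted(deposits.items(), key=lambda kv: -kv[1]):
            if e < SEED_THRESHOLD or seed in used:
                continue
            cluster, frontier = [], [seed]
            while frontier:
                cell = frontier.pop()
                if cell in used or deposits.get(cell, 0.0) < NEIGHBOUR_THRESHOLD:
                    continue
                used.add(cell)
                cluster.append(cell)
                r, c = cell
                frontier += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            clusters.append({"cells": cluster,
                             "energy": sum(deposits[c] for c in cluster)})
        return clusters

    hits = {(10, 10): 180.0, (10, 11): 35.0, (11, 10): 12.0, (3, 4): 2.5}
    print(island_clusters(hits))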

072020
The following article is Open access

Today's analyses for high-energy physics (HEP) experiments involve processing a large amount of data with highly specialized algorithms. The contemporary workflow from recorded data to final results is based on the execution of small scripts – often written in Python or ROOT macros which call complex compiled algorithms in the background – to perform fitting procedures and generate plots. In recent years, interactive programming environments such as Jupyter have become popular. Jupyter allows the development of Python-based applications, so-called notebooks, which bundle code, documentation and results, e.g. plots. Advantages over classical script-based approaches are the ability to recompute only parts of the analysis code, which allows for fast and iterative development, and a web-based user frontend, which can be hosted centrally and only requires a browser on the user side. In our novel approach, Python and Jupyter are tightly integrated into the Belle II Analysis Software Framework (basf2), currently being developed for the Belle II experiment in Japan. This makes it possible to develop code in Jupyter notebooks for every aspect of the event simulation, reconstruction and analysis chain. These interactive notebooks can be hosted as a centralized web service via jupyterhub with docker and used by all scientists of the Belle II Collaboration. Because of its generality and encapsulation, the setup can easily be scaled to large installations.

072021
The following article is Open access

We review the concept of Support Vector Machines (SVMs) and discuss examples of their use in a number of scenarios. Several SVM implementations have been used in HEP and we exemplify this algorithm using the Toolkit for Multivariate Analysis (TMVA) implementation. We discuss examples relevant to HEP, including background suppression for H → τ+τ- at the LHC, with several different kernel functions. Performance benchmarking leads to the issue of generalisation of hyper-parameter selection. The avoidance of fine tuning (over-training or over-fitting) in MVA hyper-parameter optimisation, i.e. the ability to ensure generalised performance of an MVA that is independent of the training, validation and test samples, is of utmost importance. We discuss this issue and compare and contrast the performance of hold-out and k-fold cross-validation. We have extended the SVM functionality and introduced tools to facilitate cross-validation in TMVA, and we present results based on these improvements.
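For readers unfamiliar with k-fold cross-validated hyper-parameter selection, the sketch below shows the general idea using scikit-learn as a stand-in for the TMVA implementation discussed above; the toy data, parameter grid and fold count are arbitrary choices for the example.

    # Hedged sketch of k-fold cross-validated hyper-parameter selection for an SVM.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    rng = np.random.default_rng(0)
    # Toy two-class sample standing in for signal and background feature vectors.
    signal = rng.normal(loc=1.0, size=(500, 4))
    background = rng.normal(loc=-1.0, size=(500, 4))
    X = np.vstack([signal, background])
    y = np.array([1] * 500 + [0] * 500)

    # 5-fold cross-validation over the RBF-kernel hyper-parameters C and gamma.
    grid = GridSearchCV(SVC(kernel="rbf"),
                        param_grid={"C": [0.1, 1.0, 10.0],
                                    "gamma": [0.01, 0.1, 1.0]},
                        cv=5, scoring="roc_auc")
    grid.fit(X, y)
    print("best parameters:", grid.best_params_, "AUC:", grid.best_score_)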

072022
The following article is Open access

The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments.

The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components).

For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.
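As a generic illustration of the multi-process execution model compared against above (not ROOT's MultiProc API), the following fragment farms independent chunks of work out to a pool of workers with Python's standard multiprocessing module, which the abstract itself cites as a point of comparison.

    # Generic multi-process "map" over independent data chunks and merge of the results.
    from multiprocessing import Pool

    def process_chunk(chunk):
        # Stand-in for per-chunk event processing: here, just sum squared values.
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        chunks = [range(i * 100000, (i + 1) * 100000) for i in range(8)]
        with Pool(processes=4) as pool:
            partial_results = pool.map(process_chunk, chunks)
        print("merged result:", sum(partial_results))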

072023
The following article is Open access

ROOT comes with a C++-compliant interpreter, cling. Cling needs to understand the content of the libraries in order to interact with them. Exposing the full shared-library descriptors to the interpreter at runtime translates into an increased memory footprint. ROOT's exploratory programming concepts allow implicit and explicit runtime shared-library loading. This requires the interpreter to load the library descriptor. Re-parsing the descriptors' content has a noticeable effect on runtime performance. The present state-of-the-art lazy parsing technique brings the runtime performance to reasonable levels, but proves to be fragile and can introduce correctness issues. An elegant solution is to load information from the descriptor lazily and in a non-recursive way.

The LLVM community is advancing its C++ Modules technology, which provides an I/O-efficient, on-disk representation capable of reducing build times and peak memory usage. The feature is being standardized as a C++ technical specification. C++ Modules are a flexible concept which can be employed to match the requirement of CMS and other experiments for ROOT: to optimize both runtime memory usage and performance. Cling technically "inherits" the feature; however, tuning it to the scale of ROOT and beyond is a complex endeavor. The paper discusses the status of C++ Modules in the context of ROOT, supported by a few preliminary performance results. It shows a step-by-step migration plan and describes potential challenges which could appear.

072024
The following article is Open access

Due to user demand and to support new development workflows based on code review and multiple development streams, LHCb decided to port its source code management from Subversion to Git, using the CERN GitLab hosting service. Although tools exist for this kind of migration, LHCb specificities and development models required careful planning of the migration, development of migration tools, changes to the development model, and redefinition of the release procedures. Moreover we had to support a hybrid situation with some software projects hosted in Git and others still in Subversion, or even branches of one project hosted in different systems.

We present the way we addressed the special LHCb requirements, the technical details of migrating large non-standard Subversion repositories, and how we managed to smoothly migrate the software projects following the schedule of each project manager.

072025
The following article is Open access

The functionality of GooFit, a GPU-friendly framework for doing maximum-likelihood fits, has been extended to extract model-independent ${\mathscr{S}}$-wave amplitudes in three-body decays such as D+ → h+h+h-. A full amplitude analysis is done where the magnitudes and phases of the ${\mathscr{S}}$-wave amplitudes are anchored at a finite number of m2(h+h-) control points, and a cubic spline is used to interpolate between these points. The amplitudes for ${\mathscr{P}}$-wave and ${\mathscr{D}}$-wave intermediate states are modeled as spin-dependent Breit-Wigner resonances. GooFit uses the Thrust library, with a CUDA backend for NVIDIA GPUs and an OpenMP backend for threads with conventional CPUs. Performance on a variety of platforms is compared. Executing on systems with GPUs is typically a few hundred times faster than executing the same algorithm on a single CPU.
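The control-point interpolation idea can be illustrated with a short SciPy sketch; the knot positions, magnitudes and phases below are invented for the example and this is not the GooFit implementation itself.

    # Illustrative cubic-spline interpolation of an S-wave amplitude anchored at a few
    # m^2 control points (values invented; not the GooFit code).
    import numpy as np
    from scipy.interpolate import CubicSpline

    m2_knots = np.array([0.3, 0.6, 1.0, 1.5, 2.0])     # GeV^2, assumed control points
    magnitude = np.array([1.0, 1.4, 0.9, 0.5, 0.3])    # |A| at the control points
    phase = np.array([0.2, 0.8, 1.9, 2.4, 2.6])        # arg(A) in radians

    mag_spline = CubicSpline(m2_knots, magnitude)
    phase_spline = CubicSpline(m2_knots, phase)

    def s_wave(m2):
        # Complex S-wave amplitude interpolated between the anchored control points.
        return mag_spline(m2) * np.exp(1j * phase_spline(m2))

    print(s_wave(0.8))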

072026
The following article is Open access

Monte Carlo (MC) simulation production plays an important role in the physics analysis of the Alpha Magnetic Spectrometer (AMS-02) experiment. To facilitate metadata retrieval for data analysis among millions of database records, we developed a monitoring tool to analyse and visualize the production status and progress. In this paper, we discuss the workflow of the monitoring tool and present its features and technical details.

072027
The following article is Open access

Performance data and metadata of the computing operations at the CMS experiment are collected through a distributed monitoring infrastructure, currently relying on a traditional Oracle database system. This paper shows how to harness Big Data architectures in order to improve the throughput and the efficiency of such monitoring. A large set of operational data - user activities, job submissions, resources, file transfers, site efficiencies, software releases, network traffic, machine logs - is being injected into a readily available Hadoop cluster via several data streamers. The collected metadata is further organized so that fast arbitrary queries can be run; this offers the ability to test several MapReduce-based frameworks and to measure the system speed-up compared to the original database infrastructure. By leveraging a quality Hadoop data store and enabling an analytics framework on top, it is possible to design a mining platform to predict dataset popularity and discover patterns and correlations.

072028
The following article is Open access

We present rootJS, an interface making it possible to seamlessly integrate ROOT 6 into applications written for Node.js, the JavaScript runtime platform increasingly used to create high-performance Web applications. ROOT features can be called both directly from Node.js code and by JIT-compiling C++ macros. All rootJS methods are invoked asynchronously and support callback functions, allowing non-blocking operation of Node.js applications using them. Last but not least, our bindings have been designed to be platform-independent and should therefore work on all systems supporting both ROOT 6 and Node.js.

Thanks to rootJS it is now possible to create ROOT-aware Web applications taking full advantage of the high performance and extensive capabilities of Node.js. Examples include platforms for the quality assurance of acquired, reconstructed or simulated data, book-keeping and e-log systems, and even Web browser-based data visualisation and analysis.

072029
The following article is Open access

Over the last seven years the software stack of the next-generation B factory experiment Belle II has grown to over one million lines of C++ and Python code, counting only the part included in offline software releases. There are several thousand commits to the central repository by about 100 individual developers per year. Keeping the software stack coherent and of high quality, so that it can be sustained and used efficiently for data acquisition, simulation, reconstruction, and analysis over the lifetime of the Belle II experiment, is a challenge.

A set of tools is employed to monitor the quality of the software and provide fast feedback to the developers. They are integrated in a machinery that is controlled by a buildbot master and automates the quality checks. The tools include different compilers, cppcheck, the clang static analyzer, valgrind memcheck, doxygen, a geometry overlap checker, a check for missing or extra library links, unit tests, steering file level tests, a sophisticated high-level validation suite, and an issue tracker. The technological development infrastructure is complemented by organizational means to coordinate the development.

072030
The following article is Open access

Modern web browsers are powerful and sophisticated applications that support an ever-wider range of uses. One such use is rendering high-quality, GPU-accelerated, interactive 2D and 3D graphics in an HTML canvas. This can be done via WebGL, a JavaScript API based on OpenGL ES. Applications delivered via the browser have several distinct benefits for the developer and user. For example, they can be implemented using well-known and well-developed technologies, while distribution and use via a browser allows for rapid prototyping and deployment and ease of installation. In addition, delivery of applications via the browser allows for easy use on mobile, touch-enabled devices such as phones and tablets.

iSpy WebGL is an application for visualization of events detected and reconstructed by the CMS Experiment at the Large Hadron Collider at CERN. The first event display developed for an LHC experiment to use WebGL, iSpy WebGL is a client-side application written in JavaScript, HTML, and CSS that uses three.js, a JavaScript library built on the WebGL API. iSpy WebGL is used for monitoring of CMS detector performance, for production of images and animations of CMS collision events for the public, as a virtual reality application using Google Cardboard, and as a tool for public education and outreach, such as in the CERN Open Data Portal and the CMS masterclasses. We describe here its design, development, and usage as well as future plans.

072031
The following article is Open access

HEP applications perform an excessive amount of allocations and deallocations within short time intervals, which results in memory churn, poor locality and performance degradation. These issues have been known for a decade, but due to the complexity of software frameworks and the billions of allocations in a single job, until recently no efficient mechanism has been available to correlate them with source code lines. However, with the advent of the Big Data era, many tools and platforms are now available to do large-scale memory profiling. This paper presents a prototype program developed to track and identify every single (de-)allocation. The CERN IT Hadoop cluster is used to compute memory key metrics, like locality, variation, lifetime and density of allocations. The prototype further provides a web-based visualization back-end that allows the user to explore the results generated on the Hadoop cluster. Plotting these metrics for every single allocation over time gives new insight into an application's memory handling. For instance, it shows which algorithms cause which kinds of memory allocation patterns, which function flows cause how many short-lived objects, which allocation sizes are most common, etc. The paper gives an insight into the prototype and shows profiling examples for LHC reconstruction, digitization and simulation jobs.
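The flavour of the per-allocation metrics mentioned above (lifetime, density, common sizes) can be sketched as follows over hypothetical (size, alloc_time, free_time) records; the record format and thresholds are assumptions, not the real tracker output.

    # Hedged sketch of per-allocation metrics computed from hypothetical records.
    records = [
        {"size": 64,   "alloc_t": 0.001, "free_t": 0.002},
        {"size": 64,   "alloc_t": 0.003, "free_t": 0.004},
        {"size": 4096, "alloc_t": 0.010, "free_t": 0.950},
    ]

    lifetimes = [r["free_t"] - r["alloc_t"] for r in records]
    total_time = max(r["free_t"] for r in records) - min(r["alloc_t"] for r in records)

    metrics = {
        "allocations": len(records),
        "mean_lifetime_s": sum(lifetimes) / len(lifetimes),
        "short_lived_fraction": sum(l < 0.01 for l in lifetimes) / len(records),
        "allocation_rate_hz": len(records) / total_time,
        "most_common_size": max({r["size"] for r in records},
                                key=lambda s: sum(r["size"] == s for r in records)),
    }
    print(metrics)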

072032
The following article is Open access

The VecGeom geometry library is a relatively recent effort aiming to provide a modern and high performance geometry service for particle detector simulation in hierarchical detector geometries common to HEP experiments. One of its principal targets is the efficient use of vector SIMD hardware instructions to accelerate geometry calculations for single track as well as multi-track queries.

Previously, excellent performance improvements compared to Geant4/ROOT could be reported for elementary geometry algorithms at the level of single shape queries. In this contribution, we will focus on the higher level navigation algorithms in VecGeom, which are the most important components as seen from the simulation engines. We will first report on our R&D effort and developments to implement SIMD enhanced data structures to speed up the well-known "voxelised" navigation algorithms, ubiquitously used for particle tracing in complex detector modules consisting of many daughter parts.

Second, we will discuss complementary new approaches to improve navigation algorithms in HEP. These ideas are based on a systematic exploitation of static properties of the detector layout as well as automatic code generation and specialisation of the C++ navigator classes. Such specialisations reduce the overhead of generic- or virtual function based algorithms and enhance the effectiveness of the SIMD vector units.

These novel approaches go well beyond the existing solutions available in Geant4 or TGeo/ROOT, achieve a significantly superior performance, and might be of interest for a wide range of simulation backends (GeantV, Geant4). We exemplify this with concrete benchmarks for the CMS and ALICE detectors.

072033
The following article is Open access

Gas-based detector R&D relies heavily on the full simulation of detectors and their optimization before final prototypes can be built and tested. These simulations, in particular those with complex scenarios such as high detector voltages or gases with larger gains, are computationally intensive and may take several days or weeks to complete. Such long-running simulations usually run on high-performance computers in batch mode. If the results lead to unexpected behavior, the simulation might be rerun with different parameters. However, the simulations (or jobs) may have to wait in a queue until they get a chance to run again, because the supercomputer is a shared resource that maintains a queue of other user programs as well and executes them as time and priorities permit. This can result in inefficient resource utilization and an increase in the turnaround time for the scientific experiment. To overcome this issue, monitoring the behavior of a simulation while it is running (live) is essential. In this work, we employ the computational steering technique by coupling the detector simulations with a visualization package named VisIt to enable the exploration of the live data as it is produced by the simulation.

072034
The following article is Open access

The statistical analysis of infrastructure metrics comes with several specific challenges, including the fairly large volume of unstructured metrics from a large set of independent data sources. Hadoop and Spark provide an ideal environment in particular for the first steps of skimming rapidly through hundreds of TB of low relevance data to find and extract the much smaller data volume that is relevant for statistical analysis and modelling. This presentation will describe the new Hadoop service at CERN and the use of several of its components for high throughput data aggregation and ad-hoc pattern searches. We will describe the hardware setup used, the service structure with a small set of decoupled clusters and the first experience with co-hosting different applications and performing software upgrades. We will further detail the common infrastructure used for data extraction and preparation from continuous monitoring and database input sources.

072035
The following article is Open access

The IT Analysis Working Group (AWG) has been formed at CERN across individual computing units and the experiments to attempt a cross-cutting analysis of computing infrastructure and application metrics. In this presentation we will describe the first results obtained using medium/long-term data (1 month to 1 year) correlating box-level metrics, job-level metrics from LSF and HTCondor, IO metrics from the physics analysis disk pools (EOS) and networking, and application-level metrics from the experiment dashboards. We will cover in particular the measurement of hardware performance and prediction of job duration, the latency sensitivity of different job types and a search for bottlenecks with the production job mix in the current infrastructure. The presentation will conclude with the proposal of a small set of metrics to simplify drawing conclusions also in the more constrained environment of public cloud deployments.

072036
The following article is Open access

In order to test the computing capabilities of GPUs with respect to traditional CPU cores, a high-statistics toy Monte Carlo technique has been implemented in both the ROOT/RooFit and GooFit frameworks, with the purpose of estimating the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B+ → J/ψϕK+. GooFit is an open data analysis tool under development that interfaces ROOT/RooFit to the CUDA platform on nVidia GPUs. The optimized GooFit application running on GPUs hosted by servers in the Bari Tier2 provides a striking speed-up with respect to the RooFit application parallelised on multiple CPUs by means of the PROOF-Lite tool. The considerable resulting speed-up, evident when comparing concurrent GooFit processes allowed by the CUDA Multi Process Service with a RooFit/PROOF-Lite process with multiple CPU workers, is presented and discussed in detail. By means of GooFit it has also been possible to explore the behaviour of a likelihood-ratio test statistic in different situations in which the Wilks theorem may or may not apply because its regularity conditions are not satisfied.

072037
The following article is Open access

Any time you modify an implementation within a program, or change the compiler version or the operating system, you should also do regression testing. You can do regression testing by rerunning existing tests against the changes to determine whether this breaks anything that worked prior to the change, and by writing new tests where necessary. At LHCb we have a huge codebase which is maintained by many people and can be run within different setups. This makes it crucial to guide refactoring with a central profiling system that helps to run tests and find the impact of changes.

In our work we present a software architecture and tools for running a profiling system. This system is responsible for systematically running regression tests and for collecting and comparing their results, so that changes between different setups can be observed and reported. The main feature of our solution is that it is based on a microservices architecture. Microservices break a large project into loosely coupled modules, which communicate with each other through simple APIs. Such a modular architectural style helps us avoid the general pitfalls of monolithic architectures, such as a codebase that is hard to understand and maintain, and ineffective scalability. Our solution also avoids much of the complexity of the microservices deployment process by using software containers and service-management tools. Containers and service managers let us quickly deploy linked modules in development, production or any other environment. Most of the developed modules are generic, which means that the proposed architecture and tools can be used not only in LHCb but also adopted by other experiments and companies.

072038
The following article is Open access

This contribution shares our recent experience of building a Hadoop-based application. The Hadoop ecosystem now offers a myriad of tools which can overwhelm new users, yet there are successful ways these tools can be leveraged to solve problems. We look at factors to consider when using Hadoop to model and store data, best practices for moving data in and out of the system, and common processing patterns, relating each stage to the real-world experience gained while developing such an application. We share many of the design choices and tools developed, and show how to profile a distributed application, which can be applied to other scenarios as well. In conclusion, the goal of the presentation is to provide guidance for architecting Hadoop-based applications and to share some of the reusable components developed in this process.

072039
The following article is Open access

PODIO is a C++ library that supports the automatic creation of event data models (EDMs) and efficient I/O code for HEP experiments. It is developed as a new EDM Toolkit for future particle physics experiments in the context of the AIDA2020 EU programme. Experience from LHC and the linear collider community shows that existing solutions partly suffer from overly complex data models with deep object-hierarchies or unfavorable I/O performance. The PODIO project was created in order to address these problems. PODIO is based on the idea of employing plain-old-data (POD) data structures wherever possible, while avoiding deep object-hierarchies and virtual inheritance. At the same time it provides the necessary high-level interface towards the developer physicist, such as the support for inter-object relations and automatic memory-management, as well as a Python interface. To simplify the creation of efficient data models PODIO employs code generation from a simple yaml-based markup language. In addition, it was developed with concurrency in mind in order to support the use of modern CPU features, for example giving basic support for vectorization techniques.

072040
The following article is Open access

$\bar{{\rm{P}}}$ANDA is a future hadron and nuclear physics experiment at the FAIR facility in construction in Darmstadt, Germany. Unlike the majority of current experiments, $\bar{{\rm{P}}}$ANDA's strategy for data acquisition is based on online event reconstruction from free-streaming data, performed in real time entirely by software algorithms using global detector information. This paper reports on the status of the development of algorithms for the reconstruction of charged particle tracks, targeted towards online data processing applications, designed for execution on data-parallel processors such as GPUs (Graphic Processing Units). Two parallel algorithms for track finding, derived from the Circle Hough algorithm, are being developed to extend the parallelism to all stages of the algorithm. The concepts of the algorithms are described, along with preliminary results and considerations about their implementations and performance.

072041
The following article is Open access

ParaView is a high-performance visualization application not widely used in High Energy Physics (HEP). It is a long-standing open-source project led by Kitware and involves several Department of Energy (DOE) and Department of Defense (DOD) laboratories. Furthermore, it has been adopted by many DOE supercomputing centers and other sites. ParaView is unique in speed and efficiency through its use of state-of-the-art techniques developed by the academic visualization community that are often not found in applications written by the HEP community. In-situ visualization of events, where event details are visualized during processing/analysis, is a common task for experiment software frameworks. Kitware supplies Catalyst, a library that enables scientific software to serve visualization objects to client ParaView viewers, yielding a real-time event display. Connecting ParaView to the Fermilab art framework will be described and the capabilities it brings will be discussed.

072042
The following article is Open access

Over time, computer scientists have been provided with metrics to measure software maintainability. In the existing literature, a large number of references can be found on this topic; nevertheless, a lack of quantitative assessment of maintainability metrics has been observed. In this paper, we summarize the challenges of adopting code measurements in the context of physics software systems. In this pilot study, we have used Geant4 - a twenty-year-old software system - to conduct this research and set the grounds for further discussion.

072043
The following article is Open access

ROOT provides a flexible format used throughout the HEP community. The number of use cases - from an archival data format to end-stage analysis - has required a number of tradeoffs to be exposed to the user. For example, a high "compression level" in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read). At the scale of the LHC experiments, poor design choices can result in terabytes of wasted space or wasted CPU time. We explore and attempt to quantify some of these tradeoffs. Specifically, we explore: the use of alternative compression algorithms to optimize for read performance; an alternative method of compressing individual events to allow efficient random access; and a new approach to whole-file compression. Quantitative results are given, as well as guidance on how to make compression decisions for different use cases.
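The size-versus-decompression-time tradeoff described above can be explored generically with the standard zlib and lzma modules as stand-ins for ROOT's compression algorithms and levels; the payload below is synthetic and the numbers are only illustrative of the kind of measurement involved.

    # Generic size-vs-decompression-time comparison (illustration only).
    import lzma, time, zlib

    payload = b"event-like repetitive payload " * 200000

    for name, compress, decompress in [
        ("zlib level 1", lambda d: zlib.compress(d, 1), zlib.decompress),
        ("zlib level 9", lambda d: zlib.compress(d, 9), zlib.decompress),
        ("lzma",         lzma.compress,                 lzma.decompress),
    ]:
        blob = compress(payload)
        t0 = time.perf_counter()
        decompress(blob)
        dt = time.perf_counter() - t0
        print(f"{name:12s}  size={len(blob):>9d} bytes  decompress={dt * 1000:.1f} ms")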

072044
The following article is Open access

Exascale computing resources are roughly a decade away and will be capable of 100 times more computing than current supercomputers. In the last year, Energy Frontier experiments crossed a milestone of 100 million core-hours used at the Argonne Leadership Computing Facility, the Oak Ridge Leadership Computing Facility, and NERSC. The Fortran-based leading-order parton generator called Alpgen was successfully scaled to millions of threads to achieve this level of usage on Mira. Sherpa and MadGraph are next-to-leading-order generators used heavily by LHC experiments for simulation. Integration times for high-multiplicity or rare processes can take a week or more on standard Grid machines, even using all 16 cores. We will describe our ongoing work to scale the Sherpa generator to thousands of threads on leadership-class machines and reduce run times to less than a day. This work allows the experiments to leverage large-scale parallel supercomputers for event generation today, freeing tens of millions of grid hours for other work, and paving the way for future applications (simulation, reconstruction) on these and future supercomputers.

072045
The following article is Open access

The Visual Physics Analysis (VISPA) project defines a toolbox for accessing software via the web. It is based on the latest web technologies and provides a powerful extension mechanism that enables a wide range of applications to be interfaced. Beyond basic applications such as a code editor, a file browser, or a terminal, it meets the demands of sophisticated experiment-specific use cases that focus on physics data analyses and typically require a high degree of interactivity. As an example, we developed a data inspector that is capable of browsing interactively through the event content of several data formats, e.g., MiniAOD which is utilized by the CMS collaboration. The VISPA extension mechanism can also be used to embed external web-based applications that benefit from dynamic allocation of user-defined computing resources via SSH. For example, by wrapping the JSROOT project, ROOT files located on any remote machine can be inspected directly through a VISPA server instance. We introduced domains that combine groups of users and role-based permissions. Thereby, tailored projects are enabled, e.g. for teaching, where access to students' homework is restricted to a team of tutors, or for experiment-specific data that may only be accessible to members of the collaboration. We present the extension mechanism including corresponding applications and give an outlook onto the new permission system.

072046
The following article is Open access

ROOT is a software framework for large-scale data analysis that provides basic and advanced statistical methods used by high-energy physics experiments. It includes machine learning tools from the ROOT-integrated Toolkit for Multivariate Analysis (TMVA). We present several recent developments in TMVA, including a new modular design, new algorithms for pre-processing, cross-validation, hyperparameter-tuning, deep-learning and interfaces to other machine-learning software packages. TMVA is additionally integrated with Jupyter, making it accessible with a browser.

072047
The following article is Open access

In high-energy particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual workflows manually which is time-consuming and often leads to undocumented relations between particular workloads. We present a generic analysis design pattern that copes with the sophisticated demands of end-to-end HEP analyses and provides a make-like execution system. It is based on the open-source pipelining package Luigi which was developed at Spotify and enables the definition of arbitrary workloads, so-called Tasks, and the dependencies between them in a lightweight and scalable structure. Further features are multi-user support, automated dependency resolution and error handling, central scheduling, and status visualization in the web. In addition to already built-in features for remote jobs and file systems like Hadoop and HDFS, we added support for WLCG infrastructure such as LSF and CREAM job submission, as well as remote file access through the Grid File Access Library. Furthermore, we implemented automated resubmission functionality, software sandboxing, and a command line interface with auto-completion for a convenient working environment. For the implementation of a t$\overline{{{t}}}$H cross section measurement, we created a generic Python interface that provides programmatic access to all external information such as datasets, physics processes, statistical models, and additional files and values. In summary, the setup enables the execution of the entire analysis in a parallelized and distributed fashion with a single command.
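Luigi, the pipelining package named above, expresses such workloads as Tasks with explicit dependencies. The following minimal sketch shows the pattern with two dependent analysis steps; the file names, parameters and task logic are invented for illustration and are not part of the analysis described.

    # Minimal Luigi sketch of two dependent analysis tasks (paths and logic invented).
    import luigi

    class Selection(luigi.Task):
        channel = luigi.Parameter()

        def output(self):
            return luigi.LocalTarget(f"selected_{self.channel}.txt")

        def run(self):
            with self.output().open("w") as f:
                f.write(f"selected events for {self.channel}\n")

    class Histogram(luigi.Task):
        channel = luigi.Parameter()

        def requires(self):
            return Selection(channel=self.channel)

        def output(self):
            return luigi.LocalTarget(f"hist_{self.channel}.txt")

        def run(self):
            with self.input().open() as fin, self.output().open("w") as fout:
                fout.write("histogram built from: " + fin.read())

    if __name__ == "__main__":
        luigi.build([Histogram(channel="mu")], local_scheduler=True)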

072048
The following article is Open access

A system has been developed to provide flexible, efficient and robust processing of radiotherapy planning and treatment data collected in the VoxTox project, which investigates differences between planned and delivered dose, and dose-toxicity correlations. This paper outlines the system requirements and implementation, highlighting the use made of software tools and computing models developed for experiments at the Large Hadron Collider. Experience with VoxTox data processing is summarised.

072049
The following article is Open access

LHC Run3 and Run4 represent an unprecedented challenge for HEP computing in terms of both data volume and complexity. New approaches are needed for how data is collected and filtered, processed, moved, stored and analysed if these challenges are to be met with a realistic budget. To develop innovative techniques we are fostering relationships with industry leaders. CERN openlab is a unique resource for public-private partnership between CERN and leading Information and Communication Technology (ICT) companies. Its mission is to accelerate the development of cutting-edge solutions to be used by the worldwide HEP community. In 2015, CERN openlab started its phase V with a strong focus on tackling the upcoming LHC challenges. Several R&D programs are ongoing in the areas of data acquisition, networks and connectivity, data storage architectures, computing provisioning, computing platforms and code optimisation, and data analytics. This paper gives an overview of the various innovative technologies that are currently being explored by CERN openlab V and discusses the long-term strategies that are pursued by the LHC communities with the help of industry in closing the technological gap in processing and storage needs expected in Run3 and Run4.

072050
The following article is Open access

The Daya Bay experiment uses reactor antineutrino disappearance to measure the θ13 neutrino oscillation parameter. In this proceeding, the convolutional autoencoder machine learning technique is tested against a well-understood uncorrelated accidental background. The eventual goal for this technique is to reduce the background with the largest contribution to the rate uncertainty in the antineutrino data set, β-n decay of 9Li produced by cosmic-ray muons.
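For orientation, a convolutional autoencoder of the general kind mentioned above can be sketched in Keras as follows; the 32x32 single-channel "detector image" shape and the layer choices are invented for the example and this is not the actual Daya Bay network.

    # Minimal convolutional autoencoder sketch (assumed input shape; illustration only).
    from tensorflow.keras import layers, models

    inp = layers.Input(shape=(32, 32, 1))
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
    encoded = layers.MaxPooling2D(2)(x)                     # 8x8x8 bottleneck

    x = layers.Conv2D(8, 3, activation="relu", padding="same")(encoded)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    decoded = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

    autoencoder = models.Model(inp, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.summary()
    # Training would use autoencoder.fit(images, images, ...); the reconstruction error
    # can then serve as a discriminant against an uncorrelated background.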

072051
The following article is Open access

Modern science clearly demands a higher level of reproducibility and collaboration. To make research fully reproducible one has to take care of several aspects: research protocol description, data access, environment preservation, workflow pipeline, and analysis script preservation. Version control systems like git help with the workflow and analysis script parts. Virtualization techniques like Docker or Vagrant help deal with environments. Jupyter notebooks are a powerful platform for conducting research in a collaborative manner. We present the Everware project, which seamlessly integrates git repository management systems such as Github or Gitlab, Docker and Jupyter, a) helping to share the results of real research and b) boosting education activities. With the help of Everware one can share not only the final artifacts of research but all the depth of the research process. This has been shown to be extremely helpful in the organization of several data analysis hackathons and machine learning schools. Using Everware, participants could start from an existing solution instead of starting from scratch, and could begin contributing immediately. Everware allows its users to make use of their own computational resources to run the workflows they are interested in, which leads to higher scalability of the toolkit.

072052
The following article is Open access

We investigate the problem of line detection in digital image processing, in particular how state-of-the-art algorithms behave in the presence of noise and whether CPU efficiency can be improved by the combination of Monte Carlo Tree Search, hierarchical space decomposition, and parallel computing.

The starting point of the investigation is the method introduced in 1962 by Paul Hough for detecting lines in binary images. Extended in the 1970s to the detection of space forms, what came to be known as the Hough Transform (HT) has been proposed, for example, in the context of track fitting in the LHC ATLAS and CMS projects. The Hough Transform turns the problem of line detection into one of finding the peak of a vote-counting process over cells which contain the possible parameters of candidate lines. The detection algorithm can be computationally expensive, both in the demands made upon the processor and on memory. Additionally, its detection effectiveness can be reduced in the presence of noise.

Our first contribution is an evaluation of the use of a variation of the Radon Transform as a way of improving the effectiveness of line detection in the presence of noise. Then, parallel algorithms for variations of the Hough Transform and the Radon Transform for line detection are introduced. An algorithm for Parallel Monte Carlo Search applied to line detection is also introduced. Their algorithmic complexities are discussed. Finally, implementations on multi-GPU and multicore architectures are discussed.
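The voting step of the classical Hough Transform can be sketched as follows in a serial NumPy form (a generic illustration of the accumulator idea, not the parallel implementations described above); the accumulator granularity and the toy point set are arbitrary.

    # Straightforward Hough-transform accumulator for line detection in a binary image.
    import numpy as np

    def hough_lines(points, img_size, n_theta=180, n_rho=200):
        thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
        max_rho = np.hypot(*img_size)
        rhos = np.linspace(-max_rho, max_rho, n_rho)
        accumulator = np.zeros((n_rho, n_theta), dtype=np.int32)
        cos_t, sin_t = np.cos(thetas), np.sin(thetas)
        for x, y in points:                       # each hit votes for all lines through it
            rho = x * cos_t + y * sin_t
            rho_idx = np.digitize(rho, rhos) - 1
            accumulator[rho_idx, np.arange(n_theta)] += 1
        peak = np.unravel_index(accumulator.argmax(), accumulator.shape)
        return rhos[peak[0]], thetas[peak[1]], accumulator

    pts = [(i, 2 * i + 5) for i in range(50)]     # noiseless points on y = 2x + 5
    rho, theta, _ = hough_lines(pts, img_size=(128, 256))
    print("peak at rho=%.1f, theta=%.2f rad" % (rho, theta))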

072053
The following article is Open access

In 2016 the NOvA experiment released results for the observation of oscillations in the νμ and νe channels as well as νe cross-section measurements using neutrinos from Fermilab's NuMI beam. These and other measurements in progress rely on the accurate identification and reconstruction of the neutrino flavor and energy recorded by our detectors. This presentation describes the first application of convolutional neural network technology for event identification and reconstruction in particle detectors like NOvA. The Convolutional Visual Network (CVN) algorithm was developed for the identification, categorization, and reconstruction of NOvA events. It increased the selection efficiency of the νe appearance signal by 40%, and studies show a potential impact on the νμ disappearance analysis.