Table of contents

Volume 119

2008


DISTRIBUTED DATA ANALYSIS AND INFORMATION MANAGEMENT

Accepted papers received: 23 June 2008
Published online: 31 July 2008

PAPERS (all articles open access)

072001

The CMS Dataset Bookkeeping Service (DBS) has been developed to catalog all CMS event data from Monte Carlo and detector sources. It provides the ability to identify MC or trigger source, track data provenance, construct datasets for analysis, and discover interesting data. CMS requires processing and analysis activities at various service levels and the DBS system provides support for localized processing or private analysis, as well as global access for CMS users at large. Catalog entries can be moved among the various service levels with a simple set of migration tools, thus forming a loose federation of databases. DBS is available to CMS users via a Python API, a command-line interface, and a Discovery web page. The system is built as a multi-tier web application with Java servlets running under Tomcat, with connections via JDBC to Oracle or MySQL database backends. Clients connect to the service through HTTP or HTTPS, with authentication provided by Grid certificates and authorization through VOMS. DBS is an integral part of the overall CMS Data Management and Workflow Management systems.
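
As a rough illustration of the Python access mode mentioned above, the sketch below queries a catalogue service over HTTP. It is a minimal sketch only: the endpoint URL and query parameter names are invented for illustration and are not the actual DBS API.

    # Minimal sketch of a DBS-style dataset lookup over HTTP.
    # The endpoint URL and parameter names are hypothetical.
    import urllib.parse
    import urllib.request

    def find_datasets(base_url, primary_dataset, tier):
        """Ask a catalogue service for datasets matching simple criteria."""
        query = urllib.parse.urlencode({
            "primary_dataset": primary_dataset,   # e.g. a physics sample name
            "data_tier": tier,                    # e.g. "RECO" or "AOD"
        })
        with urllib.request.urlopen(f"{base_url}/datasets?{query}") as resp:
            return resp.read().decode()

    # Example call (hypothetical service location):
    # print(find_datasets("https://dbs.example.org/api", "TTbar", "RECO"))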

072002

The present paper highlights the approach used to design and implement a web-services based BaBar Monte Carlo (MC) production grid using Globus Toolkit version 4. The grid integrates the resources of two clusters at the University of Victoria, using the ClassAd mechanism provided by the Condor-G metascheduler. Each cluster uses the Portable Batch System (PBS) as its local resource management system (LRMS). Resource brokering is provided by the Condor matchmaking process, whereby the job and resource attributes are expressed as ClassAds. The important features of the grid are automatic registration of resource ClassAds to the central registry, extraction of ClassAds from the registry to the metascheduler for matchmaking, and the incorporation of input/output file staging. Web-based monitoring is employed to track the status of grid resources and jobs for efficient operation of the grid. The performance of this new grid for BaBar jobs is found to be consistent with that of the existing Canadian computational grid (GridX1), which is based on Globus Toolkit version 2.
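
To make the matchmaking step concrete, here is a toy Python sketch of ClassAd-style matching: a job advertises requirements, resources advertise attributes, and the matchmaker pairs them when the requirement expression is satisfied. The attribute names and values are illustrative only, not the actual BaBar or GridX1 ClassAds.

    # Toy illustration of ClassAd-style matchmaking (attribute names invented).
    job_ad = {
        "Owner": "babar_mc",
        "Requirements": lambda res: res["Memory"] >= 1024 and res["LRMS"] == "PBS",
    }

    resource_ads = [
        {"Name": "cluster-a", "Memory": 2048, "LRMS": "PBS"},
        {"Name": "cluster-b", "Memory": 512,  "LRMS": "PBS"},
    ]

    # The matchmaker keeps only resources whose ad satisfies the job's requirements.
    matches = [r["Name"] for r in resource_ads if job_ad["Requirements"](r)]
    print(matches)  # ['cluster-a']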

072003

AMI was chosen as the ATLAS dataset selection interface in July 2006. It is the main interface for searching for ATLAS data using physics metadata criteria. AMI has been implemented as a generic database management framework which allows parallel searching over many catalogues, which may have differing schemas. The main features of the web interface will be described, in particular the powerful graphic query builder. The use of XML/XSLT technology ensures that all commands can be used either on the web or from a command-line interface via a web service. We also describe the overall architecture of ATLAS metadata, the different actors and granularities involved, and the place of AMI within this architecture. We discuss the problems involved in the correlation of metadata of differing granularity, and propose a solution for information mediation.

072004

The concepts, design and evaluation of the Data Intensive and Network Aware (DIANA) meta-scheduling approach for solving the challenges of data analysis faced by CERN experiments are discussed in this paper. Our results suggest that data analysis can be made robust by employing the fault-tolerant and decentralized meta-scheduling algorithms supported in our DIANA meta-scheduler. The DIANA meta-scheduler supports data-intensive bulk scheduling, is network aware and follows a policy-centric approach to meta-scheduling. In this paper, we demonstrate that a decentralized and dynamic meta-scheduling approach is an effective strategy to cope with increasing numbers of users, jobs and datasets. We present 'quality of service' related statistics for physics analysis through the application of a policy-centric fair-share scheduling model. The DIANA meta-schedulers create a peer-to-peer hierarchy of schedulers to accomplish resource management that changes with evolving loads and is dynamic and adapts to the volatile nature of the resources.
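
The following short sketch illustrates the kind of fair-share priority calculation a policy-centric meta-scheduler might apply; the formula and the idle-group boost are assumptions made for illustration, not the actual DIANA policy.

    # Illustrative fair-share priority: boost under-served groups, penalise
    # groups running above their quota. The formula is an assumption.
    def fair_share_priority(quota_fraction, recent_usage_fraction, base_priority=1.0):
        if recent_usage_fraction == 0:
            return base_priority * 2.0        # idle group gets a head start
        return base_priority * (quota_fraction / recent_usage_fraction)

    # A group entitled to 30% of the resources but recently using 60%
    # ends up with half the base priority:
    print(fair_share_priority(0.30, 0.60))    # 0.5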

072005

Cyber security requirements for secure access to computing facilities often call for access controls via gatekeepers and the use of two-factor authentication. Using SSH keys to satisfy the two factor authentication requirement has introduced a potentially challenging task of managing the keys and their associations with individual users and user accounts. Approaches for a facility with the simple model of one remote user corresponding to one local user would not work at facilities that require a many-to-many mapping between users and accounts on multiple systems. We will present an SSH key management system we developed, tested and deployed to address the many-to-many dilemma in the environment of the STAR experiment. We will explain its use in an online computing context and explain how it makes possible the management and tracing of group account access spread over many sub-system components (data acquisition, slow controls, trigger, detector instrumentation, etc.) without the use of shared passwords for remote logins.
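
A minimal sketch of the many-to-many bookkeeping such a key management system has to maintain is shown below: one person may hold several keys and be granted access to several group accounts on several hosts, and revoking the person removes all of those grants at once. All names and fingerprints are invented.

    # Sketch of the many-to-many association between people, SSH keys and
    # group accounts on multiple hosts (all identifiers are invented).
    from collections import defaultdict

    # (person, key fingerprint) -> set of (host, local account)
    grants = defaultdict(set)

    def grant(person, fingerprint, host, account):
        grants[(person, fingerprint)].add((host, account))

    def revoke_person(person):
        """Remove every authorised key belonging to one person."""
        for key in [k for k in grants if k[0] == person]:
            del grants[key]

    grant("alice", "SHA256:abcd...", "daq01", "operator")
    grant("alice", "SHA256:abcd...", "trg01", "trigger")
    revoke_person("alice")   # all of alice's access disappears in one step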

072006

Petascale systems are in existence today and will become common in the next few years. Such systems are inevitably very complex, highly distributed and heterogeneous. Monitoring a petascale system in real time and understanding its status at any given moment without impacting its performance is a highly intricate task. Common approaches and off-the-shelf tools are either unusable, do not scale, or severely impact the performance of the monitored servers. This paper describes unobtrusive monitoring software developed at the Stanford Linear Accelerator Center (SLAC) for a highly distributed petascale production data set. The paper describes the employed solutions, the lessons learned and the problems still to be addressed, and explains how the system can be reused elsewhere.

072007

The CMS experiment at the LHC has established an infrastructure using the FroNTier framework to deliver conditions (i.e. calibration, alignment, etc.) data to processing clients worldwide. FroNTier is a simple web service approach providing client HTTP access to a central database service. The system for CMS has been developed to work with POOL which provides object relational mapping between the C++ clients and various database technologies. Because of the read only nature of the data, Squid proxy caching servers are maintained near clients and these caches provide high performance data access. Several features have been developed to make the system meet the needs of CMS including careful attention to cache coherency with the central database, and low latency loading required for the operation of the online High Level Trigger. The ease of deployment, stability of operation, and high performance make the FroNTier approach well suited to the GRID environment being used for CMS offline, as well as for the online environment used by the CMS High Level Trigger. The use of standard software, such as Squid and various monitoring tools, makes the system reliable, highly configurable and easily maintained. We describe the architecture, software, deployment, performance, monitoring and overall operational experience for the system.
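
The access pattern described above can be sketched in a few lines: the client asks a plain HTTP service for a conditions payload and lets a nearby Squid proxy cache the read-only response. The proxy host and service URL below are placeholders, not the real CMS deployment.

    # Minimal sketch of FroNTier-style access through a caching proxy.
    # Proxy host and service URL are hypothetical.
    import urllib.request

    proxy = urllib.request.ProxyHandler({"http": "http://squid.example.site:3128"})
    opener = urllib.request.build_opener(proxy)

    def fetch_conditions(query):
        url = "http://frontier.example.org/Frontier?" + query
        with opener.open(url) as resp:
            return resp.read()      # repeated identical queries hit the Squid cache

    # payload = fetch_conditions("type=alignment&run=42")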

072008

A precise alignment of the Muon System is one of the requirements for CMS to reach its expected performance and cover its physics program. A first prototype of the software and computing tools to achieve this goal was successfully tested during CSA06, the Computing, Software and Analysis challenge of 2006. Data were exported from the Tier-0 to Tier-1 and Tier-2 centres, where the alignment software was run. Re-reconstruction with new geometry files was also performed at remote sites. The performance and validation of the software have also been tested on cosmic data taken during the MTCC in 2006.

072009

The highly demanding computing needs of the LHCb experiment are fulfilled by extensive use of Grid resources. Although these resources are large and growing, they remain finite. This paper addresses how all LHCb users can fairly access these resources and have their tasks executed in an order determined by identity, group, job type and accounting information.

072010

The LHCb Conditions Database project provides the necessary tools to handle non-event time-varying data. The main users of conditions are reconstruction and analysis processes, which are running on the Grid. To allow efficient access to the data, we need to use a synchronized replica of the content of the database located at the same site as the event data file, i.e. the LHCb Tier1. The replica to be accessed is selected from information stored on LFC (LCG File Catalog) and managed with the interface provided by the LCG developed library CORAL. The plan to limit the submission of jobs to those sites where the required conditions are available will also be presented.

LHCb applications have been using the Conditions Database framework in production since March 2007. We have been able to collect statistics on the performance and effectiveness of both the LCG library COOL (the library providing the conditions-handling functionality) and the distribution framework itself. Stress tests on the replica of the Conditions Database hosted at CNAF have been performed and the results will be summarized here.
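
The replica-selection step described above can be pictured with a toy lookup: given the site holding the event data, pick the conditions-database replica registered for the same site. The replica table here is invented; in the real system this information is stored in the LFC and accessed through CORAL.

    # Toy illustration of co-located replica selection (connection strings invented).
    replicas = {
        "CERN":  "oracle://cern-conddb/LHCB",
        "CNAF":  "oracle://cnaf-conddb/LHCB",
        "IN2P3": "oracle://in2p3-conddb/LHCB",
    }

    def conditions_replica_for(data_site):
        """Return the replica co-located with the event data, else fall back to CERN."""
        return replicas.get(data_site, replicas["CERN"])

    print(conditions_replica_for("CNAF"))   # oracle://cnaf-conddb/LHCB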

072011

A Data Skimming Service (DSS) is a site-level service for rapid event filtering and selection from locally resident datasets based on metadata queries to associated 'tag' databases. In US ATLAS, we expect most if not all of the AOD-based datasets to be replicated to each of the five Tier 2 regional facilities in the US Tier 1 'cloud' coordinated by Brookhaven National Laboratory. Entire datasets will consist of on the order of several terabytes of data, and providing easy, quick access to skimmed subsets of these data will be vital to physics working groups. Typically, physicists will be interested in portions of the complete datasets, selected according to event-level attributes (number of jets, missing Et, etc) and content (specific analysis objects for subsequent processing). In this paper we describe methods used to classify data (metadata tag generation) and to store these results in a local database. Next we discuss a general framework which includes methods for accessing this information, defining skims, specifying event output content, accessing locally available storage through a variety of interfaces (SRM, dCache/dccp, gridftp), accessing remote storage elements as specified, and user job submission tools through local or grid schedulers. The advantages of the DSS are the ability to quickly 'browse' datasets and design skims, for example, pre-adjusting cuts to get to a desired skim level with minimal use of compute resources, and to encode these analysis operations in a database for re-analysis and archival purposes. Additionally the framework has provisions to operate autonomously in the event that external, central resources are not available, and to provide, as a reduced package, a minimal skimming service tailored to the needs of small Tier 3 centres or individual users.
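
A toy version of the tag-query step can make the idea concrete: event-level attributes live in a small relational table, and a skim is simply a query returning pointers to the events to extract. The column names and cut values are illustrative only.

    # Toy tag-database skim using an in-memory SQLite table (columns invented).
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tags (event_id INTEGER, n_jets INTEGER, missing_et REAL)")
    db.executemany("INSERT INTO tags VALUES (?, ?, ?)",
                   [(1, 4, 35.0), (2, 1, 80.5), (3, 5, 120.0)])

    # 'Skim' definition: at least 4 jets and missing Et above 100 GeV.
    selected = db.execute(
        "SELECT event_id FROM tags WHERE n_jets >= 4 AND missing_et > 100").fetchall()
    print(selected)   # [(3,)]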

072012

The ATLAS TAG Database is a multi-terabyte event-level metadata selection system, intended to allow discovery, selection of and navigation to events of interest to an analysis. The TAG Database encompasses file- and relational-database-resident event-level metadata, distributed across all ATLAS Tiers. A global TAG relational database containing all ATLAS events, hosted in Oracle, will exist at Tier 0. Implementing a system that is both performant and manageable at this scale is a challenge. A 1 TB relational TAG Database has been deployed at Tier 0 using simulated tag data. The database contains one billion events, each described by two hundred event metadata attributes, and is currently undergoing extensive testing in terms of queries, population and manageability. These 1 TB tests aim to demonstrate and optimise the performance and scalability of an Oracle TAG Database on a global scale. Partitioning and indexing strategies are crucial for well-performing queries and for the manageability of the database, and have implications for database population and distribution, so these are investigated. Physics query patterns are anticipated, but a crucial feature of the system must be to support a broad range of queries across all attributes. Concurrently, event tags from ATLAS Computing System Commissioning distributed simulations are accumulated in an Oracle-hosted database at CERN, providing an event-level selection service valuable for user experience and for gathering information about physics query patterns. In this paper we describe the status of the global TAG relational database scalability work and highlight areas of future direction.

072013

The CMS Dataset Bookkeeping System (DBS) search page is a web-based application used by physicists and production managers to find data from the CMS experiment. The main challenge in the design of the system was to map the complex, distributed data model embodied in the DBS and the Data Location Service (DLS) to a simple, intuitive interface consistent with the mental model of physicists analyzing the data. We used focus groups and user interviews to establish the required features. The resulting interface addresses the physicist and production manager roles separately, offering both a guided search structured for the common physics use cases as well as a dynamic advanced query interface.

072014

Distributed data analysis using Grid resources is one of the fundamental applications in high energy physics to be addressed and realized before the start of LHC data taking. The need to facilitate access to these resources is very high. In each experiment, up to a thousand physicists will be submitting analysis jobs to the Grid. Appropriate user interfaces and helper applications have to be made available to ensure that all users can use the Grid without too much expertise in Grid technology. These tools enlarge the number of Grid users from a few production administrators to potentially all participating physicists.

The GANGA job management system (http://cern.ch/ganga), developed as a common project between the ATLAS and LHCb experiments, provides and integrates these kinds of tools. GANGA provides a simple and consistent way of preparing, organizing and executing analysis tasks within the experiment analysis framework, implemented through a plug-in system. It allows trivial switching between running test jobs on a local batch system and running large-scale analyses on the Grid, hiding Grid technicalities.

We will be reporting on the plug-ins and our experiences of distributed data analysis using GANGA within the ATLAS experiment and the EGEE/LCG infrastructure. The integration of the ATLAS data management system DQ2 into GANGA is a key piece of functionality. In combination with the job-splitting mechanism, large numbers of jobs can be sent to the locations of the data, following the ATLAS computing model. GANGA supports user analysis tasks with reconstructed data and small-scale production of Monte Carlo data.
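
The switching between a local test run and a Grid run can be sketched with the basic Ganga job object; the sketch below assumes an interactive Ganga session (where Job, Executable, Local and LCG are available) and uses a placeholder executable name.

    # Hedged sketch of Ganga backend switching, to be run inside a Ganga session.
    j = Job()
    j.application = Executable(exe='myAnalysis.sh')  # placeholder analysis script
    j.backend = Local()                              # quick test on the local machine
    j.submit()

    # The same job description, re-targeted at the Grid:
    j2 = j.copy()
    j2.backend = LCG()                               # large-scale run on EGEE/LCG
    j2.submit()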

072015

The Tracker detector took data with cosmic rays at the Tracker Integration Facility (TIF) at CERN. First online monitoring tasks were executed at the Tracker Analysis Centre (TAC), a dedicated control room at the TIF with limited computing resources. A set of software agents were developed to perform the real-time data conversion into a standard format, to archive data on tape at CERN and to publish them in the official CMS data bookkeeping systems. According to the CMS computing and analysis model, most of the subsequent data processing has to be done at remote Tier-1 and Tier-2 sites, so data were automatically transferred from CERN to the sites interested in analyzing them, currently Fermilab, Bari and Pisa. Official reconstruction in the distributed environment was triggered in real time by using the tool currently used for the processing of simulated events. Automatic end-user analysis of the data was performed in a distributed environment, in order to derive the distributions of important physics variables. The tracker data processing is currently migrating to the CERN Tier-0 as a prototype for the global data-taking chain. Tracker data were also registered in the most recent version of the data bookkeeping system, DBS-2, profiting from the new features for handling real data. A description of the dataflow/workflow and of the tools developed is given, together with results on the performance of the real-time chain. Almost 7.2 million events were officially registered, moved, reconstructed and analyzed at remote sites using the distributed environment.

072016

High Energy Physics data processing and analysis applications typically deal with the problem of accessing and processing data at high speed. Recent studies, development and test work have shown that the latencies due to data access can often be hidden by parallelizing them with the data processing, thus giving the ability to have applications which process remote data with a high level of efficiency.

Techniques and algorithms able to reach this result have been implemented in the client side of the Scalla/xrootd system, and in this contribution we describe the results of some tests done in order to compare their performance and characteristics. These techniques, if used together with multiple-stream data access, can also be effective in dealing efficiently and transparently with data repositories accessible via a Wide Area Network.
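
The latency-hiding idea can be sketched generically: while one block of data is being processed, the next blocks are already being fetched, so network round trips overlap with computation. This is a schematic illustration, not the actual Scalla/xrootd client code.

    # Generic read-ahead sketch: a reader thread prefetches blocks into a
    # bounded queue while the consumer processes them.
    import queue, threading

    def _reader(fetch_block, n_blocks, buf):
        for i in range(n_blocks):
            buf.put(fetch_block(i))       # remote reads run ahead of the consumer
        buf.put(None)                     # sentinel: no more blocks

    def process_all(fetch_block, process, n_blocks, depth=4):
        buf = queue.Queue(maxsize=depth)  # bounded read-ahead window
        threading.Thread(target=_reader, args=(fetch_block, n_blocks, buf),
                         daemon=True).start()
        while (block := buf.get()) is not None:
            process(block)                # CPU work overlaps with pending fetches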

072017

ALICE (A Large Ion Collider Experiment) at the LHC plans to use a PROOF cluster at CERN (CAF - CERN Analysis Facility) for analysis. The system is especially aimed at the prototyping phase of analyses that need a high number of development iterations and thus require a short response time. Typical examples are the tuning of cuts during the development of an analysis, as well as calibration and alignment. Furthermore, the use of an interactive system with very fast response will allow ALICE to extract physics observables out of first data quickly. An additional use case is fast event simulation and reconstruction. A test setup consisting of 40 machines has been used for evaluation since May 2006. The PROOF system enables parallel processing, and xrootd provides access to files distributed on the test cluster. An automatic staging system for files either catalogued in the ALICE file catalog or stored in the CASTOR mass storage system has been developed. The current setup and ongoing development towards disk quotas and CPU fair share are described. Furthermore, the integration of PROOF into ALICE's software framework (AliRoot) is discussed.

072018

Modern Macintosh computers feature Xgrid, a distributed computing architecture built directly into Apple's OS X operating system. While the approach is radically different from those generally expected by the Unix based Grid infrastructures (Open Science Grid, TeraGrid, EGEE), opportunistic computing on Xgrid is nonetheless a tempting and novel way to assemble a computing cluster with a minimum of additional configuration. In fact, it requires only the default operating system and authentication to a central controller from each node. OS X also implements arbitrarily extensible metadata, allowing an instantly updated file catalog to be stored as part of the filesystem itself. The low barrier to entry allows an Xgrid cluster to grow quickly and organically. This paper and presentation will detail the steps that can be taken to make such a cluster a viable resource for HENP research computing. We will further show how to provide users with a unified job submission framework by integrating Xgrid through the STAR Unified Meta-Scheduler (SUMS), making task and job submission effortlessly within reach for those users already using the tool for traditional Grid or local cluster job submission. We will discuss additional steps that can be taken to make an Xgrid cluster a full partner in grid computing initiatives, focusing on Open Science Grid integration. MIT's Xgrid system currently supports the work of multiple research groups in the Laboratory for Nuclear Science, and has become an important tool for generating simulations and conducting data analyses at the Massachusetts Institute of Technology.

072019

Facing the reality of storage economics, NP experiments such as RHIC/STAR have been engaged in a shift of their analysis model, and now rely heavily on cheap disks attached to processing nodes, as such a model is far more cost-effective than expensive centralized storage. Additionally, exploiting storage aggregates with enhanced distributed computing capabilities such as dynamic space allocation (lifetime of spaces), file management on shared storage (lifetime of files, file pinning), storage policies or uniform access to heterogeneous storage solutions is not an easy task.

The Xrootd/Scalla system allows for storage aggregation. We will present an overview of the largest deployment of Scalla (Structured Cluster Architecture for Low Latency Access) in the world spanning over 1000 CPUs co-sharing the 350 TB Storage Elements and the experience on how to make such a model work in the RHIC/STAR standard analysis framework. We will explain the key features and approach on how to make access to mass storage (HPSS) possible in such a large deployment context.

Furthermore, we will give an overview of a fully 'gridified' solution using the plug-and-play features of the Scalla architecture, replacing standard storage access with grid middleware SRM (Storage Resource Manager) components designed for space management, and will compare this solution with the standard Scalla approach in use in STAR for the past two years. Integration details, future plans and the status of development will be explained in the areas of best transfer strategy between multiple-choice data pools and best placement with respect to load balancing and interoperability with other SRM-aware tools or implementations.

072020

The ATLAS Computing Model was constructed after early tests and was captured in the ATLAS Computing TDR in June 2005. Since then, the grid tools and services have evolved and their performance is starting to be understood through large-scale exercises. As real data taking becomes imminent, the computing model continues to evolve, with robustness and reliability being the watchwords for the early deployment. Particular areas of active development are data placement and data access, and the interaction between the TAGs, the datasets and the Distributed Data Management issues. The earlier high-level policies and models are now being refined into lower-level instantiations.

072021

Ganga, the job-management system (http://cern.ch/ganga) developed as an ATLAS-LHCb common project, offers a simple, efficient and consistent user experience in a variety of heterogeneous environments: from local clusters to global Grid systems. Ganga helps end-users to organise their analysis activities on the Grid by providing automatic persistency of job metadata. A user has full access to the job history, including job configuration and input/output. It is, however, important that users see a single environment for developing and testing algorithms locally and for running on large data samples on the Grid. The tool allows for basic monitoring, and a steadily increasing user base of more than 300 users has been confirmed, in HEP as well as in non-HEP applications. The paper will introduce the Ganga philosophy, the Ganga architecture, and current and future strategy.

072022

Using the gLitePROOF package it is possible to perform PROOF-based distributed data analysis on the gLite Grid. The LHC experiments have managed to run globally distributed Monte Carlo productions on the Grid; now the development of tools for data analysis is in the foreground. To grant access, interfaces must be provided. The ROOT/PROOF framework is used as a starting point. Using abstract ROOT classes (TGrid, ...), interfaces can be implemented via which Grid access from ROOT can be accomplished. A concrete implementation exists for the ALICE Grid environment AliEn. Within the D-Grid project, an interface to gLite, the common Grid middleware of all LHC experiments, has been created. It is therefore possible to query Grid file catalogues from ROOT for the location of the data to be analysed. Grid jobs can be submitted to a gLite-based Grid, their status can be queried, and their results can be retrieved.
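
The abstract-interface pattern can be illustrated from PyROOT: TGrid::Connect() dispatches to a concrete plug-in selected by the protocol in the connection string, "alien://" being the existing ALICE implementation; the gLite plug-in described above would be selected analogously (its connection string is not shown here). The catalogue path and file pattern are placeholders.

    # Sketch of Grid access through ROOT's abstract TGrid interface (PyROOT).
    import ROOT

    grid = ROOT.TGrid.Connect("alien://")             # concrete plug-in chosen by protocol
    if grid:
        result = grid.Query("/alice/sim", "*.root")   # file-catalogue query
        print(result.GetEntries())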

072023

The worldwide computing grid is essential to the LHC experiments in analysing the data collected by the detectors. Within LHCb, the computing model aims to simulate data at Tier-2 grid sites as well as on non-grid resources. The reconstruction, stripping and analysis of the produced LHCb data will primarily take place at the Tier-1 centres. The computing data challenge DC06 started in May 2006, with the primary aims being to exercise the LHCb computing model and to produce events which will be used for analyses in the forthcoming LHCb physics book. This paper gives an overview of the LHCb computing model and addresses the challenges and experiences during DC06. The management of the production of Monte Carlo data on the LCG was done using the DIRAC workload management system, which in turn uses the WLCG infrastructure and middleware. We shall report on the amount of data simulated during DC06, including the performance of the sites used. The paper will also summarise the experience gained during DC06, in particular the distribution of data to the Tier-1 sites and the access to these data.

072024

In Germany, several university institutes and research centres take part in the CMS experiment. For data analysis, a number of computing centres at different Tier levels, ranging from Tier 1 to Tier 3, exist at these sites. The German Tier 1 centre GridKa at the research centre in Karlsruhe serves all four LHC experiments as well as four non-LHC experiments. With respect to the CMS experiment, GridKa is mainly involved in central tasks. The Tier 2 centre in Germany consists of two sites, one at the research centre DESY in Hamburg and one at RWTH Aachen University, forming a federated Tier 2 centre. The two parts cover different aspects of a Tier 2 centre. The German Tier 3 centres are located at DESY in Hamburg, at RWTH Aachen University, and at the University of Karlsruhe. Furthermore, the building of a German user analysis facility is planned. Since the CMS community in Germany is rather small, good cooperation between the different sites is essential. This cooperation covers physics topics as well as technical and operational issues. All available communication channels such as email, phone, monthly video conferences, and regular personal meetings are used. For example, the distribution of data sets is coordinated globally within Germany. The CMS-specific services, such as the data transfer tool PhEDEx or the Monte Carlo production, are also operated by people from different sites in order to spread the knowledge widely and increase redundancy in terms of operators.

072025

The upgrades of the Tevatron collider and the CDF detector have considerably increased the demand on computing resources, in particular for Monte Carlo production. This has forced the collaboration to move beyond the usage of dedicated resources and start exploiting the Grid. The CDF Analysis Farm (CAF) model has been reimplemented as LcgCAF in order to access Grid resources using the LCG/EGEE middleware. Many sites in Italy and in Europe are accessed through this portal by CDF users, mainly to produce Monte Carlo data but also for other analysis jobs. We review here the setup used to submit jobs to Grid sites and retrieve the output, including the CDF-specific configuration of some Grid components. We also describe the batch and interactive monitoring tools developed to allow users to verify the status of their jobs during their lifetime in the Grid environment. Finally, we analyze the efficiency and typical failure modes of the current Grid infrastructure, reporting the performance of the different parts of the system used.

072026

The LHCb distributed data analysis system consists of the Ganga job submission front-end and the DIRAC Workload and Data Management System (WMS). Ganga is jointly developed with ATLAS and allows LHCb users to submit jobs to several backends, including various batch systems, LCG and DIRAC. The DIRAC API provides a transparent and secure way for users to run jobs on the Grid and is the default mode of submission for the LHCb Virtual Organisation (VO). This is exploited by Ganga to perform distributed user analysis for LHCb. This system provides LHCb with a consistent, efficient and simple user experience in a variety of heterogeneous environments and facilitates the incremental development of user analysis from local test jobs to the Worldwide LHC Computing Grid. With a steadily increasing number of users, the LHCb distributed analysis system has been tuned and enhanced over the past two years. This paper will describe the recent developments to support distributed data analysis for the LHCb experiment on the WLCG.

072027

The ATLAS Distributed Data Management (DDM) system is evolving to provide a production-quality service for data distribution and data management support for production and users' analysis.

Monitoring the different components in the system has emerged as one of the key issues to achieve this goal. Its distributed nature over different grid infrastructures (EGEE, OSG and NDGF) with infrastructure-specific data management components makes the task particularly challenging. Providing simple views over the status of the DDM components and data to users and site administrators is essential to effectively operate the system under realistic conditions.

In this paper we present the design of the DDM monitoring system, the information flow and the data aggregation. We discuss its usage, the interactive functionality for end-users and the alarm system.

072028

For three detector components of the KASCADE-Grande experiment, web-based online event displays have been implemented. They provide, in a fast and simplified way, up-to-date information about the energy deposits and arrival times of measured events, and about the overall detector status. Besides being able to show air-shower events to interested people wherever internet access is available, these event displays are an easy and highly useful tool for control and maintenance tasks from remote locations. The event displays are designed as client-server applications, with the server running as an independent part of the local data acquisition. Simplified event data are distributed via socket connections directly to the Java applets acting as clients. These clients can run in any common browser on any computer anywhere on the planet.
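
The server side of such a display can be sketched in a few lines: simplified event records are pushed over plain TCP sockets to whichever clients are connected. The port number and record format below are invented for illustration; the real clients are the Java applets described above.

    # Minimal sketch of a socket server broadcasting simplified event records.
    import json, socket, threading

    clients = []

    def accept_loop(port=9090):                       # port number is arbitrary
        srv = socket.socket()
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))
        srv.listen()
        while True:
            conn, _ = srv.accept()
            clients.append(conn)

    def broadcast(event):
        """Send one simplified event (deposits, arrival times) to every client."""
        payload = (json.dumps(event) + "\n").encode()
        for c in list(clients):
            try:
                c.sendall(payload)
            except OSError:
                clients.remove(c)

    threading.Thread(target=accept_loop, daemon=True).start()
    # broadcast({"station": 7, "energy_deposit_MeV": 12.3, "arrival_time_ns": 845})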

072029

The CMS experiment is about to embark on its first physics run at the LHC. To maximize the effectiveness of physicists and technical experts at CERN and worldwide and to facilitate their communications, CMS has established several dedicated and inter-connected operations and monitoring centres. These include a traditional 'Control Room' at the CMS site in France, a 'CMS Centre' for up to fifty people on the CERN main site in Switzerland, and remote operations centres, such as the 'LHC@FNAL' centre at Fermilab. We describe how this system of centres coherently supports the following activities: (1) CMS data quality monitoring, prompt sub-detector calibrations, and time-critical data analysis of express-line and calibration streams; and (2) operation of the CMS computing systems for processing, storage and distribution of real CMS data and simulated data, both at CERN and at offsite centres. We describe the physical infrastructure that has been established, the computing and software systems, the operations model, and the communications systems that are necessary to make such a distributed system coherent and effective.

072030

The CMS experiment will need to sustain uninterrupted, high-reliability, high-throughput and very diverse data transfer activities as LHC operations start. PhEDEx, the CMS data transfer system, will be responsible for the full range of the transfer needs of the experiment. Covering the entire spectrum is a demanding task: from the critical high-throughput transfers between CERN and the Tier-1 centres, to high-scale production transfers among the Tier-1 and Tier-2 centres, to managing the 24/7 transfers among all 170 institutions in CMS, to providing straightforward access to a handful of files for individual physicists.

In order to produce a system with confirmed capability to meet these objectives, the PhEDEx data transfer system has undergone rigorous development and numerous demanding scale tests. We have sustained production transfers exceeding 1 PB/month for several months and have demonstrated core system capacity several orders of magnitude above expected LHC levels.

We describe the level of scalability reached, and how we got there, with focus on the main insights into developing a robust, lock-free and scalable distributed database application, the validation stress test methods we have used, and the development and testing tools we found practically useful.

072031

In preparation for ATLAS data taking, a coordinated shift from development towards operations has occurred in ATLAS database activities. In addition to development and commissioning activities in databases, ATLAS is active in the development and deployment (in collaboration with the WLCG 3D project) of the tools that allow the worldwide distribution and installation of databases and related datasets, as well as the actual operation of this system on ATLAS multi-grid infrastructure. We describe development and commissioning of major ATLAS database applications for online and offline. We present the first scalability test results and ramp-up schedule over the initial LHC years of operations towards the nominal year of ATLAS running, when the database storage volumes are expected to reach 6.1 TB for the Tag DB and 1.0 TB for the Conditions DB. ATLAS database applications require robust operational infrastructure for data replication between online and offline at Tier-0, and for the distribution of the offline data to Tier-1 and Tier-2 computing centers. We describe ATLAS experience with Oracle Streams and other technologies for coordinated replication of databases in the framework of the WLCG 3D services.