Open Data from the Third Observing Run of LIGO, Virgo, KAGRA, and GEO

The global network of gravitational-wave observatories now includes five detectors, namely LIGO Hanford, LIGO Livingston, Virgo, KAGRA, and GEO 600. These detectors collected data during their third observing run, O3, composed of three phases: O3a starting in 2019 April and lasting six months, O3b starting in 2019 November and lasting five months, and O3GK starting in 2020 April and lasting two weeks. In this paper we describe these data and various other science products that can be freely accessed through the Gravitational Wave Open Science Center at https://gwosc.org. The main data set, consisting of the gravitational-wave strain time series that contains the astrophysical signals, is released together with supporting data useful for their analysis, as well as documentation, tutorials, and analysis software packages.


INTRODUCTION
Gravitational-wave (GW) detectors develop through successive generations of instruments with increasing sensitivity (Abbott et al. 2020a). The US-based Advanced LIGO detectors (Aasi et al. 2015) were the first two instruments of the current generation to begin operation, collecting data during the first observing run (O1) from September 2015 to January 2016, including the first direct detection of gravitational waves (Abbott et al. 2016). The second observing run (O2) followed from November 2016 to August 2017, with the European detector Advanced Virgo (Acernese et al. 2015) joining in August 2017. The GEO 600 detector in Germany (Dooley et al. 2016) serves as a center of research and development, and is used to test a number of critical detector technologies. Another GW detector, the Japan-based KAGRA (Akutsu et al. 2021), has also been rapidly developing.
This article focuses on the data collected during the third observing run, O3, that took place from April 1 2019 to April 21 2020. The bulk of this observing run collected data only from LIGO and Virgo, and is divided into two main operational phases: O3a from April 1 2019 to October 1 2019, and O3b from November 1 2019 to March 27 2020, with a one-month maintenance break between the two phases. KAGRA was expected to join O3, but this initial plan changed due to the outbreak of COVID-19. Instead, KAGRA and GEO 600 operated during an extended observing phase, O3GK, from April 7 to April 21 2020 (Abbott et al. 2022a).
The analysis of the O3 data has led to numerous publications. Those include several updates to the GWTC (GW Transient Catalog; Abbott et al. 2021a,b,c) that compiles transient sources analyzed and reported by the combined LIGO-Virgo-KAGRA Collaboration (LVK). The cumulative GWTC catalog currently includes nearly 100 candidate sources (with a probability of astrophysical origin > 50%), all associated with the coalescence of compact star binaries composed of either neutron stars, black holes, or both.
Following the policy defined in the LIGO Data Management Plan (LIGO Laboratory 2022a) and a Memorandum of Understanding (LIGO Scientific Collaboration and Virgo Collaboration 2019), the O3 data set and associated science products are published through the Gravitational-Wave Open Science Center (GWOSC) at https://gwosc.org, allowing the reproducibility of the analyses performed by the LVK and increasing the impact of the data through its wider use. This paper provides a description of the publicly released data (LIGO Scientific Collaboration and Virgo Collaboration 2021a,b; LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration 2022a) along with additional information on their usage.
To date, hundreds of scientific articles have been written using the data available from the GWOSC website (all datasets combined). These analyses confirm, complement, and extend the results published by the LVK Collaboration, demonstrating the impact on the scientific community of the GW data releases.
This paper is organized as follows. Section 2 summarizes the status of the detectors during the observing run O3, together with high-level indicators such as their distance reach and duty cycle of operation. This section also provides insights about how the data are collected and calibrated, about data quality and about simulated signal injections. Section 3 describes the format, content and provenance of the strain data files distributed through the GWOSC, including the nomenclature used for the calibration versions and channel names. Section 4 describes the Event Portal, a searchable GW event database accessible online. Details about the technical validation and review of the data and documentation are given in Section 5. Finally, Section 6 provides some guiding principles to the novice user and suggests software tools that can be used to analyze the data.

INSTRUMENTS
The Advanced LIGO (Aasi et al. 2015) and Advanced Virgo (Acernese et al. 2015) detectors are enhanced Michelson interferometers with arm lengths of 4 km and 3 km, respectively. Advanced LIGO comprises two detectors located at two different sites in the US, namely in Hanford, WA and Livingston, LA, while Advanced Virgo has a single site in Cascina, close to Pisa, Italy. The various instrument upgrades realized between the science runs O2 and O3 for the LIGO and Virgo detectors are described in Buikema et al. (2020), Abbott et al. (2021a,c) and Acernese et al. (2022a). They involve many parts of the instruments, including the main laser source and the core optics, along with the installation of mitigation systems for a range of technical noises. One of the major novelties in O3 both for LIGO and Virgo is the use of squeezed light sources (see Tse et al. (2019) for LIGO and Acernese et al. (2019) for Virgo), a technique (Schnabel et al. 2010; Barsotti et al. 2019) that significantly reduces quantum noise, thus enhancing the sensitivity at high frequency.
GEO 600 (Dooley et al. 2016) is a British-German interferometric GW detector with 600 m arms located near Hannover, Germany. As in LIGO and Virgo, quantum squeezing is used to reduce noise in the output measurement quadrature (Lough et al. 2021). This technique was first demonstrated by GEO 600 (Abadie et al. 2011). KAGRA is a laser interferometer with 3 km arms, located underground at the Kamioka Observatory in Gifu Prefecture, Japan. An important feature of its design is the cooling system intended to bring the large mirrors of the interferometer to cryogenic temperature (around 20 K) in order to reduce thermal noise (Akutsu et al. 2016; Chen et al. 2014). During the O3GK run, however, the detector was operated at room temperature (Akutsu et al. 2018, 2021).

Detectors performance
A GW detector's performance is often globally characterized by two measures: its duty factor, defined as the fraction of time the detector is recording observational quality data, and its distance reach, conventionally measured as the binary neutron star (BNS) inspiral range (Finn & Chernoff 1993; Chen et al. 2021), the distance to which a BNS inspiral could be detected with a signal-to-noise ratio of 8, assuming 1.4 solar mass component objects and averaging over source position and orientation. The choice of this metric is a standard convention. The value of 1.4 solar masses is close to the measured masses of the stars in the Hulse-Taylor binary (Weisberg & Huang 2016) and within the narrow range predicted by stellar evolution for neutron-star masses. The distance reach of the detectors strongly depends on the source mass. For example, binary black-hole (BBH) systems can typically be detected at much greater distances, up to several Gpc (e.g., Abbott et al. 2021c, Table IV).
The GWOSC website hosts summary pages for O3a and O3b which describe the LIGO and Virgo operations and sensitivity. The duty factors during O3a are 71% for LIGO Hanford (H1), 76% for LIGO Livingston (L1) and 76% for Virgo (V1). During O3b, the corresponding percentages are 79%, 79% and 76%, respectively. Those translate into the observing factors shown in Fig. 1 that quantify the fraction of observing time spent with one, two or three instruments in operation.
[Figure 1 (panel title): Network duty factor for O3a.]
During the O3GK run, the duty factors of KAGRA (K1) and GEO 600 (G1) are 53% and 80% respectively, leading to a coincident observing factor of 47% (Abbott et al. 2022a).The lower duty cycle of KAGRA is due to the fact that alignment sensing and control with wavefront sensors was not yet implemented at the time of the run, leading to a higher susceptibility to microseismic ground vibrations.
The median values of the BNS range over the whole observing run are 108 Mpc, 135 Mpc and 45 Mpc for H1, L1 and V1 respectively during O3a, and 115 Mpc, 133 Mpc and 51 Mpc during O3b for the same detectors. The median values of the BNS range over the O3GK period are 0.66 Mpc for KAGRA and 1.06 Mpc for GEO 600. Fig. 2 displays the median BNS range computed over regular intervals (5-minute scale for LIGO and Virgo and 20-minute scale for GEO 600 and KAGRA). The drops that can be observed in both plots are due to transient noise artifacts (discussed in Sec. 2.3) that temporarily reduce the detector sensitivity. The BNS ranges shown in the recent GWTC publications such as Abbott et al. (2021a) (Fig. 3) and Abbott et al. (2021c) (Fig. 3) are averaged over a longer period (1 hour) and are thus less affected by transient noise. The longer gaps in the BNS inspiral range are due to maintenance intervals, instrumental issues, and earthquakes.

Calibration
The GW strain h(t) is obtained and calibrated from variations of the optical power measured at the output port of each detector. The calibration procedure and the corresponding characterization of the systematic and statistical uncertainties are described in Viets et al. (2018) and Sun et al. (2020, 2021) for Advanced LIGO and Acernese et al. (2022b) for Advanced Virgo. Calibration is performed in two stages: an initial, online calibration used for low-latency analysis, and a final, offline calibration that applies any needed corrections to the initial result. The offline calibration may correct for computer failures, incomplete modelling of the detectors, or any systematic errors characterized after the observing period. The uncertainties in the calibration procedure for both the magnitude and phase of h(t) as a function of frequency are documented (LIGO Scientific Collaboration & Virgo Collaboration 2021).
The calibration process also includes a noise subtraction step based on independent measurements of a range of noise sources by witness sensors, as described in Davis et al. (2019), Vajente et al. (2020a), Mukund et al. (2020), Estevez et al. (2019), and Acernese et al. (2022b). For the last two weeks of O3a, the Virgo data were reprocessed with a new configuration of the noise subtraction (Rolland et al. 2019; Acernese et al. 2022b), so a different calibration is available just for this period (see Table 1).
GWOSC releases two types of strain data: bulk data spanning an entire observing run, and smaller data snippets around the time of each GW event. Data snippets are based on the calibration version available at the time of publication of the related GW event. Events that appear in multiple publications may have multiple data snippets available, sometimes with different calibration versions. Naturally, the time segments released as data snippets are also available in the bulk data set, but the bulk data of the entire O3a, O3b and O3GK observation runs provided through GWOSC correspond to the final (most up-to-date) calibration. These differences in calibration can lead to discrepancies between the data snippets and the corresponding data in the bulk data release, potentially leading in turn to differences in the source parameter values that can be estimated from the data. However, as discussed in Sec. 3.3, in addition to the main bulk data release, several alternate strain channels with different calibration versions are also made public.
The detector strain h(t) in O3 is calibrated only between 10 Hz and 5000 Hz for Advanced LIGO, between 20 Hz and 2000 Hz for Advanced Virgo, between 30 Hz and 1500 Hz for KAGRA and between 40 Hz and 6000 Hz for GEO 600. Any apparent signal outside these ranges cannot be trusted because it is not a faithful representation of the GW strain at these frequencies. In addition, Advanced Virgo data between 49.5 Hz and 50.5 Hz are characterised by a large increase of calibration errors because of effects related to the mains power lines (Acernese et al. 2022b). Because of this increased systematic error, data in this narrow frequency band were considered to be uninformative for source-parameter estimation (see Appendix E of Abbott et al. (2021c) for relevant methods).
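The calibrated bands quoted above lend themselves to a small lookup table that an analysis can consult before trusting a given frequency. The following is a minimal sketch only: the dictionary, the mapping of the detector codes H1, L1, V1, K1 and G1 onto these bands, and both helper functions are illustrative names introduced here, not part of any GWOSC software.

```python
# Calibrated frequency bands of the O3 strain data in Hz, as quoted in the text.
# The table and function names are hypothetical, chosen for this illustration.
CALIBRATED_BAND_HZ = {
    "H1": (10.0, 5000.0),  # Advanced LIGO Hanford
    "L1": (10.0, 5000.0),  # Advanced LIGO Livingston
    "V1": (20.0, 2000.0),  # Advanced Virgo
    "K1": (30.0, 1500.0),  # KAGRA
    "G1": (40.0, 6000.0),  # GEO 600
}

# Virgo band with increased calibration error around the mains frequency.
VIRGO_MAINS_BAND_HZ = (49.5, 50.5)


def in_calibrated_band(detector, freq_hz):
    """Return True if freq_hz lies inside the calibrated band of the detector."""
    low, high = CALIBRATED_BAND_HZ[detector]
    return low <= freq_hz <= high


def in_virgo_mains_band(detector, freq_hz):
    """Return True for Virgo frequencies inside the 49.5-50.5 Hz mains band."""
    low, high = VIRGO_MAINS_BAND_HZ
    return detector == "V1" and low <= freq_hz <= high


if __name__ == "__main__":
    for det, freq in [("H1", 15.0), ("V1", 15.0), ("V1", 50.0), ("K1", 1600.0)]:
        print(det, freq, in_calibrated_band(det, freq), in_virgo_mains_band(det, freq))
```

A frequency that passes the first check but falls in the Virgo mains band is still calibrated, only with a larger systematic error, which is why the two conditions are kept separate here.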

Detector noise characterization and data quality
The data are dominated by instrumental noise that can be well described as Gaussian and stationary over limited time scales and frequency ranges. The data also contain intermittent short-duration noise artifacts, or glitches, that contribute to the noise background as well. Any analysis of GW data must account for the presence of these various noise components (see Sec. 6 for more information about using the data). A summary of efforts to characterize data quality in O3 can be found in Davis et al. (2021) for Advanced LIGO, Acernese et al. (2022a) for Advanced Virgo, and Abe et al. (2022) for KAGRA. The overall quality of data for transient searches is recorded as data quality segments, described in more detail in Sec. 3.2.

Signal injections
Hardware injections are simulated GW signals added by physically displacing the test masses (i.e. the interferometer mirrors) (Biwer et al. 2017).The simulated signal initiates a response that mimics that of a true GW.By looking for discrepancies between the injected and recovered signals, it is possible to characterize the performance of analyses and the coupling of instrumental subsystems to the detectors' output channels.
During the third observing run O3, hardware injections were performed in the Advanced LIGO and Advanced Virgo detectors. The record of all injections is available through GWOSC web pages. This list is provided to prevent potential confusion with an actual astrophysical signal. For Virgo, those injections were removed post-facto when producing the calibrated strain (see Acernese et al. (2022b) for details on this subtraction), so the injection times are not marked in the GWOSC files. On the other hand, in the case of Advanced LIGO the injections are still present in the calibrated data, and their times are marked in the GWOSC files (see Sec. 3.2).
No injections were performed during O3GK.

Small batches of files can be conveniently downloaded from the GWOSC website directly. However, when downloading large amounts of data (such as an entire observing run) the use of the distributed file system CernVM-FS (Weitzel et al. 2017) is recommended. Once configured, CernVM-FS allows access to all GWOSC data locally on the user's computer.
The O3 calibrated strain data are distributed in files that contain 4096 seconds of data. Published GW signals are also released in separate files containing data snippets of 4096 seconds or 32 seconds, centered on the event's detection time and released under the GWOSC Event Portal. The description of the data records that follows is valid both for single event releases and for bulk data releases.
GWOSC calibrated strain data are repackaged from data stored in the LVK archives. The data source is uniquely identified by a channel name and a frame type (see Table 1). At times when data are unavailable or of quality too poor to be analyzed, the strain values are represented with NaNs. Strain data are made available both at the sampling rate of 16384 Hz and at a downsampled rate of 4096 Hz. Down-sampling is achieved using the standard decimation method implemented in scipy.signal.decimate from the Python package SciPy (Virtanen et al. 2020). The highest frequency available is determined by the Nyquist-Shannon sampling theorem (Nyquist 1924), and is equal to half the sampling rate specified in a particular dataset. This is an important consideration to keep in mind when deciding which sample rate to download from GWOSC. Because the anti-aliasing filters used in resampling roll off at the upper end of the working frequency interval, the valid frequency range is reduced to a bit less than the Nyquist frequency. So, for the 4 kHz data the maximum usable frequency is approximately 1700 Hz. Higher sample rate data will require more hard-drive space to store and longer times to download. The user can decide which dataset meets their needs.
[Table 1, surviving row: channel name K1:DAC-STRAIN_C20, frame type K1_HOFT_C20.]
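The trade-off between sampling rate and usable bandwidth described above comes down to simple arithmetic, sketched below. The 1700 Hz limit for the 4096 Hz data is taken from the text; for the 16384 Hz data no usable limit is quoted, so the Nyquist frequency is used as an optimistic upper bound, which is an assumption of this sketch. The table and function names are hypothetical.

```python
# Approximate maximum usable frequency in Hz for each available sampling rate.
# 4096 Hz: about 1700 Hz, as quoted in the text.
# 16384 Hz: the Nyquist frequency is used as an optimistic bound (assumption);
# the true limit is somewhat lower because of filter roll-off and calibration.
USABLE_MAX_HZ = {4096: 1700.0, 16384: 16384 / 2.0}


def nyquist_hz(sample_rate_hz):
    """Highest representable frequency for a given sampling rate."""
    return sample_rate_hz / 2.0


def choose_sample_rate(f_max_hz):
    """Return the lowest sampling rate whose usable band reaches f_max_hz."""
    for rate in sorted(USABLE_MAX_HZ):
        if f_max_hz <= USABLE_MAX_HZ[rate]:
            return rate
    raise ValueError(f"no available sampling rate covers {f_max_hz} Hz")


if __name__ == "__main__":
    for rate in sorted(USABLE_MAX_HZ):
        print(rate, "Hz sampling -> Nyquist", nyquist_hz(rate), "Hz")
    print("analysis up to 1000 Hz:", choose_sample_rate(1000.0), "Hz data")
    print("analysis up to 1800 Hz:", choose_sample_rate(1800.0), "Hz data")
```

Under these assumptions, an analysis that only needs frequencies below roughly 1700 Hz can use the smaller 4096 Hz files, while anything above that requires the 16384 Hz data.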

GWOSC file formats
The GWOSC open data are delivered in two different file formats: hdf and gwf. The Hierarchical Data Format hdf (Koziol & Robinson 2018) is a portable data format readable by many programming languages. The Frame format gwf (LIGO Scientific Collaboration and Virgo Collaboration 2009) is a specialized format used by the gravitational wave community. Data associated with GW events are also released as plain text files containing two columns with the global positioning system (GPS) time in the first column and the corresponding strain value in the second column.
For both formats the file naming follows the convention obs-FrameType-GPSstart-duration.extension, where FrameType for the main O3 data release is ifo_GWOSC_ObservationRun_sKHZ_Rn and
• obs is the observatory, i.e. the site, and can have values L, H, V, G or K;
• ifo is the interferometer and can have values H1, L1, V1, G1 or K1;
• ObservationRun encodes the observing run name, so in this case is O3a, O3b, or O3GK;
• s is the sampling rate in kHz with either a value 4 or 16 (4096 Hz or 16384 Hz);
• n is the version number of the file (typically 1);
• GPSstart is the starting time of the data contained in the file, as a 10-digit GPS value (in seconds);
• duration is the duration in seconds of the file, typically either 4096 or 32 seconds;
• and extension represents the file format and can be gwf or hdf.
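The naming convention listed above is regular enough to be parsed with a single regular expression. The sketch below is illustrative: the function name and the example file name are made up for this purpose (the GPS start time is arbitrary), and accepting hdf5 in addition to hdf as an extension is an assumption made for convenience.

```python
import re

# obs-ifo_GWOSC_ObservationRun_sKHZ_Rn-GPSstart-duration.extension
FILENAME_PATTERN = re.compile(
    r"^(?P<obs>[LHVGK])-"
    r"(?P<ifo>[HLVGK]1)_GWOSC_"
    r"(?P<run>O3a|O3b|O3GK)_"
    r"(?P<rate_khz>4|16)KHZ_"
    r"R(?P<version>\d+)-"
    r"(?P<gps_start>\d{10})-"
    r"(?P<duration>\d+)\."
    r"(?P<extension>gwf|hdf5?)$"  # 'hdf5' accepted as an assumption
)


def parse_gwosc_filename(name):
    """Split a main-release O3 file name into its documented components."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised O3 file name: {name!r}")
    fields = match.groupdict()
    # Convert the numeric components to integers.
    for key in ("rate_khz", "version", "gps_start", "duration"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    # Hypothetical file name that follows the convention.
    example = "L-L1_GWOSC_O3a_4KHZ_R1-1238163456-4096.hdf"
    info = parse_gwosc_filename(example)
    print(info)
    print("data end at GPS", info["gps_start"] + info["duration"])
```

The pattern does not check that obs and ifo refer to the same site; a stricter parser could add that consistency test.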
The folders (or groups) included in the hdf files are:
• meta: metadata of the file containing the following fields:
  - Description, e.g. "Strain data time series from LIGO",
  - DescriptionURL: URL of the GWOSC website,
  - Detector, e.g. L1, and Observatory, e.g. L,
  - Duration, GPSstart, UTCstart: duration and starting time (using GPS and UTC standards, respectively) of the segment of data contained in a file,
  - StrainChannel: channel name used in the LVK archives,
  - FrameType: frame type used in the LVK archives.
• strain: array of h(t), sampled at 4 or 16 kHz depending on the file. For the times when the detector is not in science mode or the data do not meet the minimum required data quality conditions (see next section), the strain values are set to NaNs. The strain h(t) is a function of time, so it is accompanied by the attributes Xstart and Xspacing defining the starting GPS time of the data contained in the array and the corresponding distance in time between the points of the array.
• quality: this folder contains two sub-folders, one for data quality and the other for injections, each including a bitmask to indicate at each second the status of the data quality or the injections and the description of each bit of the mask (see Section 3.2 for details).
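Because the strain array carries only a start time (Xstart) and a sample spacing (Xspacing), the GPS time of any sample has to be reconstructed, and stretches filled with NaNs have to be skipped. The sketch below shows one way to do both for an array that has already been read into memory. It operates on a plain Python list with made-up values; the function name is hypothetical, and reading the array from the hdf file is outside the scope of this example.

```python
import math


def valid_intervals(strain, xstart, xspacing):
    """Return [start, stop) GPS intervals over which the strain is not NaN.

    strain   : sequence of strain samples, with NaN where data are missing
    xstart   : GPS time of the first sample (the Xstart attribute)
    xspacing : time between consecutive samples in seconds (Xspacing)
    """
    intervals = []
    begin = None  # index of the first sample of the current valid stretch
    for i, value in enumerate(strain):
        if not math.isnan(value):
            if begin is None:
                begin = i
        elif begin is not None:
            intervals.append((xstart + begin * xspacing, xstart + i * xspacing))
            begin = None
    if begin is not None:
        # The last valid stretch runs to the end of the array.
        intervals.append((xstart + begin * xspacing, xstart + len(strain) * xspacing))
    return intervals


if __name__ == "__main__":
    nan = float("nan")
    # Synthetic samples, not real detector data.
    strain = [nan, 1.0e-21, -2.0e-21, nan, 3.0e-21]
    print(valid_intervals(strain, xstart=100.0, xspacing=0.5))
```

For real files, with thousands of samples per second, the same logic would normally be vectorised with an array library rather than run as a Python loop.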
The gwf files have a similar content but with a different structure. They contain 3 channels, one for the strain data, one for the data quality and one for the injections. The channel names are described in Table 2. The original files produced internally, whose channel names are listed in Table 1, contain only the strain channel, while the GWOSC files conveniently combine the strain data with the data quality and injection information in the same file.
Table 2. Channel names of the GWOSC frame files (format gwf). In this nomenclature, ifo is a placeholder for the interferometer name, i.e. H1, L1, V1, G1 or K1, and s = 4 or 16 kHz denotes the sampling rate. The R1 sub-string represents the revision number of the channel name so it will become R2 in case there is a second (revised) release, and so on.

Data quality and injections in GWOSC files
Several types of searches are performed on the LIGO, Virgo, GEO 600, and KAGRA data. Those searches are divided into four families named after the types of signals they target: compact binary coalescences (CBC), GW bursts (BURST), continuous waves (CW) and stochastic backgrounds (STOCH). As each type of search has a unique sensitivity to instrumental artifacts, a detailed characterization of detector noise and data quality is essential to eliminate spurious signals of terrestrial origin found by the searches. LIGO, Virgo, GEO 600 and KAGRA have dedicated teams responsible for detector characterization and data quality, as described in Davis et al. [...]
CBC analyses (Abbott et al. 2021a,b,c) seek signals from merging neutron stars and black holes by filtering the data with waveform templates. BURST analyses (Abbott et al. 2021e,f) search for generic GW transients with minimal assumptions on the source or signal morphology by identifying excess power in the time-frequency representation of the GW strain data. CW searches (Abbott et al. 2022b) look for long-duration, continuous, periodic GW signals from asymmetries of rapidly spinning neutron stars. STOCH searches (Abbott et al. 2021g,h) target the stochastic GW background signal, which is formed by the superposition of unresolved sources from various stages of the evolution of the universe.
Because of fundamental differences in the search methodologies, certain noise types are relevant to specific searches. CBC and BURST searches look for short, transient signals, with durations from less than a second to several tens of seconds. Data quality information for these searches is recorded as sets of time intervals when data are relatively free of corruption, known as segment lists. This information is provided inside the GWOSC files for the two GW transient searches CBC and BURST. The data quality information most relevant for CW and STOCH searches is in the frequency domain and it is provided as lists of instrumental lines in separate files, available for download on GWOSC.
Data quality and signal injection information for a given GPS second is indicated by bitmasks with a 1-Hz sampling rate. The bit meanings are given in Tables 3 and 4 for the data quality and injections, respectively. To describe data quality, different categories are defined. For each category, the corresponding bit in the bitmask shown in Table 3 has a value of 1 (good data) if in that second of time the requirements of the category are fulfilled, otherwise 0 (bad data). The meaning of each category is described in Davis et al. (2021) and Acernese et al. (2022a). Here, we provide a brief summary of each category:
DATA: Failing this level indicates that strain data are not publicly available at this time because the instruments were not operating in nominal conditions. For O3, this is equivalent to failing Category 1 criteria, defined below. For intervals of bad or absent data, NaNs have been inserted in the corresponding strain data array.
CAT1: (Category 1) Failing a data quality check at this category indicates a critical issue with a key detector component not operating in its nominal configuration. GWOSC data during times that fail CAT1 criteria are replaced by NaN values in the strain time series. For O3, CBC_CAT1, BURST_CAT1, and DATA lead to identical segment lists.
CAT2: (Category 2) Failing a data quality check at this category indicates times when excess noise is present in a sensor with an understood physical coupling to the strain channel (LIGO Scientific Collaboration and Virgo Collaboration 2016). The fraction of time removed by this category is less than 1% of the data, and is detailed in Table 6.
CAT3: (Category 3) Failing a data quality check at this category indicates times when there is statistical coupling between a sensor/auxiliary channel and the strain channel which is not fully understood. This category was not used in O3 LVK searches, although it was used in previous observing runs (Abbott et al. 2021d).
Data quality categories are cascading: a time which fails a given category automatically fails all higher categories. Since CAT3 flags were not used in O3, the CAT3 segment lists are identical to the corresponding CAT2 lists. However, the different analysis groups qualify the data independently: failing BURST_CAT2 does not necessarily imply failing CBC_CAT2. See Table 5 for the amount of time associated with each category.
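The 1-Hz data quality bitmask described above can be turned into a list of good-data segments for a chosen category with a few bit operations. The sketch below is an illustration under a stated assumption: the bit order used here (DATA in bit 0, followed by the CBC and then the BURST categories) is assumed for the example, and the authoritative assignment is the one given in Table 3 and in the bit descriptions stored with each mask. The function names and mask values are hypothetical.

```python
# Assumed bit order (lowest bit first). Check Table 3 or the bit descriptions
# stored alongside the mask before relying on it.
DQ_BITS = (
    "DATA",
    "CBC_CAT1", "CBC_CAT2", "CBC_CAT3",
    "BURST_CAT1", "BURST_CAT2", "BURST_CAT3",
)


def passes(mask_value, flag, bit_names=DQ_BITS):
    """Return True if the bit for `flag` is set to 1 (good data)."""
    bit = bit_names.index(flag)
    return bool((mask_value >> bit) & 1)


def good_segments(bitmask, flag, gps_start, bit_names=DQ_BITS):
    """Convert a 1-Hz bitmask into [start, stop) GPS segments passing `flag`."""
    segments = []
    begin = None  # index of the first second of the current good stretch
    for i, value in enumerate(bitmask):
        ok = passes(value, flag, bit_names)
        if ok and begin is None:
            begin = i
        elif not ok and begin is not None:
            segments.append((gps_start + begin, gps_start + i))
            begin = None
    if begin is not None:
        segments.append((gps_start + begin, gps_start + len(bitmask)))
    return segments


if __name__ == "__main__":
    # Synthetic five-second mask: all good, all good, CAT1 only, no data, all good.
    mask = [0b1111111, 0b1111111, 0b0000011, 0b0000000, 0b1111111]
    print("DATA    :", good_segments(mask, "DATA", gps_start=1000))
    print("CBC_CAT2:", good_segments(mask, "CBC_CAT2", gps_start=1000))
```

Because the categories cascade, the segments passing CBC_CAT2 in this example are a subset of those passing DATA.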
Simulated signals added to the detectors for testing and calibration are referred to as hardware injections. GWOSC data releases provide a time series in which each one-second sample is a bitmask describing the state of the injections at that time. The injections are categorized according to the type of injected signal relevant to each astrophysical search. There are also injections used for detector characterization (DETCHAR). The injection bitmask marks the injection-free times. The bit corresponding to a given type of injection is defined in Table 4. A bit is set to 1 if there is no injection, otherwise it is set to 0. The full details of the complete set of hardware injections for O3 can be found at https://gwosc.org/O3/o3_inj.
[...] analyses used different versions of the strain channels. The alternate strain channel release was designed to reflect the internal formatting used by the LVK as much as possible. In particular, the release uses only the GWF file format, does not include any NaN values, and does not include any data quality information. The channels found in the alternate calibration release are described in Table 7.

ONLINE EVENT CATALOGS
Ninety-three GW transient events or notable candidates were discovered based on the LVK's analyses of the O3 data (Abbott et al. 2021a,b,c). Data associated with these signals are available online through the GWOSC Event Portal, along with other scientific products. For all events in the Event Portal, snippets of strain data are released in the form of a segment of 4096 seconds around the time of the event. The data snippets are made available no later than when the event discovery becomes public in a refereed, scientific journal. In addition, the Event Portal includes a concise summary of the source properties (i.e., parameters of the compact star binaries associated with each of the detected signals), links to a number of science products (posterior samples), links to any associated low-latency alerts, and a documentation page for each release containing publication information. The list of O3 event data releases is as follows:
O3_Discovery_Papers: Notable events first published individually (Abbott et al. 2020b). Associated data releases may contain preliminary versions of data quality segments and calibration.
O3_IMBH_marginal: Marginal candidates associated with the search for Intermediate Mass Black Hole (IMBH) binary mergers.
GWTC-2: Confident events from the O3a observation run (first search).
GWTC-2.1-confident: Confident events from the O3a observation run (updated search).
GWTC-2.1-marginal: Marginal candidates from the O3a observation run (updated search).
GWTC-2.1-auxiliary: Candidates from GWTC-2 which, based on the updated analysis presented in the GWTC-2.1 catalog paper, do not satisfy the criteria for inclusion in the GWTC-2.1-confident or GWTC-2.1-marginal releases.
GWTC-3-confident: Confident events from the O3b observing run.
GWTC-3-marginal: Marginal candidates from the O3b observing run.
Some events are listed in the database with multiple versions, typically corresponding to the event's inclusion in multiple releases. The cumulative GWTC catalog includes all confident GW events published by the LVK collaboration, and currently includes 93 events. Events in the GWTC-2.1-confident and GWTC-3-confident releases all have a probability of astrophysical origin greater than 0.5 in at least one of the search pipelines, and are included in the cumulative GWTC.
The online catalogs are searchable via a web user interface.The Event Portal database can be queried based on specific source properties, namely the primary mass, secondary mass, total mass, chirp mass, final mass (of the merger remnant), luminosity distance, redshift, effective inspiral spin, or other properties associated with the observed signal, such as UTC or GPS event time, detector frame chirp mass, network SNR, false alarm rate and the posterior probability of astrophysical origin.The events can also be selected by identification such as partial event name, release catalog or group of catalogs.The output format can be one of the following: HTML, JSON, CSV or plain ASCII text.
To ease the analysis of multiple events, the catalogs can be queried programmatically with scripts using the REST API that returns all catalog lists in a JSON format. Catalogs can be queried with a GET request. As an example, to request all merger events for which the primary mass is less than 3 solar masses, the URL for the GET request would be https://gwosc.org/eventapi/html/query/show?max-mass-1-source=3. A detailed explanation of the query API nodes can be found on the GWOSC website.
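A query URL of the kind shown above can be assembled programmatically from a set of selection criteria. The sketch below only builds the URL string and does not contact the server. The helper name is hypothetical, the only parameter taken from the text is max-mass-1-source, and the idea that other output formats follow the same path pattern with a different format segment is an assumption to be checked against the API documentation.

```python
from urllib.parse import urlencode

BASE_URL = "https://gwosc.org/eventapi"


def build_query_url(fmt="html", **criteria):
    """Build an Event Portal query URL from keyword criteria.

    Python keyword names cannot contain hyphens, so underscores in the
    keyword names are mapped to hyphens (max_mass_1_source -> max-mass-1-source).
    """
    params = {key.replace("_", "-"): value for key, value in criteria.items()}
    return f"{BASE_URL}/{fmt}/query/show?{urlencode(params)}"


if __name__ == "__main__":
    # Reproduces the example URL quoted in the text.
    print(build_query_url(max_mass_1_source=3))
```

The resulting string can then be passed to any HTTP client to issue the GET request.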

Parameter estimation
For each detected source the Event Portal displays the 90% credible intervals for a selection of parameters that reflect the values given in the relevant publication.Those credible intervals are computed from the posterior samples resulting from Bayesian inference algorithms applied to the data.
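As an illustration of how a 90% credible interval can be derived from posterior samples, the sketch below computes the median together with the 5th and 95th percentiles of a one-dimensional sample set. This equal-tailed definition is an assumption made for the example; the text does not specify which estimator is used for the values shown in the Event Portal. The function name and the sample values are made up.

```python
def credible_interval_90(samples):
    """Return (lower, median, upper) for an equal-tailed 90% credible interval.

    Percentiles are obtained by linear interpolation between the two
    closest order statistics of the sorted samples.
    """
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("at least one posterior sample is required")

    def percentile(q):
        position = q * (len(ordered) - 1)
        lower_index = int(position)
        upper_index = min(lower_index + 1, len(ordered) - 1)
        fraction = position - lower_index
        return ordered[lower_index] * (1.0 - fraction) + ordered[upper_index] * fraction

    return percentile(0.05), percentile(0.50), percentile(0.95)


if __name__ == "__main__":
    # Synthetic "posterior samples" for a single parameter, evenly spread.
    samples = [30.0 + 0.1 * k for k in range(101)]
    low, median, high = credible_interval_90(samples)
    print(f"median {median:.2f}, 90% interval [{low:.2f}, {high:.2f}]")
```

Results are often quoted as the median with upper and lower offsets, which follow directly from the three returned numbers.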
In addition to the information provided by GWOSC, the posterior samples are distributed through the Zenodo open repository (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration 2021;LIGO Scientific Collaboration and Virgo Collaboration 2021c).
They are provided from the single event web page and through the JSON API as downloadable links to the files on Zenodo. The parameter names follow a standard nomenclature [22] (Hoy & Raymond 2021). Parameter estimates may change with different versions of the event or catalog release. The parameter sets are denoted by a set of version numbers for each event (depending on the number of releases in which that event appears).

Low-latency alerts
During O3, public alerts were communicated with low latency to report the occurrence of a notable trigger detected in the data [23]. The alerts are sent with a latency of a few minutes after detection. They include a number of preliminary parameter estimates that are useful for the localization of the source through a probability skymap. This information can be used by other, non-GW, instruments to search for potential electromagnetic counterparts in follow-up observations. The complete list of alerts sent during O3 is publicly available on the GraceDB website [24] and, as described below, in GWOSC.
The Event Portal references the GraceDB entry for the original trigger alert of the event.Links to GraceDB entries are available through the GWOSC web interface and the JSON API.Events first detected offline do not trigger low-latency alerts and thus lack a GraceDB entry.

TECHNICAL VALIDATION
The O3 GWOSC data release is reprocessed for the broader user community, starting from the internal strain data products used by the LIGO, Virgo, and KAGRA Collaborations for the analyses in their publications. The reprocessing produces new GWOSC gwf and hdf5 files containing the previously discussed strain, data quality, and hardware injection information for each detector. In addition, versions of these GWOSC files with the strain channel of each detector at a reduced sampling rate of 4096 Hz are also produced. All data for the release are carefully reviewed by the internal GWOSC team and then by an independent review team made up of members from the LIGO, Virgo, and KAGRA Collaborations. This review process checks that:
• the strain vectors at the maximum sample rate (16 kHz) in the GWOSC hdf5 and gwf files are identical to machine precision to the corresponding strain vectors of the LVK main archives;
• the strain vectors after resampling at 4 kHz do not have numerical artifacts that may arise from the resampling technique;
• the data quality and injection information located in either the GWOSC hdf5 and gwf files or the online Timeline tool, described in detail in Section 6, agrees with all available records;
• the documentation associated with the O3 data products found online is correct and contains comprehensive information for the broader user community.
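The first check in the list above can be illustrated with a short Python sketch. It is not the review team's actual procedure: it assumes both strain vectors have already been loaded as sequences of floats, and it treats NaN samples (used where data are missing) as equal when they appear at the same index in both vectors.

```python
import math
from typing import Optional, Sequence


def first_mismatch(a: Sequence[float], b: Sequence[float]) -> Optional[int]:
    """Return the index of the first differing sample, or None if identical.

    A length difference is reported at the length of the shorter vector.
    NaN values are considered equal to each other, since plain `==` would
    otherwise flag every gap in the data as a mismatch.
    """
    for k, (x, y) in enumerate(zip(a, b)):
        if math.isnan(x) and math.isnan(y):
            continue
        if x != y:  # exact comparison, no tolerance
            return k
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Hypothetical sample values, for illustration only.
archive = [1.0e-21, -2.5e-21, float("nan"), 3.0e-22]
released = [1.0e-21, -2.5e-21, float("nan"), 3.0e-22]
corrupted = [1.0e-21, -2.5e-21, float("nan"), 3.1e-22]

identical = first_mismatch(archive, released) is None
bad_index = first_mismatch(archive, corrupted)
```

An exact comparison is used here because the stated criterion is agreement to machine precision; a tolerance-based comparison would be a different and weaker test.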
The data files and accompanying documentation are released to the public on the GWOSC website once all checks have passed, at the date and time agreed to by the LIGO, Virgo, and KAGRA Collaborations.
6. USAGE NOTES
6.1. Salient features of GW data
Working with GW data requires an awareness of the presence of noise in the data. An overview of LIGO/Virgo detector noise and some applicable signal processing methods are described in Abbott et al. (2020c); see also Secs. 2.3 and 3.2 above for a brief introduction to various classes of detector noise. In addition, as mentioned previously, the data are only valid within a fixed frequency range due to the limits of calibration (Sec. 2.2) as well as artifacts from the down-sampling process (Sec. 3). All of these complications need to be considered when searching for astrophysical signals.

22 See https://lscsoft.docs.ligo.org/pesummary/unstable_docs/gw/parameters.html.
23 See https://emfollow.docs.ligo.org/userguide for more details. This user guide is a living document that is being updated in preparation for the upcoming science run O4. Therefore, the information in this guide may not necessarily be relevant for O3 data.
24 https://gracedb.ligo.org/superevents/public/O3

List of observing segments
Segment lists describe times when GW detectors are collecting data and operating under normal conditions, as described in Section 3.2. The GWOSC website provides an online app called Timeline to discover, plot, and download segment lists 25. The Timeline query page allows users to select observing runs from a drop-down menu, and then view the names of segment lists associated with the selected run. Segment lists may be downloaded as ASCII text files or in a JSON format. Alternatively, segment lists may be displayed in an interactive plot, as seen in Fig. 3. To explore times within a run, a visitor can use the mouse to scroll and zoom on the Timeline plots. Hovering the mouse over a segment displays a tool-tip with the exact start and stop time, in both GPS and UTC time.
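A common use of downloaded segment lists is to find times when two detectors were observing simultaneously. The Python sketch below illustrates this under stated assumptions: the ASCII format is taken to be whitespace-separated columns with integer GPS start and stop times first, each list is assumed to be non-overlapping, and the segment values for the second detector are hypothetical. The GPS-to-UTC conversion uses a fixed 18 s leap-second offset, which holds for dates from 2017 onward and therefore covers O3.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_UTC_OFFSET = 18  # leap seconds; valid for dates since 2017-01-01


def parse_segments(text):
    """Parse lines of 'start stop [duration]' into sorted (start, stop) tuples."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        segments.append((int(fields[0]), int(fields[1])))
    return sorted(segments)


def intersect(a, b):
    """Intersect two sorted, non-overlapping segment lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        stop = min(a[i][1], b[j][1])
        if start < stop:
            out.append((start, stop))
        # Advance whichever segment ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def livetime(segments):
    """Total duration of a segment list in seconds."""
    return sum(stop - start for start, stop in segments)


def gps_to_utc(gps_seconds):
    """Convert integer GPS seconds to a timezone-aware UTC datetime."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - GPS_UTC_OFFSET)


# Hypothetical segment files for two detectors.
det_a = parse_segments("1238166018 1238170000 3982\n1238175000 1238180000 5000\n")
det_b = parse_segments("1238168000 1238176000 8000\n")

coincident = intersect(det_a, det_b)
coincident_seconds = livetime(coincident)
first_start_utc = gps_to_utc(det_a[0][0])
```

The two-pointer intersection runs in time linear in the number of segments, which matters little for a single run but keeps the approach practical when many lists are combined.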

Software and Support
The GWOSC website provides a number of resources for helping investigators learn to work with GW data, including:
• Software libraries 26: A number of software packages developed for GW analysis are open source. The GWOSC website provides a suggested list of packages, many of which were created by members of the LIGO, Virgo, and KAGRA collaborations. Links to source code and documentation are provided for each package.
• Tutorials 27: GWOSC provides tutorials to demonstrate the basics of GW data analysis. Most tutorials are in Python, and provided in notebooks that can be run in the cloud to avoid the necessity for the user to install software.
• Workshops and online course 28: Annual Open Data Workshops provide a complete course in working with GW data, including lectures, software tutorials, and challenge problems. Materials from past workshops are available as a free online course; students can enroll at any time. Future workshops will be posted on the GWOSC website, and are open to any interested participants.
• Discussion forum 29: A public discussion forum for GW topics provides space to ask for help with GW data analysis, discuss LVK papers, post questions about GW science, and connect with other researchers in the field.

SUMMARY
The O3 data set described in this paper represents the most sensitive gravitational-wave observations to date. The data contain over 80 compact object merger signals, as described in a number of catalog releases, including GWTC-2 (Abbott et al. 2021a), GWTC-2.1 (Abbott et al. 2021b) and GWTC-3 (Abbott et al. 2021c). O3 includes three main phases: O3a, O3b, and O3GK. O3a and O3b are both joint runs of LIGO and Virgo, while the O3GK run involved KAGRA and GEO 600. Data and documentation for all O3 data are available from the GWOSC website.
Looking ahead, LIGO, Virgo, and KAGRA are planning an O4 run, scheduled to begin in 2023, with improved sensitivity. Data from events discovered in O4 will be released as the events are published, and release of the next large strain data sets is planned for 2025 (LIGO Laboratory 2022b). This will be followed by the O5 observing run, anticipated to be the first extended observing run with a span of over two years (LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration 2022b). Planned instrument upgrades should increase the sensitivity of the network and thus extend the volume of space over which signals may be observed, so that future data sets will include more frequent detections and a correspondingly expanded depth of science in this rapidly evolving field.

Figure 3. Timelines for the full O3a (top), O3b (middle), and O3GK (bottom) observing runs based on the data quality bitmask CBC_CAT1 for each detector (see Tab. 3). Colored bars represent times when data are available, and white areas show times when data are not available. Similar plots can be generated from the GWOSC web pages 25.

Table 1. The channel names and frame types listed in this table are unique identifiers in the LIGO, Virgo, GEO 600 and KAGRA data archives that allow tracing the provenance of the strain data released on GWOSC. H1 and L1 indicate the two LIGO detectors (Hanford and Livingston respectively), V1 refers to Virgo, G1 refers to GEO 600 and K1 refers to KAGRA. The attribute CLEAN-SUB60HZ in H1 and L1 indicates that the noise subtraction procedure described in Vajente et al. (2020b) was used. The attributes C01, V1Online and V1O3Repro1A refer to the calibration version.

Table 3. Data quality bitmasks description. For O3, the CBC_CAT1 and BURST_CAT1 segment lists are equivalent (see the definition of CAT1 in the text). Note that any data that are not present are replaced by NaN values in the corresponding strain time series. In each bit mask, a value of 1 corresponds to the data quality check passing (good data), and a zero means the check has failed (bad data). The CBC_CAT3 and BURST_CAT3 are equivalent to CBC_CAT2 and BURST_CAT2 in O3.
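The convention that a bit value of 1 means "check passed" can be made concrete with a short Python sketch that converts a data quality bitmask vector into good-data segments. The bit positions in `DQ_BITS` and the 1 Hz sampling of the bitmask are assumptions that follow the usual GWOSC layout and should be checked against Table 3 and the file metadata; the mask values and the GPS start time are hypothetical.

```python
# Assumed bit positions for the data quality bitmask.
DQ_BITS = {
    "DATA": 0,
    "CBC_CAT1": 1,
    "CBC_CAT2": 2,
    "CBC_CAT3": 3,
    "BURST_CAT1": 4,
    "BURST_CAT2": 5,
    "BURST_CAT3": 6,
}


def passes(mask_value, flag):
    """True if the bit for `flag` is 1 (check passed, good data)."""
    return bool((mask_value >> DQ_BITS[flag]) & 1)


def good_segments(mask, flag, gps_start):
    """Convert a bitmask vector (assumed 1 sample per second) into
    a list of (start, stop) GPS segments where `flag` passes."""
    segments = []
    seg_start = None
    for k, value in enumerate(mask):
        ok = passes(value, flag)
        if ok and seg_start is None:
            seg_start = gps_start + k  # a good stretch begins
        elif not ok and seg_start is not None:
            segments.append((seg_start, gps_start + k))  # it ends
            seg_start = None
    if seg_start is not None:
        segments.append((seg_start, gps_start + len(mask)))
    return segments


# Hypothetical 6-second bitmask: 127 = all checks pass, 3 = only DATA and
# CBC_CAT1 pass, 0 = no data.
mask = [127, 127, 3, 3, 0, 127]
cat1 = good_segments(mask, "CBC_CAT1", gps_start=1000)
cat2 = good_segments(mask, "CBC_CAT2", gps_start=1000)
```

Segments are half-open intervals, so the stop time of one segment can equal the start time of the next without any overlap.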

Table 4. Meaning of the injection bits. A value of 1 indicates TRUE (no injection), while a value of 0 is FALSE (injection is present).
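Because a set bit means that no injection is present, the injection bitmask is easy to misread. The Python sketch below lists which injection types are active for a given mask value. The bit names and positions in `INJ_BITS` are assumptions that follow the usual GWOSC naming and should be checked against Table 4; the example mask values are hypothetical.

```python
# Assumed bit names and positions for the injection bitmask.
INJ_BITS = {
    "NO_CBC_HW_INJ": 0,
    "NO_BURST_HW_INJ": 1,
    "NO_DETCHAR_HW_INJ": 2,
    "NO_CW_HW_INJ": 3,
    "NO_STOCH_HW_INJ": 4,
}


def injections_present(mask_value):
    """Return the injection types that are present, i.e. whose bit is 0."""
    present = []
    for name, bit in INJ_BITS.items():
        if not (mask_value >> bit) & 1:
            present.append(name[len("NO_"):])  # drop the "NO_" prefix
    return present


clean = injections_present(0b11111)    # all bits set: no injections
with_cw = injections_present(0b10111)  # bit 3 cleared
```

A sample is free of hardware injections only when every bit is set, so a simple test for nonzero mask values is not sufficient.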

Table 5. Total time satisfying the data quality criteria for each SEARCH type (= CBC or BURST) and each CATEGORY (= CAT1, CAT2 or CAT3) spanning the full DURATION of each observing RUN (= O3a, O3b or O3GK) and each DETECTOR (= H1, L1, V1, G1 or K1). DURATION includes all time in seconds between the official start and end of each RUN, including times when the instruments are not collecting data for astrophysical analysis. When the criteria for a given flag are satisfied, the corresponding bit will have the value 1 (good data by these criteria); otherwise, it will have the value 0 (bad data). The data in the table can be retrieved at https://gwosc.org/timeline/show/[RUN]_16KHZ_R1/[DETECTOR]_[SEARCH]_[CATEGORY].
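The URL template in the caption can be filled in programmatically. The Python sketch below builds the address for one table entry and computes the fraction of the run duration that satisfies a flag. It only substitutes the placeholders named in the caption: it does not check that a given detector took part in a given run, and the second counts used in the example are hypothetical rather than values from the table.

```python
TIMELINE_URL = (
    "https://gwosc.org/timeline/show/{run}_16KHZ_R1/{detector}_{search}_{category}"
)

# Allowed placeholder values, as listed in the caption.
RUNS = {"O3a", "O3b", "O3GK"}
DETECTORS = {"H1", "L1", "V1", "G1", "K1"}
SEARCHES = {"CBC", "BURST"}
CATEGORIES = {"CAT1", "CAT2", "CAT3"}


def timeline_url(run, detector, search, category):
    """Fill in the Timeline URL template for one table entry."""
    if run not in RUNS:
        raise ValueError(f"unknown run: {run}")
    if detector not in DETECTORS:
        raise ValueError(f"unknown detector: {detector}")
    if search not in SEARCHES:
        raise ValueError(f"unknown search type: {search}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return TIMELINE_URL.format(
        run=run, detector=detector, search=search, category=category
    )


def passing_fraction(passing_seconds, duration_seconds):
    """Fraction of the full run duration that satisfies a flag."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return passing_seconds / duration_seconds


url = timeline_url("O3a", "H1", "CBC", "CAT1")
fraction = passing_fraction(750, 1000)  # hypothetical second counts
```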

Table 6. Fraction of observing time removed by applying CAT2 vetoes. The percentages represent the amount of time in the DATA segment list relative to the total duration of observing time. CAT2 vetoes were not used for Virgo, KAGRA, or GEO 600.