Short Author List Codes

Published July 2019 © 2019. The American Astronomical Society. All rights reserved.
Citation: Alice Allen et al 2019 Res. Notes AAS 3 102. DOI: 10.3847/2515-5172/ab3396

The literature contains articles on community-developed codes such as AstroPy (Astropy Collaboration et al. 2013, 2018) and yt (Turk et al. 2011), on their importance in research, and on the need to support these codes and the people who devote time and effort to building and sustaining them (Turk 2013; Muna et al. 2016). It has comparatively little to say about the body of software written by one, two, or three people, which we refer to here as "short author list" (SAL) codes. A recent examination of the 1978 entries currently in the Astrophysics Source Code Library (ASCL, ascl.net; Nemiroff & Wallin 1999) reveals that 1348 of them, 68%, are SAL codes.

1. Methods

To determine how many ASCL entries are SAL codes, we pulled ASCL data into a spreadsheet and counted the commas in the author field of each entry to obtain a "comma count." Because author names are stored as "last name, first name" with semicolons between names, each author contributes one comma, so this count gives a good indication of the number of authors. Entries for codes that credit collaborations, workgroups, institutions, and similar entities with authorship (our category "Teams") typically have no commas in the author field; the few that do were rescored to a comma count of 0. After sorting by comma count, we examined the data, removed four unpublished codes, and rescored several entries that indicated collaboration authorship through text such as "collaboration," "group," "others," and similar language in the author field. We also cleaned author lists that had extra or missing commas and rescored them. We then re-sorted the table and used the spreadsheet's subtotal function to count the number of entries in each comma count category.
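The comma-count classification described above can be sketched in a few lines of Python. This is an illustration only, not the spreadsheet workflow we used: the function names, the keyword list, and the sample author fields are all hypothetical, and the manual cleaning of extra or missing commas is not reproduced.

```python
import re
from collections import Counter

# Assumption: keywords that mark collective authorship, taken from the
# examples named in the text ("collaboration", "group", "others").
TEAM_WORDS = re.compile(r"\b(collaboration|group|others)\b", re.IGNORECASE)


def comma_count(author_field: str) -> int:
    """Return the comma count for an author field; 0 marks a Team entry."""
    if TEAM_WORDS.search(author_field):
        return 0  # rescored: collective authorship regardless of commas
    # Names are stored as "Last, First; Last, First", so one comma per author.
    return author_field.count(",")


def category(author_field: str) -> str:
    """Map an author field to "Team", "SAL" (1-3 authors), or "4+"."""
    n = comma_count(author_field)
    if n == 0:
        return "Team"
    return "SAL" if n <= 3 else "4+"


# Placeholder author fields, not real ASCL records.
sample_fields = [
    "Doe, Jane",
    "Doe, Jane; Roe, Richard",
    "Example Collaboration",
    "Doe, Jane; Roe, Richard; Poe, Pat; Moe, Max",
    "Doe, Jane; and others",
]
tally = Counter(category(field) for field in sample_fields)
print(tally)
```

A field with a stray comma (for example a name suffix) would be miscounted by this sketch, which is why cleaning and rescoring by hand remained necessary.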

We examined consolidated citation statistics for all published ASCL entries, grouped by category. For each code, these statistics combine citations to (1) the ASCL ID, (2) bibcodes in the "Described in" field of the ASCL entry, which is often used for citing code, and (3) bibcodes in the "Preferred citation" field where these differ from (1) or (2). We collected these data through Astrophysics Data System (ADS) queries on several codes mentioned by name in this Note, and through the ADS API by modifying the scripts and SQL queries that gather the ASCL dashboard citation statistics so that they also include citations to bibcodes listed in the ASCL "Described in" and "Preferred citation" fields. We adjusted the over-counted consolidated citations for four codes.

2. Findings

SAL codes represent the work of over 5000 individuals; 51% of these codes have one author, and 35% of all codes in the ASCL are credited to a sole author (Figure 1). Of all citations, 66% are to SAL software. SAL codes have 126,379 consolidated citations, with a mean of 110 citations per code and a median of 22.

Figure 1. Codes in the ASCL with 1, 2, or 3 attributed authors—short author list (SAL) codes—predominate. Though community-, team-, and institutionally authored software may be widely used, these are comparatively few in relation to SAL codes.

Codes attributed to Teams have 14,271 consolidated citations, with a mean of 207 citations per code and a median of 27. Team codes receive a disproportionate share of citations: they are 4% of all codes yet have 7% of all citations. The total of consolidated citations to all codes in the ASCL is 192,807.
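The percentage shares quoted in this Note follow directly from the totals. The short check below recomputes them from the figures given in the text; the variable and function names are illustrative only.

```python
# Figures as quoted in this Note.
TOTAL_ENTRIES = 1978
SAL_ENTRIES = 1348
TOTAL_CITATIONS = 192_807
SAL_CITATIONS = 126_379
TEAM_CITATIONS = 14_271


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


sal_code_share = percent(SAL_ENTRIES, TOTAL_ENTRIES)
sal_citation_share = percent(SAL_CITATIONS, TOTAL_CITATIONS)
team_citation_share = percent(TEAM_CITATIONS, TOTAL_CITATIONS)

print(f"SAL share of codes:      {sal_code_share}%")
print(f"SAL share of citations:  {sal_citation_share}%")
print(f"Team share of citations: {team_citation_share}%")
```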

3. Discussion

Single-author software makes up a substantial percentage of ASCL entries, as many researchers write code for their own work. Though some SAL codes may never be used after initial development for a specific project, others, such as ZEUS-2D (Stone & Norman 1992), MIRIAD (Sault et al. 1995), GADGET-2 (Springel et al. 2001; Springel 2005), and Source Extractor (Bertin & Arnouts 1996), are used by many others, as their consolidated citations demonstrate. As of this writing, ADS shows ZEUS-2D with 1233 consolidated citations, MIRIAD with 1473, GADGET-2 with 5041, and Source Extractor with 6659; recent citations demonstrate that these codes are in active use. It is not surprising that 66% of all consolidated citations are to SAL codes, since SAL software makes up 68% of all codes.

Team codes have a much higher mean number of citations and a higher median; these codes are likely to be used by the members of the teams that develop them, to be widely used community codes, or both. Consolidated citations follow a power-law distribution, which explains why the means are so much higher than the medians.
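The gap between mean and median in a heavy-tailed distribution can be illustrated with synthetic numbers. The sketch below draws samples from a Pareto (power-law) distribution using the Python standard library; the shape parameter and sample size are arbitrary assumptions, and the samples are not ASCL citation data.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

ALPHA = 1.2  # assumed shape parameter; smaller values give heavier tails
SAMPLE_SIZE = 10_000

# Synthetic power-law samples standing in for per-code citation counts.
samples = [random.paretovariate(ALPHA) for _ in range(SAMPLE_SIZE)]

mean_value = statistics.mean(samples)
median_value = statistics.median(samples)

# A few very large values pull the mean up while the median barely moves.
print(f"mean:   {mean_value:.2f}")
print(f"median: {median_value:.2f}")
```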

Programmatic citation consolidation has some drawbacks, as citations could be over-counted or ascribed to two codes. For example, for a record containing a bibcode for a published paper in the "Described in" field and the bibcode for the pre-print in the "Preferred citation" field, consolidated citations will be doubled. This can be found and corrected, as we did. Harder are cases where software is built on or with other codes and the "Preferred citation" includes the citation for the other code(s). We corrected one instance of this error, but have not yet devised a way to easily discover such cases. Also, citation to a paper, even one describing code, may not be for software use but rather for other science in the paper or discussion of the software; we offer the numbers above with these caveats. Data and scripts are available at https://github.com/pwry/salc.
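The doubling described above arises when per-bibcode citation counts are summed. One possible guard, sketched here under the assumption that the set of citing papers is available for each bibcode, is to take the union of the citing papers before counting. This is not necessarily how the scripts in the repository work; the function names and identifiers below are placeholders, not real bibcodes.

```python
def naive_total(citing_by_bibcode: dict, record_bibcodes: list) -> int:
    """Sum per-bibcode citation counts; double-counts shared citing papers."""
    return sum(len(citing_by_bibcode.get(b, set())) for b in record_bibcodes)


def deduplicated_total(citing_by_bibcode: dict, record_bibcodes: list) -> int:
    """Count each citing paper once across all bibcodes of one record."""
    citing = set()
    for bibcode in record_bibcodes:
        citing |= citing_by_bibcode.get(bibcode, set())
    return len(citing)


# Hypothetical record: the published paper sits in "Described in" and its
# pre-print in "Preferred citation"; both resolve to the same citing papers.
citing_by_bibcode = {
    "published-paper": {"citing-A", "citing-B", "citing-C"},
    "preprint": {"citing-A", "citing-B", "citing-C"},
}
record_bibcodes = ["published-paper", "preprint"]

naive = naive_total(citing_by_bibcode, record_bibcodes)
deduplicated = deduplicated_total(citing_by_bibcode, record_bibcodes)
print(naive, deduplicated)
```

This guards only against the pre-print case. It does not help when a "Preferred citation" points to the paper for a different code, since those citing papers are genuinely distinct.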

This research made use of NASA's Astrophysics Data System. We thank all who write and release research software.
