2020 Jul 27:9:e58496.
doi: 10.7554/eLife.58496.

International authorship and collaboration across bioRxiv preprints

Richard J Abdill et al. eLife.

Abstract

Preprints are becoming well established in the life sciences, but relatively little is known about the demographics of the researchers who post preprints and those who do not, or about the collaborations between preprint authors. Here, based on an analysis of 67,885 preprints posted on bioRxiv, we find that some countries, notably the United States and the United Kingdom, are overrepresented on bioRxiv relative to their overall scientific output, while other countries (including China, Russia, and Turkey) show lower levels of bioRxiv adoption. We also describe a set of 'contributor countries' (including Uganda, Croatia and Thailand): researchers from these countries appear almost exclusively as non-senior authors on international collaborations. Lastly, we find multiple journals that publish a disproportionate number of preprints from some countries, a dynamic that almost always benefits manuscripts from the US.

Keywords: bibliometrics; bioRxiv; computational biology; meta-research; preprints; scientific publishing; systems biology.

Conflict of interest statement

RA: Has been a volunteer ambassador for ASAPbio, an open-science advocacy organization that is also affiliated with Review Commons. EA, RB: No competing interests declared.

Figures

Figure 1. Preprints per country.
(a) A heat map indicating the number of preprints per country, based on the institutional affiliation of the senior author. The color coding uses a log scale. (b) The total preprints attributed to the seven most prolific countries. The x-axis indicates total preprints listing a senior author from a country; the y-axis indicates the country. The ‘Other’ category includes preprints from all countries not listed in the plot. (c) Similar to panel b, but showing the total preprints listing at least one author from the country in any position, not just the senior position. (d) Proportion of total senior-author preprints from each country (y-axis) over time (x-axis), starting in November 2013 and continuing through December 2019. Each colored segment indicates the proportion of total preprints attributed to a single country (using the same color scheme as panels b and c), as of the end of the month indicated on the x-axis.
Figure 1—figure supplement 1. Preprint-level collaboration.
(a) shows the average number of authors per paper over time. The x-axis indicates the year; the y-axis indicates the harmonic mean authors per preprint. Each point indicates the average of papers posted in a single month; the blue line indicates the six-month moving average. (b) illustrates the number of countries per preprint, over time. The x-axis indicates time; the y-axis indicates the arithmetic mean countries per preprint. Each point indicates the average unique countries found in all preprints posted in a single month. The blue line indicates the six-month moving average.
Figure 1—figure supplement 2. Preprints with no country assignment.
This bar plot compares the observed prevalence of preprints from countries split in two groups: the 27 most prolific countries, and the remaining 148 countries for which at least one bioRxiv author was observed. The red bars indicate the proportion of preprints from countries in each group, out of all preprints with a country assignment. The blue bars indicate the proportion of preprints from countries in each group, out of a random sample of 325 preprints with no country assignment. The error bars indicate the margin of error at a 95% confidence interval.
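The caption does not say how the margin of error was calculated. As a minimal sketch, assuming the common normal approximation for a sample proportion (an assumption, not a method confirmed by the source), the half-width of a 95% interval for a sample of 325 preprints could be computed as follows. The function name and the example proportion are hypothetical.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Half-width of a confidence interval for a sample proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n); z = 1.96
    corresponds to a 95% confidence level. This formula is an assumption
    for illustration; the caption does not specify the method used.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical example: half of a 325-preprint sample falls in one group.
example_margin = margin_of_error(0.5, 325)
```

Under this approximation the margin is widest at a proportion of 0.5, so the example gives an upper bound for any bar drawn from the 325-preprint sample.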
Figure 2. BioRxiv adoption per country.
(a) Correlation between two scientific output metrics. Each point is a country; the x-axis (log scale) indicates the total citable documents attributed to that country from 2014 to 2019, and the y-axis (also log scale) indicates total senior-author preprints attributed to that country overall. The red line demarcates a ‘bioRxiv adoption’ score of 1.0, which indicates that a country’s share of bioRxiv preprints is identical to its share of general scholarly outputs. Countries to the left of this line have a bioRxiv adoption score greater than 1.0. A score of 2.0 would indicate that a country’s share of preprints is twice as high as its share of other scholarly outputs. (See Discussion for more about this measurement.) (b) The countries with the 10 highest and 10 lowest bioRxiv adoption scores. The x-axis indicates each country’s adoption score, and the y-axis lists each country in order. All panels include only countries with at least 50 preprints.
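The adoption score described in this caption is a ratio of two shares. The sketch below illustrates that ratio with hypothetical country labels and counts. It assumes both shares are computed over the same set of countries, which the caption does not state, and it is not the authors' code.

```python
def adoption_scores(preprints, documents):
    """Return {country: score}, where score is the country's share of
    preprints divided by its share of citable documents.

    Both arguments map country name -> count. Computing each share over
    the countries present in the input is an assumption of this sketch.
    """
    total_preprints = sum(preprints.values())
    total_documents = sum(documents.values())
    scores = {}
    for country, n_preprints in preprints.items():
        preprint_share = n_preprints / total_preprints
        document_share = documents[country] / total_documents
        scores[country] = preprint_share / document_share
    return scores


# Hypothetical counts for three made-up countries.
example_preprints = {"A": 600, "B": 300, "C": 100}
example_documents = {"A": 3000, "B": 3000, "C": 4000}
example_scores = adoption_scores(example_preprints, example_documents)
```

In this made-up data, country A holds 60% of preprints but 30% of documents, so its score is about 2.0; country B's shares match, giving about 1.0; country C scores below 1.0.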
Figure 3. Contributor countries.
(a) Bar plot indicating the international senior author rate (y-axis) by country (x-axis) – that is, of all international preprints with a contributor from that country, the percentage of them that include a senior author from that country. All 17 contributor countries are listed in red, with the five countries with the highest senior-author rates (in grey) for comparison. (b) A bar plot with the same y-axis as panel (a). The x-axis indicates the international collaboration rate, or the proportion of preprints with a contributor from that country that also include at least one author from another country. (c) is a bar plot indicating the total international preprints featuring at least one author from that country (the median value per country is 19). (d) On the left are the 17 contributor countries. On the right are the countries that appear in the senior author position of preprints that were co-authored with contributor countries. (Supervising countries with 25 or fewer preprints with contributor countries were excluded from the figure.) The width of the ribbons connecting contributor countries to senior-author countries indicates the number of preprints supervised by the senior-author country that included at least one author from the contributor country. Statistically significant links were found between four combinations of supervising countries and contributors: Australia and Bangladesh (Fisher’s exact test, q = 1.01 × 10⁻¹¹); the UK and Thailand (q = 9.54 × 10⁻⁴); the UK and Greece (q = 6.85 × 10⁻³); and Australia and Vietnam (q = 0.049). All p-values reflect multiple-test correction using the Benjamini–Hochberg procedure.
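The q-values in this caption come from Benjamini–Hochberg correction of Fisher's exact test p-values. The sketch below shows the standard textbook form of that adjustment on hypothetical p-values. It is not the authors' code, and it does not perform the Fisher's exact tests themselves.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q-values), in input order.

    Each p-value is scaled by m / rank, where rank is its position when
    sorted ascending, and the results are made monotonic by taking a
    running minimum from the largest p-value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical p-values, not taken from the paper.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A link would then be reported as significant when its q-value falls below the chosen threshold, such as the 0.05 cutoff implied by the reported q = 0.049.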
Figure 3—figure supplement 1. Map of contributor countries.
World map indicating (in red) the location of contributor countries, defined as all countries listed on at least 50 international preprints, but as senior author on less than 20% of them.
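The definition in this caption translates directly into a two-condition check. A minimal sketch follows; the function and parameter names are hypothetical, and the thresholds are the ones stated in the caption.

```python
def is_contributor_country(international_preprints, senior_author_preprints,
                           min_preprints=50, max_senior_rate=0.20):
    """Apply the caption's definition of a contributor country.

    A country qualifies if it is listed on at least `min_preprints`
    international preprints and holds the senior author position on
    less than `max_senior_rate` of them.
    """
    if international_preprints < min_preprints:
        return False
    senior_rate = senior_author_preprints / international_preprints
    return senior_rate < max_senior_rate


# Hypothetical counts: 60 international preprints, 6 with a senior author.
example_flag = is_contributor_country(60, 6)
```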
Figure 3—figure supplement 2. International collaboration correlations.
Each point represents a country; the red points indicate those in the ‘contributor country’ category. Blue lines indicate lines of best fit for each plot, though they are unrelated to the Spearman correlations reported for these relationships. (a) A scatter plot showing the relationship (Spearman’s ρ = 0.781, p = 1.09 × 10⁻¹⁴) between a country’s total international preprints (x-axis; log scale) and the proportion of those preprints for which they are the senior author (y-axis). (b) A scatter plot showing the relationship (Spearman’s ρ = −0.578, p = 3.68 × 10⁻⁷) between a country’s total international preprints (x-axis; log scale) and the proportion of preprints with a contributor from that country that also include at least one contributor from another country (y-axis). (c) A scatter plot showing the relationship (Spearman’s ρ = −0.572, p = 5.32 × 10⁻⁷) between the proportion of preprints with a contributor from that country that also include at least one contributor from another country (x-axis) and the proportion of those preprints for which that country appears in the senior author position (y-axis).
Figure 3—figure supplement 3. Correlation between three measurements of international collaboration.
This figure is an alternative presentation of the same data as the three panels in Figure 3—figure supplement 2. Each point represents a country, and the size of the point indicates the total international preprints associated with that country. The x-axis indicates the proportion of preprints with a contributor from that country that also include at least one contributor from another country. The y-axis indicates the proportion of those preprints for which that country appears in the senior author position.
Figure 4. Preprint outcomes.
All panels include countries with at least 100 senior-author preprints. (a) A box plot indicating the number of downloads per preprint for each country. The dark line in the middle of the box indicates the median, and the ends of each box indicate the first and third quartiles, respectively. ‘Whiskers’ and outliers were omitted from this plot for clarity. The red line indicates the overall median. (b) A plot showing the relationship (Spearman’s ρ = 0.485, p = 0.00274) between total preprints and downloads. Each point represents a single country. The x-axis indicates the total number of senior-author preprints attributed to the country. The y-axis indicates the median number of downloads for those preprints. (c) A plot showing the relationship (Spearman’s ρ = 0.777, p = 2.442 × 10⁻⁸) between downloads and publication rate. Each point represents a single country. The x-axis indicates the median number of downloads for all preprints listing a senior author affiliated with that country. The y-axis indicates the proportion of preprints posted before 2019 that have been published. (d) A bar plot indicating the proportion of preprints posted before 2019 that are now flagged as ‘published’ on the bioRxiv website. The x-axis (and color scale) indicates the proportion, and the y-axis lists each country. The red line indicates the overall publication rate.
Figure 5. Overrepresentation of US preprints.
(a) A heat map indicating all disproportionately strong (q < 0.05) links between countries and journals, for journals that have published at least 15 preprints from that country. Columns each represent a single country, and rows each represent a single journal. Colors indicate the raw number of preprints published, and the size of each square indicates the statistical significance of that link; larger squares represent smaller q-values. See Figure 5—source data 1 for the results of each statistical test. (b) A bar plot indicating the degree to which US preprints are over- or under-represented in a journal’s published bioRxiv preprints. The y-axis lists all the journals that published at least 15 preprints with a US senior author. The x-axis indicates the overrepresentation of US preprints compared to the expected number: for example, a value of ‘0%’ would indicate the journal published the same proportion of US preprints as all journals combined. A value of ‘100%’ would indicate the journal published twice as many US preprints as expected, based on the overall representation of the US among published preprints. Journals for which the difference in representation was less than 15% in either direction are not displayed. The red bars indicate which of these relationships were significant using the Benjamini–Hochberg-adjusted results from χ² tests shown in panel a.
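The x-axis of panel b can be read as a percentage deviation from an expected count. The sketch below reproduces the two worked values in the caption (0% and 100%) under that reading. The function name, parameter names, and counts are hypothetical, and this is not the authors' code.

```python
def us_overrepresentation_percent(journal_us_preprints, journal_total_preprints,
                                  overall_us_share):
    """Percentage by which a journal's US preprint count exceeds (positive)
    or falls short of (negative) the count expected from the overall US
    share among published preprints.
    """
    expected = journal_total_preprints * overall_us_share
    return (journal_us_preprints / expected - 1) * 100


# Hypothetical journal with 100 published preprints and an overall US
# share of 20%, so 20 US preprints are expected.
same_as_expected = us_overrepresentation_percent(20, 100, 0.2)
twice_expected = us_overrepresentation_percent(40, 100, 0.2)
```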
