Scientometrics. 2023;128(6):3649-3673.
doi: 10.1007/s11192-023-04703-8. Epub 2023 May 14.

A diachronic perspective on citation latency in Wikipedia articles on CRISPR/Cas-9: an exploratory case study

Marion Schmidt et al. Scientometrics. 2023.

Abstract

This paper analyzes Wikipedia's representation of the Nobel Prize-winning CRISPR/Cas9 technology, a method for gene editing. We propose and evaluate different heuristics to match publications from several publication corpora against Wikipedia's central article on CRISPR and against the complete Wikipedia revision history in order to retrieve further Wikipedia articles relevant to the topic and to analyze Wikipedia's referencing patterns. We explore to what extent the selection of literature referenced in Wikipedia's central article on CRISPR adheres to scientific standards and inner-scientific perspectives by assessing its overlap with (1) the Web of Science (WoS) database, (2) a WoS-based field-delineated corpus, (3) highly cited publications within this corpus, and (4) publications referenced by field-specific reviews. We develop a diachronic perspective on citation latency and compare the delays with which publications are cited in relevant Wikipedia articles to the citation dynamics of these publications over time. Our results confirm that a combination of verbatim searches by title, DOI, and PMID is sufficient and cannot be improved significantly by more elaborate search heuristics. We show that Wikipedia references a substantial number of publications that are recognized by experts and highly cited, but that Wikipedia also cites less visible literature and, to a certain degree, even literature that is not strictly scientific. The delays between publication and first occurrence on Wikipedia depend, most markedly for the central CRISPR article, on the dynamics of both the field and the editors' reaction to it in terms of activity.
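The verbatim search by title, DOI, and PMID mentioned in the abstract could look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: the `Publication` record, the `normalize` helper, the `match_publication` function, and the placeholder DOI and PMID values are all hypothetical, and the normalization (lowercasing, collapsing whitespace) is an assumption about what "verbatim" tolerates.

```python
import re
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Publication:
    # Hypothetical record type; field names are assumptions for this sketch.
    title: str
    doi: Optional[str] = None
    pmid: Optional[str] = None


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so case and line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def match_publication(pub: Publication, wikitext: str) -> Set[str]:
    """Return which identifier types ('title', 'doi', 'pmid') occur verbatim in the wikitext."""
    haystack = normalize(wikitext)
    hits: Set[str] = set()
    if pub.title and normalize(pub.title) in haystack:
        hits.add("title")
    if pub.doi and normalize(pub.doi) in haystack:
        hits.add("doi")
    # Require a 'pmid' label before the number so that arbitrary digit runs do not match.
    if pub.pmid and re.search(r"pmid\s*=?\s*" + re.escape(pub.pmid) + r"\b", haystack):
        hits.add("pmid")
    return hits


if __name__ == "__main__":
    # Placeholder values, not a real publication.
    pub = Publication(
        title="An Example Title on Gene Editing",
        doi="10.1000/example.123",
        pmid="12345678",
    )
    wikitext = (
        "{{cite journal | title = An example title\non gene editing "
        "| doi = 10.1000/EXAMPLE.123 | pmid = 12345678 }}"
    )
    print(sorted(match_publication(pub, wikitext)))
```

Returning the set of matching identifier types, rather than a single boolean, makes it possible to compare how much each identifier contributes on its own, which is the kind of comparison the abstract describes.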

Keywords: Bibliometrics; CRISPR; Publication matching; Relevance; Timeliness; Wikipedia.

Conflict of interest statement

Competing interests: The authors have no relevant financial or non-financial interests to disclose.

Figures

Fig. 1
Overview of the research setting. From the pilot study (a), we learn the effectiveness of our algorithms for fuzzy and verbatim matching. With study (b), we identify the 10 most relevant CRISPR articles in Wikipedia besides the central article. Study (c) then forms the actual main analysis, where we quantify when and how the papers of the field corpus are considered in Wikipedia
Fig. 2
Examples showcasing the verbatim and fuzzy heuristics; divergent data underlined
Fig. 3
Articles in the Wikipedia dump from June 2021 for which more than five publications from the bibliometric field delineation of 20,585 publications match the DOI or PMID over their entire revision history. The articles are sorted from top to bottom by descending number of maximum matched publications. The bars indicate the maximum number of publications for each month of their revision history
Fig. 4
Publication dates (x-axis) of matched publications in relation to the first occurrence on Wikipedia (y-axis) of the central CRISPR article
Fig. 5
Smoothed dynamics of the growth of text of the CRISPR Wikipedia article and its references
Fig. 6
Publication dates (x-axis) of matched publications in relation to the first occurrence (y-axis) of the selection of CRISPR-related articles
Fig. 7
Delays (in days) of occurrences in Wikipedia articles in relation to other articles within the set
Fig. 8
Matched publications’ yearly citation counts are plotted in relation to the date of first occurrence in the CRISPR article as baseline
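
One way to read the delays shown in the figures above is as the number of days between a paper's publication date and the earliest article revision that contains its identifier. The sketch below illustrates that reading under assumptions: the revision history is modeled as (date, wikitext) pairs, matching is by DOI only, and the function names and sample values are hypothetical rather than taken from the paper.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# Assumed shape of a revision history: (revision date, full wikitext of that revision).
Revision = Tuple[date, str]


def first_occurrence(doi: str, revisions: Iterable[Revision]) -> Optional[date]:
    """Date of the earliest revision whose text contains the DOI, or None if never cited."""
    needle = doi.lower()
    dates = [rev_date for rev_date, text in revisions if needle in text.lower()]
    return min(dates) if dates else None


def citation_latency_days(published: date, doi: str,
                          revisions: Iterable[Revision]) -> Optional[int]:
    """Days from publication to first occurrence in the article; None if never cited."""
    first = first_occurrence(doi, revisions)
    return None if first is None else (first - published).days


if __name__ == "__main__":
    # Placeholder history and DOI, not real data.
    history = [
        (date(2013, 1, 1), "Article text without references."),
        (date(2013, 3, 1), "Article text. <ref>doi=10.1000/example.123</ref>"),
        (date(2013, 5, 1), "Longer text. <ref>doi=10.1000/example.123</ref>"),
    ]
    print(citation_latency_days(date(2012, 12, 1), "10.1000/example.123", history))
```

Taking the minimum over all matching revisions, instead of stopping at the first match in list order, keeps the result correct even if the revisions are not sorted by date.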

