BMC Bioinformatics. 2022 Jun 2;23(1):210.
doi: 10.1186/s12859-022-04751-6.

CoQUAD: a COVID-19 question answering dataset system, facilitating research, benchmarking, and practice

Shaina Raza et al. BMC Bioinformatics.

Abstract

Background: Due to the growing amount of COVID-19 research literature, medical experts, clinical scientists, and researchers frequently struggle to stay up to date on the most recent findings. There is a pressing need to assist researchers and practitioners in mining the literature and responding to COVID-19-related questions in a timely manner.

Methods: This paper introduces CoQUAD, a question-answering system that efficiently extracts answers to COVID-19-related questions. Two datasets are provided in this work: a reference-standard dataset built using the CORD-19 and LitCOVID initiatives, and a gold-standard dataset prepared by experts from the public health domain. CoQUAD has a Retriever component based on the BM25 algorithm that searches the reference-standard dataset for documents relevant to a COVID-19-related question. CoQUAD also has a Reader component, a Transformer-based model named MPNet, which reads the paragraphs of the retrieved documents and finds the answers to the question. In comparison to previous works, the proposed CoQUAD system can answer questions related to early, mid, and post-COVID-19 topics.
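
As an illustration of the retrieval step described above, the following is a minimal sketch of BM25 ranking over a small in-memory corpus. It is not the CoQUAD implementation: the class name `BM25`, the regex tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the example documents are all assumptions chosen for illustration, and the IDF formula used is one common smoothed variant.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase alphanumeric runs (not taken from the paper).
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25:
    """Toy BM25 ranker; k1 and b are conventional defaults, not CoQUAD's settings."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.docs = [tokenize(d) for d in docs]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        # Document frequency: number of documents containing each term.
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed IDF that stays non-negative for very common terms.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        tf = self.tfs[i]
        dl = len(self.docs[i])
        total = 0.0
        for t in tokenize(query):
            if t not in tf:
                continue
            f = tf[t]
            # Term-frequency saturation with document-length normalization.
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[t] * f * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k=3):
        # Indices of the k highest-scoring documents, best first.
        order = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return order[:k]


if __name__ == "__main__":
    corpus = [
        "Long COVID symptoms include fatigue and shortness of breath.",
        "Vaccines reduce the risk of severe disease.",
        "BM25 ranks documents by term overlap with a query.",
    ]
    retriever = BM25(corpus)
    print(retriever.top_k("What are the symptoms of long COVID?", k=2))
```

In a Retriever-Reader pipeline of the kind described, the documents returned by a ranker like this would be passed to the Reader model, which extracts the answer span.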

Results: Extensive experiments on the CoQUAD Retriever and Reader modules show that CoQUAD can provide effective and relevant answers to COVID-19-related questions posed in natural language. Compared to state-of-the-art baselines, CoQUAD performs better, achieving an exact match ratio of 77.50% and an F1 score of 77.10%.
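
The abstract does not state how the exact match and F1 scores are computed. The sketch below shows the convention commonly used for extractive question answering (SQuAD-style answer normalization followed by token overlap), which is an assumption here and may differ from the authors' evaluation code. The function names and example strings are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text):
    # Lowercase, drop punctuation and English articles, collapse whitespace.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    # 1.0 if the normalized strings are identical, else 0.0.
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    # Token-level F1 between the predicted and gold answer spans.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The fatigue.", "fatigue"))
    print(f1_score("fatigue and cough", "fatigue"))
```

Corpus-level scores are typically the mean of these per-question values, reported as percentages.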

Conclusion: CoQUAD is a question-answering system that mines COVID-19 literature using natural language processing techniques to help the research community find the most recent findings and answer any related questions.

Keywords: CORD-19; COVID-19; LitCOVID; Long-COVID; Pipeline; Post-COVID-19; Question answering system; Transformer model.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Construction of reference-standard dataset
Fig. 2. Example of an extractive QA system composed of a question, context and answer
Fig. 3. CoQUAD system architecture
Fig. 4. CoQUAD-MPNet
Fig. 5. SAS scores of Reader during different values of top @k
Fig. 6. EM score of all models
Fig. 7. F1-score of all models
