Immunology. 2020 Oct;161(2):139-147.
doi: 10.1111/imm.13234. Epub 2020 Jul 26.

A behind-the-scenes tour of the IEDB curation process: an optimized process empirically integrating automation and human curation efforts

Nima Salimi et al. Immunology. 2020 Oct.

Abstract

The Immune Epitope Database and Analysis Resource (IEDB) provides the scientific community with open access to epitope data, as well as epitope prediction and analysis tools. The IEDB houses the most extensive collection of experimentally validated B-cell and T-cell epitope data, sourced primarily from published literature by expert curation. The data procurement requires systematic identification, categorization, curation and quality-checking processes. Here, we provide insights into these processes, with particular focus on the dividends they have paid in terms of attaining project milestones, as well as how objective analyses of our processes have identified opportunities for process optimization. These experiences are shared as a case study of the benefits of process implementation and review in biomedical big data, as well as to encourage idea-sharing among players in this ever-growing space.

Keywords: B cell; T cell; curation; database; epitope.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1. The IEDB curation process consists of four high-level steps, each with its own specific sub-steps designed to ensure data quality and consistency.

Figure 2. The assay-centric, contextual representation of epitope data is rooted in an ontologically based curation system enabling the Curator to extract and input data from text and figures into defined data fields.

Figure 3. The distribution of effort required for the major steps in the curation workflow, based on compiled input from IEDB staff. This analysis was used to target steps for process optimization.
