Database (Oxford). 2021 May 15;2021:baab028.
doi: 10.1093/database/baab028.

Challenges for FAIR-compliant description and comparison of crop phenotype data with standardized controlled vocabularies


Liliana Andrés-Hernández et al. Database (Oxford).

Abstract

Crop phenotypic data underpin many pre-breeding efforts to characterize variation within germplasm collections. Although there has been an increase in the global capacity for accumulating and comparing such data, a lack of consistency in the systematic description of metadata often limits integration and sharing. We therefore aimed to understand some of the challenges facing findable, accessible, interoperable and reusable (FAIR) curation and annotation of phenotypic data from minor and underutilized crops. We used bambara groundnut (Vigna subterranea) as an exemplar underutilized crop to assess the ability of the Crop Ontology (CO) system to facilitate curation of trait datasets, so that they are accessible for comparative analysis. This involved generating a controlled vocabulary Trait Dictionary of 134 terms. Systematic quantification of syntactic and semantic cohesiveness of the full set of 28 crop-specific COs identified inconsistencies between trait descriptor names, a relative lack of cross-referencing to other ontologies and a flat ontological structure for classifying traits. We also evaluated the Minimal Information About a Phenotyping Experiment and FAIR compliance of bambara trait datasets curated within the CropStoreDB schema. We discuss specifications for a more systematic and generic approach to trait controlled vocabularies, which would benefit from representation of terms that adhere to Open Biological and Biomedical Ontologies principles. In particular, we focus on the benefits of reuse of existing definitions within pre- and post-composed axioms from other domains in order to facilitate the curation and comparison of datasets from a wider range of crops.

Database URL: https://www.cropstoredb.org/cs_bambara.html


Figures

Figure 1.
Histogram for the counts of trait names within the 28 Trait Dictionaries (TDs). The histogram represents 3627 trait names within the TDs, along with the number of trait names across the TD for the 28 crop species. The gap in the data representing trait names that are repeated one or two times across the TDs was not plotted in the histogram; for more information, refer to Supplementary Table S2.
Figure 2.
Similarity heatmap for the shared trait names across 28 Trait Dictionaries (TDs) in the Crop Ontology. Values were calculated using the ‘simple matching coefficient’. Colour gradient shading is relative to the pairwise percentage of trait names shared across the 28 TDs, with red indicating high values and green indicating zero (Supplementary Table S3).
Figure 3.
Venn diagram for the count of 230 trait names in bambara groundnut unique and shared across the different institutions. The numbers show the number of unique and shared trait names across the different institutions. Abbreviations in the sets are as follows: University of Nottingham (UoN), Integrated Breeding Platform (IBP), International Plant Genetic Resources Institute (IPGRI) and International Institute of Tropical Agriculture (IITA).
Figure 4.
Example of granularity improvement for the CO for the ‘Flag leaf area’ term. Blue arrows represent the ‘is_a’ relationship. Abbreviations refer to existing ontologies: Crop Ontology (CO), Plant Trait Ontology (TO), Plant Ontology (PO), Phenotype and Trait Ontology (PATO) and Basic Formal Ontology (BFO).
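
The caption of Figure 2 names the simple matching coefficient (SMC) as the pairwise similarity measure between Trait Dictionaries. The sketch below shows one common way to compute it, treating each TD as a set of trait names and counting, over the union of all names, how many are either present in both TDs or absent from both. It is an illustration only: the crop keys, the trait names, the lowercase normalization and the choice of the union as the attribute universe are all assumptions, not details taken from the paper.

```python
from itertools import combinations


def normalize(name):
    """Assumed normalization: trim whitespace and lowercase the trait name."""
    return name.strip().lower()


def simple_matching_coefficient(a, b, universe):
    """SMC = (names in both + names in neither) / size of the universe.

    `a` and `b` are expected to be subsets of `universe`.
    """
    if not universe:
        raise ValueError("universe must not be empty")
    in_both = len(a & b)
    in_neither = len(universe - a - b)
    return (in_both + in_neither) / len(universe)


def pairwise_smc(trait_dictionaries):
    """Return {(td1, td2): SMC} for every unordered pair of dictionaries."""
    cleaned = {
        crop: {normalize(n) for n in names}
        for crop, names in trait_dictionaries.items()
    }
    # Assumption: the universe is the union of all trait names seen.
    universe = set().union(*cleaned.values())
    return {
        (x, y): simple_matching_coefficient(cleaned[x], cleaned[y], universe)
        for x, y in combinations(sorted(cleaned), 2)
    }


# Hypothetical toy data, not taken from the Crop Ontology.
toy_tds = {
    "crop_a": {"Plant height", "Seed weight", "Leaf area"},
    "crop_b": {"plant height", "seed weight", "Pod length"},
    "crop_c": {"Days to flowering"},
}

if __name__ == "__main__":
    for pair, score in pairwise_smc(toy_tds).items():
        print(pair, round(score, 2))
```

Because the SMC also counts joint absences as agreement, two small dictionaries can score above zero without sharing any names. A measure that ignores joint absences, such as the Jaccard index, would behave differently on the same data.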

