BMC Bioinformatics. 2017 Dec 21;18(Suppl 17):557.
doi: 10.1186/s12859-017-1979-z.

Comparison, alignment, and synchronization of cell line information between CLO and EFO


Edison Ong et al. BMC Bioinformatics.

Abstract

Background: The Experimental Factor Ontology (EFO) is an application ontology driven by experimental variables, including cell lines, that organizes and describes the diverse experimental variables and data residing in EMBL-EBI resources. The Cell Line Ontology (CLO) is an OBO community-based ontology that contains information on immortalized cell lines and relevant experimental components. EFO integrates and extends ontologies from the bio-ontology community to drive a number of practical applications. It is desirable that the community shares design patterns and therefore that EFO reuses the cell line representation from CLO. There are, however, challenges to be addressed when developing a common ontology design pattern for representing cell lines in both EFO and CLO.

Results: In this study, we developed a strategy to compare and map cell line terms between EFO and CLO. We examined Cellosaurus resources for EFO-CLO cross-references. Text labels of cell lines from both ontologies were verified against the biological information axiomatized in each source. The study identified 873 EFO-CLO aligned and 344 EFO-unique immortalized permanent cell lines. All of these cell lines were updated in CLO, and their related information was merged. A design pattern that integrates EFO and CLO was also developed.
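The mapping strategy described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout, the function names, the species check used as verification, and all identifiers containing "TOY" are assumptions made for illustration. Only the MCF 10A accessions (EFO_0001200, CLO_0007599) come from the article.

```python
import re


def normalize_label(label):
    """Lowercase a label and drop non-alphanumeric characters."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


def map_cell_lines(efo, clo, xrefs):
    """Align EFO cell lines to CLO cell lines.

    efo, clo: dict of id -> {"label": str, "species": str} (assumed layout).
    xrefs: list of {"efo": id, "clo": id} records, standing in for
           cross-references taken from a Cellosaurus-like resource.
    Returns (aligned, unique): a dict efo_id -> clo_id and a sorted list
    of EFO ids with no CLO counterpart.
    """
    aligned = {}

    # Pass 1: follow cross-references, verified by species of origin.
    for rec in xrefs:
        e, c = rec.get("efo"), rec.get("clo")
        if e in efo and c in clo and efo[e]["species"] == clo[c]["species"]:
            aligned[e] = c

    # Pass 2: direct label comparison, again verified by species.
    clo_by_label = {}
    for cid, info in clo.items():
        clo_by_label.setdefault(normalize_label(info["label"]), []).append(cid)
    for eid, info in efo.items():
        if eid in aligned:
            continue
        for cid in clo_by_label.get(normalize_label(info["label"]), []):
            if clo[cid]["species"] == info["species"]:
                aligned[eid] = cid
                break

    unique = sorted(e for e in efo if e not in aligned)
    return aligned, unique


# Toy data: only the MCF 10A accessions are taken from the article.
EFO = {
    "EFO_0001200": {"label": "MCF 10A", "species": "Homo sapiens"},
    "EFO_TOY_1": {"label": "Toy Line 1", "species": "Homo sapiens"},
    "EFO_TOY_2": {"label": "Toy Line 2", "species": "Homo sapiens"},
}
CLO = {
    "CLO_0007599": {"label": "MCF 10A cell", "species": "Homo sapiens"},
    "CLO_TOY_1": {"label": "TOY-LINE-1", "species": "Homo sapiens"},
}
XREFS = [{"efo": "EFO_0001200", "clo": "CLO_0007599"}]

aligned, unique = map_cell_lines(EFO, CLO, XREFS)
```

In this toy run, MCF 10A is aligned through the cross-reference because its labels differ between the two sources, "Toy Line 1" is aligned by normalized label, and "Toy Line 2" is reported as EFO-unique.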

Conclusion: Our study compared, aligned, and synchronized the cell line information between CLO and EFO. The final updated CLO will be examined as the candidate ontology to import and replace eligible EFO cell line classes, thereby supporting interoperability in the bio-ontology domain. Our mapping pipeline illustrates the use of ontologies in aiding biological data standardization and integration through the biological and semantic content of cell lines.

Keywords: Cell line; Cell line ontology; Data integration; Data mapping; Experimental factor ontology.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
General project workflow. The general pipeline for comparing, aligning and synchronizing cell lines between EFO (version 2.85) and CLO (version 2.1.106), with additional information downloaded from Cellosaurus (version 22.0). The workflow was separated into four major steps (indicated as blue circles). Step 1: Cell lines and related biological information were downloaded and extracted from EFO, CLO and Cellosaurus. Step 2: EFO cell lines were compared and aligned to CLO through three intermediate processes: (i) EFO-Cellosaurus-CLO Mapping, which performed cross-references and validations among the three cell line resources; (ii) Direct EFO-CLO Mapping, which compared and mapped EFO cell lines to all CLO cell lines; (iii) Identification of EFO Unique Cell Lines, which were immortalized permanent cell lines not available in CLO. The results of (i)-(iii) are summarized in Fig. 4. EFO cell lines with a foreign (non-EFO) namespace such as the Brenda Tissue Ontology (BTO), stem cell lines and primary cell lines were excluded from the mapping and remained in EFO. The overall mapping result is summarized in Table 1. Step 3: The mapped EFO-CLO cell lines and EFO unique immortalized permanent cell lines would be merged into or added to CLO. Step 4: The updated and synchronized CLO will later be imported into the EFO immortalized permanent cell line module.
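The exclusion rule in Step 2 can be sketched as a simple partition. This is an illustrative assumption rather than the authors' code: the "EFO_" prefix test, the "kind" field with its values, and the identifiers containing "TOY" are all hypothetical.

```python
def partition_efo_terms(terms):
    """Split EFO cell line terms into mapping candidates and excluded terms.

    terms: list of {"id": str, "kind": str}, where "kind" is assumed to be
    one of "immortalized", "stem" or "primary" (hypothetical field).
    Terms with a non-EFO namespace, stem cell lines and primary cell lines
    are excluded from the mapping and stay in EFO.
    """
    candidates, excluded = [], []
    for term in terms:
        foreign = not term["id"].startswith("EFO_")
        if foreign or term["kind"] in {"stem", "primary"}:
            excluded.append(term["id"])
        else:
            candidates.append(term["id"])
    return candidates, excluded


TERMS = [
    {"id": "EFO_0001200", "kind": "immortalized"},  # accession from the article
    {"id": "BTO_TOY_1", "kind": "immortalized"},    # hypothetical foreign term
    {"id": "EFO_TOY_3", "kind": "primary"},         # hypothetical primary line
]

candidates, excluded = partition_efo_terms(TERMS)
```

Only the candidates would then enter processes (i)-(iii).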
Fig. 2
Comparison of EFO and CLO cell line design patterns. The EFO cell line design pattern is colored in orange and the CLO pattern in blue. Green indicates cell line related information and design patterns shared by EFO and CLO.
Fig. 3
Example EFO-CLO cell line mapping recovered by Cellosaurus disease definition and semantic matching. There were three different disease definitions ("lung carcinoma" in EFO, "lung adenocarcinoma" in Cellosaurus and "adenocarcinoma" in CLO) for the cell line "NCI-H2087". A direct mapping based on cell line annotations, disease and species of origin would have missed this match if we had compared the EFO cell line disease information directly to CLO's information. Such discrepancies in cell line-disease annotation can be recovered through EFO-Cellosaurus-CLO disease semantic relations.
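The NCI-H2087 example can be sketched as a check over is-a relations. The sketch is an assumption, not the published method: the toy hierarchy `PARENTS` and the function names are hypothetical, and a real pipeline would read these relations from a disease ontology. Two terms are treated as related when one is an ancestor of the other, and the Cellosaurus term acts as a bridge between the EFO and CLO terms.

```python
def ancestors(term, parents):
    """Return the term plus everything reachable through is-a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def related(a, b, parents):
    """True if one term is the other or an ancestor of it."""
    return a in ancestors(b, parents) or b in ancestors(a, parents)


def bridged_match(efo_disease, cellosaurus_disease, clo_disease, parents):
    """True if the Cellosaurus term is related to both other terms."""
    return (related(efo_disease, cellosaurus_disease, parents)
            and related(cellosaurus_disease, clo_disease, parents))


# Toy is-a hierarchy (child -> parents), assumed for illustration only.
PARENTS = {
    "lung adenocarcinoma": ["lung carcinoma", "adenocarcinoma"],
    "lung carcinoma": ["carcinoma"],
    "adenocarcinoma": ["carcinoma"],
}

direct = related("lung carcinoma", "adenocarcinoma", PARENTS)
bridged = bridged_match(
    "lung carcinoma", "lung adenocarcinoma", "adenocarcinoma", PARENTS
)
```

Under this toy hierarchy the direct EFO-CLO comparison fails, because neither term subsumes the other, while the comparison bridged through the Cellosaurus term succeeds.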
Fig. 4
Mapping results of the three intermediate processes (i)-(iii) in Step 2 of the overall workflow. (i) EFO-Cellosaurus-CLO Mapping is shown with diamond pattern. (ii) Direct EFO-CLO Mapping is the region with diamond pattern. (iii) Identify EFO Unique Cell Lines is the shaded region overlapping with (ii). All the unmatched EFO cell lines from (ii) were processed in (iii) to identify EFO unique immortalized permanent cell lines that would be added to CLO.
Fig. 5
MCF 10A design patterns in EFO and CLO. The EFO cell line design pattern was colored in orange and CLO in blue. The white text inside each box showed the aligned cell line “MCF 10A” (EFO accession: EFO_0001200; CLO accession: CLO_0007599) with different axiomatized biological information. Additionally, “MCF 10A” cell line cell was defined as “subclass of” CLO “immortal human breast epithelial cell line cell”, which added further biological information after hierarchy inference

