PheSom: a term frequency-based method for measuring human phenotype similarity on the basis of MeSH vocabulary
- PMID: 37496714
- PMCID: PMC10366691
- DOI: 10.3389/fgene.2023.1185790
Abstract
Background: Phenotype similarity calculation can help improve drug repurposing. In this study, based on the MeSH terms describing the phenotypes deposited in OMIM, we proposed a method, PheSom (Phenotype Similarity On MeSH), to measure the similarity between phenotypes. PheSom counted the number of overlapping MeSH terms between two phenotypes and then weighted every MeSH term within each phenotype according to its term frequency-inverse document frequency (FIDC). Phenotype-related genes were used to evaluate the method.
Results: A 7,739 × 7,739 similarity score matrix was obtained, and the number of phenotype pairs decreased sharply as the similarity score increased. In addition, the overlap rate of phenotype-related genes rose markedly as the similarity score between phenotypes increased, which supports the reliability of the method.
Conclusion: We anticipate that our method can be applied to identify novel therapeutic approaches for complex diseases.
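The abstract describes the scoring idea (overlapping MeSH terms, weighted by term frequency and inverse document frequency) but does not give the exact formula. The following is a minimal sketch of that idea, not the authors' implementation: it assumes cosine similarity over the weighted term vectors as the scoring function, and the phenotype identifiers (`P1`, `P2`, `P3`), their term lists, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(phenotypes):
    """Weight each MeSH term within each phenotype.

    phenotypes: dict mapping a phenotype id to a list of MeSH terms
    (a term may repeat). Returns dict: phenotype id -> {term: weight}.
    """
    n_docs = len(phenotypes)
    # Document frequency: number of phenotypes that mention each term.
    doc_freq = Counter()
    for terms in phenotypes.values():
        doc_freq.update(set(terms))

    weights = {}
    for pid, terms in phenotypes.items():
        counts = Counter(terms)
        total = sum(counts.values())
        weights[pid] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


def similarity(w_a, w_b):
    """Cosine similarity of two weighted term vectors (an assumed choice)."""
    shared = set(w_a) & set(w_b)  # only overlapping terms contribute
    dot = sum(w_a[t] * w_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in w_a.values()))
    norm_b = math.sqrt(sum(v * v for v in w_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data, for illustration only.
phenotypes = {
    "P1": ["Seizures", "Intellectual Disability", "Microcephaly"],
    "P2": ["Seizures", "Microcephaly", "Muscle Hypotonia"],
    "P3": ["Cardiomyopathies", "Arrhythmias, Cardiac"],
}
weights = tfidf_weights(phenotypes)
score_12 = similarity(weights["P1"], weights["P2"])  # two shared terms
score_13 = similarity(weights["P1"], weights["P3"])  # no shared terms
```

Applying `similarity` to every pair of phenotypes would yield a symmetric score matrix of the kind reported in the Results. In this toy data, phenotypes that share terms score above zero and phenotypes with no shared terms score zero.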
Keywords: FIDC; OMIM; MeSH; phenotype; similarity score.
Copyright © 2023 Liu, Gao, Peng, Fang and Wang.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.