Animals (Basel). 2024 Oct 12;14(20):2947.
doi: 10.3390/ani14202947.

Comparative Transcriptome Analysis of Bovine, Porcine, and Sheep Muscle Using Interpretable Machine Learning Models

Yaqiang Guo et al. Animals (Basel).

Abstract

The growth and development of muscle tissue play a pivotal role in the economic value and quality of meat from agricultural animals and attract close attention from breeders and researchers. The quality and palatability of muscle tissue directly determine the market competitiveness of meat products and the satisfaction of consumers. A thorough understanding of muscle growth, and the ability to manage it, is therefore essential for improving the economic efficiency and product quality of the meat industry. Despite this, systematic research on muscle development-related genes across species remains limited. This study addresses that gap through a cross-species muscle transcriptome analysis combined with interpretable machine learning models. We used a dataset of 275 publicly available transcriptomes from porcine (n = 113), bovine (n = 94), and ovine (n = 68) muscle tissue, covering ten muscle types such as the semimembranosus and longissimus dorsi. We trained nine machine learning models, including a Support Vector Classifier (SVC) and a Support Vector Machine (SVM), and applied the SHapley Additive exPlanations (SHAP) method to the muscle transcriptome data of cattle, pigs, and sheep. The best-performing model, adaptive boosting (AdaBoost), identified key genes potentially influencing muscle growth and development in the three species, termed SHAP genes. Among these, 41 genes (including NANOG, ADAMTS8, LHX3, and TLR9) were consistently expressed in all three species and were designated homologous genes. Specific candidate genes for cattle included SLC47A1, IGSF1, IRF4, EIF3F, CGAS, ZSWIM9, RROB1, and ABHD18; for pigs, DRP2 and COL12A1; and for sheep, only COL10A1. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the SHAP genes identified relevant pathways, including ether lipid metabolism, cortisol synthesis and secretion, and calcium signaling, indicating that these pathways play pivotal roles in muscle growth and development.
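
The SHAP step described above rests on Shapley values, which divide a model's output among its input features. The following is a minimal sketch of that idea only. It assumes a hypothetical three-gene scoring function with invented weights in place of the study's AdaBoost classifier, and it computes exact Shapley values by enumerating feature subsets rather than calling the shap library.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes feature f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


# Hypothetical toy "model": gene names come from the abstract, weights are invented.
WEIGHTS = {"NANOG": 2.0, "LHX3": 1.0, "TLR9": 0.5}


def toy_score(present):
    """Score a sample given the set of genes whose expression is available."""
    score = sum(WEIGHTS[g] for g in present)
    if "NANOG" in present and "LHX3" in present:
        score += 1.0  # invented interaction term, split evenly by Shapley values
    return score


phi = shapley_values(toy_score, list(WEIGHTS))
ranked = sorted(phi, key=phi.get, reverse=True)  # genes ordered by attribution
```

In this toy setting the attributions sum to the difference between the score with all genes and the score with none, which is the property that makes a ranking by Shapley value interpretable. Exact enumeration grows exponentially with the number of features, so it is only practical for illustration.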

Keywords: SHAP; comparative transcriptomics; key genes; machine learning; muscle growth and development.

Conflict of interest statement

This study was conducted in the absence of any commercial or financial relationships that might be perceived as potential conflicts of interest.

Figures

Figure 1
Preprocessing of muscle transcript sample expression data prior to model construction. (a) Principal component analysis plot used for filtering out low-expression genes. (b) Principal component analysis plot after the removal of low-expression genes, batch correction, and normalization. (c) Quality control of expression data and average gene expression analysis per 10% gradient.
Figure 2
Evaluation of the classification performance of nine machine learning models on the test set. (a) Support Vector Classifier (SVC), (b) Support Vector Machine (SVM), (c) deep neural network (DNN), (d) recurrent neural network (RNN), (e) logistic regression (LR), (f) decision tree (DT), (g) k-nearest neighbors (KNN), (h) Naive Bayes (NB), (i) AdaBoost. In each confusion matrix, cells on the diagonal, where the row and column species names match, are correct predictions; off-diagonal cells are misclassified species. The test set contains thirty transcriptome samples per species.
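
As an illustration of how the confusion matrices in Figure 2 are read, the sketch below builds one from species labels. The predictions are invented for the example and do not reproduce any model in the study; only the test-set size of thirty samples per species comes from the caption.

```python
from collections import Counter

SPECIES = ["cattle", "pig", "sheep"]


def confusion_matrix(actual, predicted, labels):
    """Rows are actual species, columns are predicted species."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical test set: 30 samples per species; predictions are invented.
actual = ["cattle"] * 30 + ["pig"] * 30 + ["sheep"] * 30
predicted = (
    ["cattle"] * 28 + ["sheep"] * 2      # two cattle samples misread as sheep
    + ["pig"] * 30                       # all pig samples correct
    + ["sheep"] * 27 + ["cattle"] * 3    # three sheep samples misread as cattle
)

matrix = confusion_matrix(actual, predicted, SPECIES)
```

Each row of `matrix` sums to thirty, and the diagonal holds the correctly classified samples for each species.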
Figure 3
Analysis of SHAP genes in three species: cattle, pig, and sheep. (a) Venn diagram of the distribution of SHAP genes among the species. (b) Bar graph of the expression levels of the 41 homologous genes in the three species, with gene names on the horizontal axis and expression levels on the vertical axis. (c) Correlation analysis among the 41 genes in the bovine muscle transcriptome, with gene names on both axes. (d) Correlation analysis among the 41 genes in the porcine muscle transcriptome. (e) Correlation analysis among the 41 genes in the sheep muscle transcriptome.
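
The Venn partition in panel (a) amounts to set intersection and set difference. The sketch below uses small toy sets assembled from gene names in the abstract; the membership of each toy set is an assumption for illustration and is far smaller than the study's actual SHAP gene lists.

```python
# Toy SHAP gene sets; names come from the abstract, set membership is illustrative.
shared = {"NANOG", "ADAMTS8", "LHX3", "TLR9"}
shap_genes = {
    "cattle": shared | {"SLC47A1", "IGSF1", "IRF4"},
    "pig": shared | {"DRP2", "COL12A1"},
    "sheep": shared | {"COL10A1"},
}


def venn_partition(gene_sets):
    """Return genes common to every species and genes unique to each species."""
    common = set.intersection(*gene_sets.values())
    unique = {}
    for species, genes in gene_sets.items():
        # Union of the gene sets of all other species.
        others = set().union(*(g for s, g in gene_sets.items() if s != species))
        unique[species] = genes - others
    return common, unique


common, unique = venn_partition(shap_genes)
```

Here `common` plays the role of the homologous genes and `unique` holds the species-specific candidates.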
Figure 4
Intraspecies mean expression and interspecies correlation analysis of bovine, porcine, and sheep-specific SHAP genes. (a) Analysis of the average expression of bovine muscle-specific SHAP genes. (b) Analysis of the average expression of pig muscle-specific SHAP genes. (c) Analysis of the average expression of sheep muscle-specific SHAP genes. (d) Analysis of the average expression of SHAP genes specific to pigs and sheep. (e) Analysis of the correlation among SHAP genes across bovine, porcine, and sheep species.
Figure 5
KEGG and GO analyses of the 41 homologous genes identified with the SHAP interpretability method across cattle, pigs, and sheep. (a) KEGG enrichment analysis of the 41 homologous genes. (b) GO analysis of the same genes.
Figure 6
WGCNA across the three species: cattle, pigs, and sheep. (a) Eight modules retained after the removal of low-quality genes and samples. (b) The number of genes in each module and its percentage of the total number of genes. (c) Heatmap of the correlation between each species (cattle, pig, sheep) and the eight modules after species was added as a phenotype.
Figure 7
PPI network analysis of the 49, 49, and 48 SHAP genes in the muscle of cattle, pigs, and sheep, respectively. (a) PPI network map of 15 key genes among the 49 SHAP genes in cattle. (b) PPI network map of 12 key genes among the 49 SHAP genes in pigs. (c) PPI network map of 8 key genes among the 48 SHAP genes in sheep.
