Front Genet. 2021 Feb 16;12:619857.
doi: 10.3389/fgene.2021.619857. eCollection 2021.

Identification of Predictor Genes for Feed Efficiency in Beef Cattle by Applying Machine Learning Methods to Multi-Tissue Transcriptome Data

Weihao Chen et al. Front Genet.

Abstract

Machine learning (ML) methods have shown promising results in identifying genes when applied to large transcriptome datasets. However, no attempt has been made to compare the performance of combinations of different ML methods in predicting high feed efficiency (HFE) and low feed efficiency (LFE) animals. In this study, using RNA sequencing data from five tissues (adrenal gland, hypothalamus, liver, skeletal muscle, and pituitary) of nine HFE and nine LFE Nellore bulls, we evaluated the prediction accuracies of five analytical methods in classifying FE animals. These included two conventional methods for differential gene expression (DGE) analysis (t-test and edgeR) as benchmarks, and three ML methods: Random Forests (RF), Extreme Gradient Boosting (XGBoost), and a combination of RF and XGBoost (RX). The utility of the subset of candidate genes selected by each method for classifying FE animals was assessed with a support vector machine (SVM). Among all methods, the smallest subset of genes (117), identified by RX, outperformed those chosen by t-test, edgeR, RF, or XGBoost in classification accuracy. Gene co-expression network analysis confirmed that these genes interact with one another and that their relevance within the network is related to their ML-based prediction ranking. The results demonstrate the potential of applying a combination of ML methods to large transcriptome datasets to identify biologically important genes for accurately classifying FE animals.
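
To make the t-test benchmark concrete, the following is a minimal sketch of ranking genes by a two-sample t statistic between HFE and LFE groups. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which t-test variant was used, so Welch's unequal-variance form is assumed here, and the function names (`welch_t`, `rank_genes`), gene names, and expression values are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances).

    Assumption: the abstract only says "t-test"; Welch's form is one common choice.
    """
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_genes(expr_hfe, expr_lfe):
    """Return gene names sorted by |t|, largest first.

    Both arguments map a gene name to a list of expression values,
    one value per animal in that group.
    """
    scores = {gene: welch_t(expr_hfe[gene], expr_lfe[gene]) for gene in expr_hfe}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)


# Made-up expression values for two hypothetical genes (not data from the study).
hfe = {"geneA": [5.1, 5.3, 5.0, 5.2], "geneB": [2.0, 2.4, 1.9, 2.2]}
lfe = {"geneA": [3.0, 3.2, 2.9, 3.1], "geneB": [2.1, 2.0, 2.3, 2.2]}

ranking = rank_genes(hfe, lfe)
print(ranking)
```

In this toy input, geneA separates the two groups far more clearly than geneB, so it ranks first. A top-ranked subset from such a list is the kind of candidate gene set that could then be passed to a classifier for evaluation.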

Keywords: Bos indicus; Extreme Gradient Boosting; RNA-seq; Random Forest; co-expression network; residual feed intake; support vector machine.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Heatmap of cluster analysis using DGE identified by the RX in muscle (A) and pituitary gland (B). H refers to HFE bulls and L refers to LFE bulls.
FIGURE 2
Co-expression network of predictor genes identified by the RX in adrenal gland (A), hypothalamus (B), liver (C), muscle (D), and pituitary (E). Darker (blue) and larger circles indicate genes with stronger control over the network (higher betweenness value).
FIGURE 3
Co-expression networks in LFE (A) and HFE (B). Colors indicate the tissue of maximum expression: yellow represents liver, green represents muscle, orange represents pituitary, purple represents hypothalamus, and blue represents adrenal gland. The results are based on the genes selected by the RX.
FIGURE 4
Connections of each shared gene in LFE and HFE. The results are based on the genes selected by the RX.
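
The betweenness value mentioned in the Figure 2 caption can be illustrated with a short sketch. It assumes an unweighted, undirected network and uses Brandes' algorithm; the page does not state how the authors computed betweenness, so this is only one plausible reading. The function name `betweenness` and the gene identifiers are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality for an unweighted, undirected graph.

    `adj` maps each node to a list of its neighbours (every edge listed in
    both directions). Uses Brandes' algorithm: one BFS per source node,
    then a backward pass that accumulates pair dependencies.
    """
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        order = []                         # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}       # predecessors on shortest paths
        n_paths = {v: 0 for v in adj}      # number of shortest paths from source
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:            # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        while order:
            w = order.pop()                # farthest nodes first
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in centrality.items()}


# Toy three-gene chain g1 - g2 - g3 (hypothetical genes, not from the study).
network = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2"]}
scores = betweenness(network)
print(scores)
```

In the toy chain, only g2 lies on a shortest path between two other genes, so it receives the only non-zero score. In a figure like Figure 2, such a gene would be drawn as a larger, darker circle.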
