Genet Sel Evol. 2019 Mar 13;51(1):10.
doi: 10.1186/s12711-019-0453-y.

Machine learning applied to transcriptomic data to identify genes associated with feed efficiency in pigs

Miriam Piles et al. Genet Sel Evol. 2019.

Abstract

Background: To date, the molecular mechanisms that underlie residual feed intake (RFI) in pigs are unknown. Results from different genome-wide association studies and gene expression analyses are not always consistent. The aim of this research was to use machine learning to identify genes associated with feed efficiency (FE) using transcriptomic (RNA-Seq) data from pigs that are phenotypically extreme for RFI.

Methods: RFI was computed from a within-sex regression on mean metabolic body weight, average daily gain, and average backfat gain. RNA-Seq analyses were performed on liver and duodenum tissue from 32 high and 33 low RFI pigs collected at 153 d of age. Machine-learning algorithms were used to predict RFI class based on gene expression levels in liver and duodenum after adjusting for batch effects. Genes were ranked according to their contribution to the classification using the permutation accuracy importance score in an unbiased random forest (RF) algorithm based on conditional inference. Support vector machine, RF, elastic net (ENET) and nearest shrunken centroid algorithms were tested using different subsets of the top-ranked genes. Nested resampling for hyperparameter tuning was implemented with tenfold cross-validation in both the outer and inner loops.
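
As an illustration of the first step in the Methods, the sketch below computes RFI as the residual of an ordinary least-squares regression fitted separately within each sex. It is a minimal sketch, not the authors' code. The abstract does not name the response variable, so daily feed intake (`dfi`) is an assumption based on the usual definition of RFI. The field names `sex`, `dfi`, `mbw`, `adg` and `abg` and all function names are hypothetical.

```python
from collections import defaultdict


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_residuals(y, covariates):
    """Residuals of an OLS fit of y on the covariates plus an intercept."""
    design = [[1.0] + list(row) for row in covariates]
    p = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(design, y)]


def residual_feed_intake(records):
    """Return one RFI value per record, in input order.

    Each record is a dict with hypothetical keys: sex, dfi (assumed response),
    mbw (mean metabolic body weight), adg (average daily gain) and
    abg (average backfat gain). One regression is fitted per sex.
    """
    by_sex = defaultdict(list)
    for idx, rec in enumerate(records):
        by_sex[rec["sex"]].append(idx)
    rfi = [None] * len(records)
    for idxs in by_sex.values():
        y = [records[i]["dfi"] for i in idxs]
        x = [[records[i]["mbw"], records[i]["adg"], records[i]["abg"]] for i in idxs]
        for i, res in zip(idxs, ols_residuals(y, x)):
            rfi[i] = res
    return rfi
```

Under this definition, animals with a negative residual eat less than predicted from their growth and body composition (low RFI, more efficient), and animals with a positive residual eat more (high RFI).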

Results: The best classification was obtained with ENET using the expression of 200 genes in liver [area under the receiver operating characteristic curve (AUROC): 0.85; accuracy: 0.78] and 100 genes in duodenum (AUROC: 0.76; accuracy: 0.69). Canonical pathways and candidate genes that were previously reported as associated with FE in several species were identified. In duodenum, the most notable were the NRF2-mediated oxidative stress response and aldosterone signalling in epithelial cells pathways and the DNAJC6, DNAJC1, MAPK8 and PRKD3 genes. In liver, the most notable were the melatonin degradation II, PPARα/RXRα activation and GPCR-mediated nutrient sensing in enteroendocrine cells pathways and the SMOX, IL4I1, PRKAR2B, CLOCK and CCK genes.
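
The two metrics reported above can be computed from predicted class scores as in the following sketch. It is a generic illustration, not the authors' evaluation code. It assumes labels coded 1 (for example, high RFI) and 0 (low RFI), scores where larger means more likely class 1, and a 0.5 decision threshold for accuracy. The function names and the toy data are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == label for p, label in zip(predicted, labels)) / len(labels)


# Toy data (made up for illustration only).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.4, 0.1]
print(round(auroc(toy_labels, toy_scores), 3))     # 0.778
print(round(accuracy(toy_labels, toy_scores), 3))  # 0.667
```

AUROC depends only on how the scores rank the samples, whereas accuracy depends on the chosen threshold, which is one reason the two values can differ for the same classifier.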

Conclusions: Machine-learning (ML) algorithms applied to RNA-Seq expression data performed well in classifying pigs into high or low RFI groups. Classification was better with gene expression data from liver than from duodenum. Genes associated with FE in liver and duodenum tissue that can be used as predictive biomarkers for this trait were identified.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Least square means and confidence intervals for residual feed intake (RFI) by sex and RFI class (low and high)
Fig. 2
Principal components 1 (PC1) and 2 (PC2) of RNA-seq expression data from liver of animals under different feeding regimes (FR class) and of different sex, before (a) and after (b) adjusting for PC1. VarExp: percentage of total variance explained by the principal component
Fig. 3
Principal components 1 (PC1) and 2 (PC2) of RNA-seq expression data from duodenum of animals under different feeding regimes (FR class) and of different sex, before (a) and after (b) adjusting for PC1. VarExp: percentage of total variance explained by the principal component
Fig. 4
Boxplot of AUROC of the classification of pigs on RFI in 10 test sets (from tenfold cross-validations). Classification was based on liver RNA-Seq expression data corresponding to different subsets of genes (50, 75, 100 and 125), either raw or pre-corrected by batch effects (suffix “pcs”), and was performed using support vector machine (SVM), elastic net (ENET), nearest shrunken centroids (PAM) and random forest (RF) algorithms
Fig. 5
Boxplot of AUROC of the classification of pigs on RFI in 10 test sets (from tenfold cross-validations). Classification was based on liver RNA-Seq expression data corresponding to different subsets of genes (50, 75, 100, 125, 150, 200, 250, 300, 350 and 400), pre-corrected by batch effects, and was performed using support vector machine (SVM), elastic net (ENET), nearest shrunken centroids (PAM) and random forest (RF) algorithms
Fig. 6
Boxplot of AUROC of the classification of pigs on RFI in 10 test sets (from tenfold cross-validations). Classification was based on duodenum RNA-Seq expression data corresponding to different subsets of genes (50, 75, 100 and 125), either raw or pre-corrected by batch effects (suffix “pcs”), and was performed using support vector machine (SVM), elastic net (ENET), nearest shrunken centroids (PAM) and random forest (RF) algorithms
Fig. 7
Importance of the 40 top genes contributing to the classification of samples into the high or low RFI class. a Liver. b Duodenum
