Comput Toxicol. 2018 Feb;5:25-30.
doi: 10.1016/j.comtox.2017.04.001. Epub 2017 Apr 28.

Species translatable blood gene signature as a marker of exposure to smoking: computational approaches of the top ranked teams in the sbv IMPROVER Systems Toxicology challenge

Ömer Sinan Saraç et al. Comput Toxicol. 2018 Feb.

Abstract

Crowdsourcing has been used to address computational challenges in systems biology and assess translation of findings across species. Sub-challenge 2 of the sbv IMPROVER Systems Toxicology Challenge was designed to determine whether a common set of genes can be used to identify exposure to cigarette smoke in both human and mouse. Participating teams used a training set of human and mouse blood gene expression data to derive parsimonious models (up to 40 genes) that classify subjects into exposure groups: smokers, former smokers, and never-smokers. Teams were ranked based on two classification performance metrics evaluated on a blinded test dataset. Prediction of current exposure to cigarette smoke in human and mouse by a common prediction model was achieved by the top ranked team (Team 219) with 89% balanced accuracy (BAC), while past exposure was predicted with only 57% BAC. The prediction model of the top ranked team was a random forest classifier trained on sets of genes that appeared best for each species separately with no overlap between species. By contrast, Team 264, ranked second (tied with Team 250), selected genes that were simultaneously predictive in both species and achieved 80% and 59% BAC when predicting current and past exposure, respectively. These performance values were lower than the 96.5% and 61% BAC estimates for current and past exposure, respectively, obtained by Team 264 (top ranked in sub-challenge 1) when using only human data. Unlike past exposure, current exposure to cigarette smoke can be accurately assessed in both human and mouse with a common prediction model based on blood mRNAs. However, requiring a common gene signature to be predictive in both species resulted in a substantial decrease in balanced accuracy for prediction of current exposure to cigarette smoke (from 96.5% to 80%), suggesting species-specific responses exist.

Keywords: Systems toxicology; computational challenge; predictive modeling; smoking biomarker; species-translatable gene signature.
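The abstract reports performance as balanced accuracy (BAC). As a minimal sketch, assuming the standard two-class definition (the mean of sensitivity and specificity), the metric can be computed as below. The function name and the labels are hypothetical and are not taken from the challenge data.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of positives recovered
    specificity = tn / (tn + fp)  # fraction of negatives recovered
    return (sensitivity + specificity) / 2


# Hypothetical labels: 1 = smoker, 0 = non-current smoker.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

bac = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case plain accuracy is 0.9 while BAC is 0.75, because BAC weights the minority class equally. That is one reason a balanced metric suits class-imbalanced exposure groups.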

Figures

Figure 1. Classification performance of the six teams with valid submissions in the sbv IMPROVER Systems Toxicology sub-challenge 2
Data shown represent the ranks (1–6; lower is better) for two prediction performance metrics (area under the precision-recall curve, AUPR, and Matthews correlation coefficient, MCC) in two classification tasks (smoker vs non-current smoker and former smoker vs never-smoker). The final team ranking is based on the sum of the four individual ranks.
Figure 2. Classification confidence values for the top three teams in the sbv IMPROVER Systems Toxicology sub-challenge 2
Data shown represent the confidence (0.0–1.0) that blood gene expression profiles belonged to a smoker (left) or former smoker (right). Distribution boxplots are shown by actual smoking status. Thick horizontal lines in the boxes represent median values, while the boxes span the first to third quartiles.
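The rank-sum scheme described in the Figure 1 caption (two metrics in two tasks, ranked per team and summed) can be sketched as follows. This is an illustration only: the team labels and scores are invented, and giving tied teams the average of their ranks is an assumption, since the page does not state how ties within a metric were handled.

```python
def rank_descending(scores):
    """Rank teams by score (1 = best); tied teams share the average rank (assumption)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of teams tied with the team at position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum(metric_tables):
    """Sum each team's rank over all metric/task tables; a lower total is better."""
    totals = {}
    for scores in metric_tables:
        for team, rank in rank_descending(scores).items():
            totals[team] = totals.get(team, 0) + rank
    return totals


# Hypothetical scores for three made-up teams (higher score = better).
tables = [
    {"A": 0.90, "B": 0.80, "C": 0.70},  # AUPR, smoker vs non-current smoker
    {"A": 0.60, "B": 0.70, "C": 0.50},  # MCC, smoker vs non-current smoker
    {"A": 0.55, "B": 0.50, "C": 0.60},  # AUPR, former smoker vs never-smoker
    {"A": 0.20, "B": 0.10, "C": 0.10},  # MCC, former smoker vs never-smoker
]

totals = rank_sum(tables)
final_order = sorted(totals, key=totals.get)  # best team first
```

Summing ranks rather than raw scores keeps AUPR and MCC, which are on different scales, from dominating one another.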
