BMC Bioinformatics. 2018 Jun 21;19(1):225.
doi: 10.1186/s12859-018-2224-0.

Epigenetic machine learning: utilizing DNA methylation patterns to predict spastic cerebral palsy

Erin L Crowgey et al. BMC Bioinformatics.

Abstract

Background: Spastic cerebral palsy (CP) is a leading cause of physical disability. Most people with spastic CP are born with it, but early diagnosis is challenging, and no current biomarker platform readily identifies affected individuals. The aim of this study was to evaluate epigenetic profiles as biomarkers for spastic CP. A novel analysis pipeline was employed to assess DNA methylation patterns between peripheral blood cells of adolescent subjects (14.9 ± 0.3 years old) with spastic CP and controls at single CpG site resolution.

Results: Significantly hypo- and hyper-methylated CpG sites associated with spastic CP were identified. Nonmetric multidimensional scaling fully discriminated the CP group from the controls. Machine learning based classification modeling indicated a high potential for a diagnostic model, and 252 sets of 40 or fewer CpG sites achieved near-perfect accuracy within our adolescent cohorts. A pilot test on significantly younger subjects (4.0 ± 1.5 years old) identified subjects with 73% accuracy.

Conclusions: Adolescent patients with spastic CP can be distinguished from a non-CP cohort based on DNA methylation patterns in peripheral blood cells. A clinical diagnostic test utilizing a panel of CpG sites may be possible using a simulated classification model. A pilot validation test on patients that were more than 10 years younger than the main adolescent cohorts indicated that distinguishing methylation patterns are present earlier in life. This study is the first to report an epigenetic assay capable of distinguishing a CP cohort.

Keywords: Cerebral palsy; Computational statistics; DNA methylation; Epigenetic biomarkers; Genomics.

Conflict of interest statement

Ethics approval and consent to participate

Twenty-two subjects with a diagnosis of spastic CP and 21 control subjects were enrolled in this IRB-approved study at the Nemours - Alfred I. duPont Hospital for Children after written informed parental consent and subject assent. Approval for the study and the assent and consent documents was provided by Nemours Institutional Review Board 1, which has full accreditation from the Association for the Accreditation of Human Research Protection Programs and operates under the Federal Wide Assurance identification number FWA00000293.

Competing interests

The technology and software platform used to assess DNA methylation patterns were developed by Adam Marsh in his capacity as a professor of Computational Biology and Bioinformatics at the University of Delaware. The intellectual property is licensed by Genome Profiling LLC from the University. Author Adam Marsh declares a financial interest in Genome Profiling LLC.

Authors Erin Crowgey, Karyn Robinson, Stephanie Yeager, and Robert Akins have no competing interests.

Figures

Fig. 1
Statistical Methylation Patterns. a Non-Metric multidimensional scaling to identify discriminating cytosine methylation patterns between CP and non-CP cohorts. The first two component axes were plotted to locate the individual subject points in a relative 2D plane. Each point represents the similarity position of a subject based on all potentially informative CpG sites (n = 61,278). CP = orange points; controls = green points. Ellipses represent 90% confidence intervals. The complete segregation of the two cohorts indicates that DNA methylation patterns fundamentally differ between the cohorts. b Comparison of differential methylation load by KEGG functional classification and domain structure. ∆ML (mean difference between CP and control groups) was calculated across the defined length of the gene body structure for six top-level KEGG Pathway Map Classifications: Cellular Processes, Human Diseases, Environmental Information Processing, Genetic Information Processing, Metabolism, and Organismal Systems. Positive ∆ML numbers indicate higher methylation in control subjects and negative numbers indicate higher methylation in CP subjects. Assessing ∆ML score demonstrated a prevalence of altered methylation in 5’ UTR regions for three of the hierarchical KEGG functional categories. Values plotted are means +/− SEM across the number of genes scored in each category
Fig. 2
CpG Statistical Comparisons Based on a Likelihood-Ratio-Test of a One Way ANOVA contrast. a Volcano plot with the frequency profile of p-values. Data from the 1.47 million CpG sites in common across all samples are plotted. The x-axis is the log value for the fold-change (ratio) of CP to non-CP CpG site methylation values and the y-axis is the log FDR value (gray = not significant, orange = p-value significant, and red = p-value after FDR significant). b Heatmap Clustering of the top 200 CpG sites selected based on statistical p-value. There were 6588 CpG sites that were significantly different (p < 0.05 after false-discovery rate correction). Hierarchical clustering based on % methylation was employed using the 200 CpG sites with the lowest p-values. Quantitative differences in CpG site methylation by diagnosis were apparent. Each row represents the score for a single CpG site across all subjects.
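The caption reports significance "after false-discovery rate correction" but does not name the procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure (an assumption, not confirmed by the caption), adjusted p-values could be computed as below. The function name and the toy p-values are illustrative only and are not data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study's FDR correction is BH; the caption does not say.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four hypothetical CpG sites.
toy_pvalues = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(toy_pvalues)
significant = [q < 0.05 for q in adjusted]
print(adjusted, significant)
```

A site would then be counted as significant when its adjusted value falls below 0.05, which mirrors the threshold quoted in the caption.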
Fig. 3
CpG Methylation Load Ideogram Comparing CP and Control Cohorts. Mean differences in CpG methylation scores (∆ML; control minus CP) were used to calculate a summation methylation load score at 1 Mbp intervals. ∆ML is presented as the inside track using a scatter plot to show higher methylation in controls (green), higher methylation in CP (red), and equivalent methylation in both (gray; abs|∆ML| is less than twice the average ∆ML for the whole genome). The particular “hotspots” that appear in chromosomes 9, 18, 19, and 22 could indicate allelic compositional differences and potential gene targets for future functional and validation studies. The chromosomal locations of the top 200 CpG sites are indicated in two rings by tick marks labeled with the gene name (or “NA” if there is no annotation) in which the CpG site is located. Those CpG sites that had significantly higher methylation levels in the controls are in green. Those CpG sites that had significantly higher methylation levels in the CP subjects are in red. The distribution of sites appears skewed toward chromosomes 11 to 22
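The summation described in this caption can be sketched as follows: per-site ∆ML (control minus CP) is summed within fixed 1 Mbp windows along each chromosome. This is a minimal illustration under stated assumptions; the record layout, the function name, and the toy values are hypothetical and do not come from the study's pipeline.

```python
from collections import defaultdict

BIN_SIZE = 1_000_000  # 1 Mbp interval, as stated in the caption


def methylation_load_by_bin(sites, bin_size=BIN_SIZE):
    """Sum per-site delta-ML (control minus CP) within fixed genomic bins.

    `sites` is an iterable of (chromosome, position, mean_control, mean_cp)
    tuples. This layout is a hypothetical one chosen for illustration.
    """
    load = defaultdict(float)
    for chrom, pos, mean_control, mean_cp in sites:
        # Integer division assigns each CpG position to its 1 Mbp window.
        load[(chrom, pos // bin_size)] += mean_control - mean_cp
    return dict(load)


# Toy data: four hypothetical CpG sites with mean methylation per cohort.
toy_sites = [
    ("chr9", 150_000, 0.80, 0.60),
    ("chr9", 900_000, 0.50, 0.55),
    ("chr9", 1_200_000, 0.30, 0.70),
    ("chr22", 2_500_000, 0.90, 0.40),
]
loads = methylation_load_by_bin(toy_sites)
for (chrom, bin_index), delta in sorted(loads.items()):
    print(f"{chrom} bin {bin_index}: {delta:+.2f}")
```

Under the caption's sign convention, a positive bin total indicates higher methylation in controls and a negative total indicates higher methylation in CP subjects.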
Fig. 4
Receiver Operator Curves for Model Classification Test. A bootstrap classification model was executed using a linear discriminant analysis (LDA) guided by a machine-learning algorithm. a A test group comprising young children (approx. 4 years old; n = 11) was used with discriminant scores from each LDA normalized to a center point of 5. The majority of model “votes” either < 5 or > 5 was used to classify each sample. Green and red marks indicate correct and incorrect identifications, respectively. b Receiver operator characteristic (ROC) curves for an iterative theoretical yield (blue dashed line) and the actual yield from the classification tests of the 1–5 yo group (green line). Here, overall accuracy was 73% with a sensitivity of 100%, specificity of 40%, and an area under the curve (AUC) of 0.691. The performance of this dynamic classification analysis suggests that there is high discrimination power that could be developed for diagnostic detection of spastic CP
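The caption gives accuracy, sensitivity, and specificity for n = 11 but not the underlying counts. One confusion matrix consistent with all three figures is 6 CP subjects, all correctly identified, and 5 controls, of which 2 were correctly identified. These counts are inferred here from the reported percentages and are not stated in the caption. The sketch below shows how the three metrics follow from such counts; the function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # fraction of all subjects correct
        "sensitivity": tp / (tp + fn),      # fraction of CP subjects detected
        "specificity": tn / (tn + fp),      # fraction of controls cleared
    }


# Inferred counts (assumption, not reported): 6 CP and 5 controls, n = 11.
metrics = classification_metrics(tp=6, fn=0, tn=2, fp=3)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts the accuracy is 8/11, which rounds to the 73% quoted in the caption, and the low specificity reflects three of five controls being classified as CP.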
