PLoS One. 2012;7(6):e38230.
doi: 10.1371/journal.pone.0038230. Epub 2012 Jun 5.

Visualization of genomic changes by segmented smoothing using an L0 penalty

Ralph C A Rippe et al. PLoS One. 2012.

Abstract

Copy number variations (CNV) and allelic imbalance in tumor tissue can show strong segmentation. Their graphical presentation can be enhanced by appropriate smoothing. Existing signal and scatterplot smoothers do not respect segmentation well. We present novel algorithms that use a penalty on the L0 norm of differences of neighboring values. Visualization is our main goal, but we also compare classification performance to that of VEGA.
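
As an illustration of the idea in the abstract, the following is a minimal sketch of smoothing with an approximate L0 penalty on first differences. It is not taken from the paper: it assumes the penalty is handled by iteratively reweighted least squares, where each squared difference is weighted by 1 / (difference^2 + beta^2), so that large jumps are penalized only weakly and small wiggles are flattened. The function name `l0_smooth`, the parameters `lam`, `beta`, `n_iter`, `tol` and their defaults are all hypothetical choices for this sketch.

```python
import numpy as np

def l0_smooth(y, lam=1.0, beta=1e-4, n_iter=50, tol=1e-6):
    """Approximate L0-penalized smoothing of a 1-D signal (illustrative sketch).

    Repeatedly solves (I + lam * D' V D) z = y, where D takes first
    differences and V holds weights 1 / (dz^2 + beta^2) computed from the
    previous solution. This weighting scheme is an assumption of the sketch.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), axis=0)          # (n-1) x n first-difference matrix
    z = y.copy()
    for _ in range(n_iter):
        # Small differences get large weights (flattened),
        # large differences get small weights (jumps survive).
        v = 1.0 / (np.diff(z) ** 2 + beta ** 2)
        A = np.eye(n) + lam * D.T @ (v[:, None] * D)
        z_new = np.linalg.solve(A, y)
        converged = np.max(np.abs(z_new - z)) < tol
        z = z_new
        if converged:
            break
    return z

# Usage: a noisy two-level signal should come out close to piecewise constant.
rng = np.random.default_rng(0)
signal = np.concatenate([np.zeros(50), np.ones(50)])
smoothed = l0_smooth(signal + 0.05 * rng.standard_normal(100), lam=1.0)
```

Dense matrices keep the sketch short; for genome-scale series a banded or sparse solver would be the natural replacement.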

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Illustrations of copy numbers and allelic ratio, expressed as logarithms, for healthy and tumor tissue.
Left panels: healthy tissue. Right panels: tumor tissue. Top row: copy numbers. Bottom row: allelic imbalance.
Figure 2. Illustration of smoothing with different norms (2, 1, 0) in the roughness penalty.
Top panel: L2 norm, the Whittaker smoother. Middle panel: L1 norm. Bottom panel: L0 norm. Thinner lines drawn with positive and negative offsets illustrate the effect of a non-optimal smoothing parameter λ. Top line: λ too large. Bottom line: λ too small.
Figure 3. Odd-even cross-validation for finding an optimal λ.
The selected λ is indicated in the top panel by the vertical broken line. The bottom panel shows the fit obtained with double the selected λ against the raw data. The doubling is needed to compensate for leaving out half of the data.
Figure 4. Illustration of convergence behavior in zero-norm smoothing with little noise.
The data are simulated (VEGA package) and contain relatively little noise. All panels, except the lower-right one, show intermediate solutions, at the iteration numbers as indicated in the titles of the panels. The lower right panel shows the largest absolute change in the solution at each iteration. The smoothing parameter is set to formula image.
Figure 5. Illustration of convergence behavior in zero-norm smoothing with moderate noise.
The data are simulated (VEGA package) and contain a relatively large amount of noise. All panels, except the lower-right one, show intermediate solutions, at the iteration numbers as indicated in the titles of the panels. The lower right panel shows the largest absolute change in the solution at each iteration. The smoothing parameter is set to formula image.
Figure 6. Comparing normal and segmented scatterplot smoothing.
Top panel shows the raw observations. Middle panel shows straightforward smoothing: no segmentation. Bottom panel shows segmented smoothing: clear segments.
Figure 7. ZEN smoothing of CNV in tumor data (sample GBM139.CEL).
Top panel: cross-validation profile and location of minimum (at broken vertical line). Bottom panel: data and fit, using formula image (double the value indicated by cross-validation, to correct for leaving out half of the data).
Figure 8. ZEN smoothing of log allelic ratio (sample GBM139.CEL).
The vertical broken lines indicate the segment boundaries, as computed from the adaptive weights in the penalty. The smoothing parameters (λ) are 0.01 for position and 0.5 for log allelic ratio.
Figure 9. Examples of smoothed CNV and allelic imbalance in clinical samples, using ZEN.
The first and third rows show CNV profiles; the second and fourth rows show the matching segmented allelic imbalance plots.
Figure 10. Examples of smoothed CNV in clinical samples, using CNAG software.
Panels show CNV profiles for the samples mentioned in the panel titles. The smoothed signals show unexpected jumps (GBM180) and unclear level overestimations (GBM203-2).
Figure 11. Histograms and estimated normal mixtures for the log allelic ratio.
Estimations are separate for each of the five segments that were derived from the scatterplot smoother in Figure 8.
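
The odd-even cross-validation described in the caption of Figure 3 can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes a plain quadratic (Whittaker-type) first-difference penalty instead of the L0 penalty, fits on the even-indexed points only (weight 0 on the others), and scores each candidate λ by the prediction error on the left-out points. The names `weighted_whittaker` and `odd_even_cv` and the grid of candidate values are hypothetical.

```python
import numpy as np

def weighted_whittaker(y, w, lam):
    """Solve (W + lam * D'D) z = W y; points with weight 0 are interpolated."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    D = np.diff(np.eye(y.size), axis=0)      # first-difference matrix
    A = np.diag(w) + lam * D.T @ D
    return np.linalg.solve(A, w * y)

def odd_even_cv(y, lambdas):
    """Return the lambda with the smallest left-out error, and all errors."""
    y = np.asarray(y, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)   # assumed strictly positive
    fit_mask = np.arange(y.size) % 2 == 0        # fit on even indices
    w = fit_mask.astype(float)
    errors = np.empty(lambdas.size)
    for k, lam in enumerate(lambdas):
        z = weighted_whittaker(y, w, lam)
        # Root-mean-square prediction error on the left-out (odd) points
        errors[k] = np.sqrt(np.mean((y[~fit_mask] - z[~fit_mask]) ** 2))
    return lambdas[int(np.argmin(errors))], errors

# Usage: pick lambda on half of the data, then double it for the full fit,
# following the doubling rule stated in the caption of Figure 3.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 80)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(80)
best_lam, cv_errors = odd_even_cv(y, np.logspace(-2, 4, 13))
final_fit = weighted_whittaker(y, np.ones_like(y), 2 * best_lam)
```

The same loop would apply to a segmenting smoother by swapping in a weighted version of that smoother for `weighted_whittaker`.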
