Identifying multiple changepoints in heterogeneous binary data with an application to molecular genetics
- PMID: 15475416
- DOI: 10.1093/biostatistics/kxh005
Abstract
Identifying changepoints is an important problem in molecular genetics. Our motivating example is from cancer genetics where interest focuses on identifying areas of a chromosome with an increased likelihood of containing a tumor suppressor gene. Loss of heterozygosity (LOH) is a binary measure of allelic loss in which abrupt changes in LOH frequency along the chromosome may identify boundaries indicative of a region containing a tumor suppressor gene. Our interest is in testing for the presence of multiple changepoints in order to identify regions of increased LOH frequency. A complicating factor is the substantial heterogeneity in LOH frequency across patients, where some patients have a very high LOH frequency while others have a low frequency. We develop a procedure for identifying multiple changepoints in heterogeneous binary data. We propose both approximate and full maximum-likelihood approaches and compare these two approaches with a naive approach in which we ignore the heterogeneity in the binary data. The methodology is used to estimate the pattern in LOH frequency on chromosome 13 in esophageal cancer patients and to isolate an area of inflated LOH frequency on chromosome 13 which may contain a tumor suppressor gene. Using simulations, we show that our approach works well and that it is robust to departures from some key modeling assumptions.
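To make the idea of a naive analysis concrete, the following is a minimal sketch of a single-changepoint scan that pools all patients and ignores between-patient heterogeneity. It is an illustration only, not the authors' procedure: the function names, the 0/1 data layout (one row per patient, one column per ordered marker), and the use of a profile Bernoulli likelihood are all assumptions made for this example.

```python
import math


def bernoulli_loglik(successes, n):
    """Maximized Bernoulli log-likelihood for `successes` ones in `n` trials."""
    if n == 0 or successes == 0 or successes == n:
        return 0.0  # the MLE is 0 or 1 (or no data), so the log-likelihood is 0
    p = successes / n
    return successes * math.log(p) + (n - successes) * math.log(1 - p)


def naive_single_changepoint(data):
    """Scan for one changepoint in pooled binary data (hypothetical helper).

    data: list of rows (patients), each a list of 0/1 LOH indicators
          over markers ordered along the chromosome.
    Returns (k, lr_stat): markers 0..k-1 form the first segment, and
    lr_stat is twice the log-likelihood gain over a no-change model.
    """
    n_patients = len(data)
    n_markers = len(data[0])
    # Pool across patients: this is the step that ignores heterogeneity.
    col_sums = [sum(row[j] for row in data) for j in range(n_markers)]
    total = sum(col_sums)
    null_ll = bernoulli_loglik(total, n_patients * n_markers)

    best_k, best_ll = None, -math.inf
    left = 0
    for k in range(1, n_markers):
        left += col_sums[k - 1]
        ll = (bernoulli_loglik(left, n_patients * k)
              + bernoulli_loglik(total - left, n_patients * (n_markers - k)))
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k, 2 * (best_ll - null_ll)
```

The maximized statistic does not follow a standard chi-squared distribution, so in practice its significance would need to be assessed by simulation or permutation. Because every patient is assumed to share the same LOH frequency within a segment, this sketch corresponds only to the kind of naive baseline the abstract contrasts with the approaches that model heterogeneity.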