Biomed Inform Insights. 2009 Jan 1;2:11-18.
doi: 10.4137/bii.s2222.

Avoiding Pitfalls in the Statistical Analysis of Heterogeneous Tumors

David E Axelrod et al. Biomed Inform Insights.

Abstract

Information about tumors is usually obtained from a single assessment of a tumor sample, performed at some point in the course of the development and progression of the tumor, with patient characteristics serving as surrogates for natural history context. Differences between cells within individual tumors (intratumor heterogeneity) and between tumors of different patients (intertumor heterogeneity) may mean that a small sample is not representative of the tumor as a whole, particularly for solid tumors, which are the focus of this paper. This issue is of increasing importance as high-throughput technologies generate large multi-feature data sets in the areas of genomics, proteomics, and image analysis. Three potential pitfalls in statistical analysis are discussed (sampling, cut-points, and validation), and suggestions are made about how to avoid them.

Conflict of interest statement

The authors report no conflicts of interest.

Figures

Figure 1
Cut-points of a continuous distribution of breast cancer patients. Eighty patients with in situ duct carcinoma of the breast were ranked by a canonical variable derived by weighting 39 nuclear features of 200 cells per patient. Left panel: there is a continuous distribution of patients. Middle panel: one cut-point at the mean separates patients into two groups (Low and High). Right panel: two cut-points separate patients into three groups (Low, Intermediate, and High). The choice of the number of cut-points, and of their values, produces different groups of patients. Such discrete groups are useful for comparing patient outcomes, for instance by Kaplan-Meier survival analysis. However, the discrete groups are not distinct biological classes that can be recognized from the distribution of the canonical variable of nuclear features.
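
The effect described in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are simulated from a normal distribution as a hypothetical stand-in for the canonical variable (the patient data are not reproduced here), the helper `group_by_cut_points` is invented for this example, and placing the two cut-points at the tertiles is an assumption, since the caption does not say where they were placed.

```python
import random
import statistics


def group_by_cut_points(scores, cut_points, labels):
    """Assign each score to a labeled group defined by ascending cut-points.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut-points")
    groups = []
    for score in scores:
        # The group index is the number of cut-points at or below the score.
        index = sum(score >= cut for cut in cut_points)
        groups.append(labels[index])
    return groups


random.seed(0)
# Assumption: simulated continuous scores for 80 patients, not real data.
scores = [random.gauss(0.0, 1.0) for _ in range(80)]

# One cut-point at the mean gives two groups.
mean_cut = statistics.mean(scores)
two_groups = group_by_cut_points(scores, [mean_cut], ["Low", "High"])

# Two cut-points (assumed here to be the tertiles) give three groups.
tertile_cuts = statistics.quantiles(scores, n=3)
three_groups = group_by_cut_points(
    scores, tertile_cuts, ["Low", "Intermediate", "High"]
)

# Count patients whose label depends on which cut-point scheme was chosen.
moved = sum(1 for a, b in zip(two_groups, three_groups) if a != b)
print(f"{moved} of {len(scores)} patients change group between the two schemes")
```

The underlying scores are identical in both schemes; only the arbitrary choice of cut-points changes which patients are compared with which, which is why the resulting groups should not be read as distinct biological classes.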
