J Am Med Inform Assoc. 2013 Jul-Aug;20(4):603-12.
doi: 10.1136/amiajnl-2012-001574. Epub 2013 Feb 26.

Role of genetic heterogeneity and epistasis in bladder cancer susceptibility and outcome: a learning classifier system approach

Ryan John Urbanowicz et al. J Am Med Inform Assoc. 2013 Jul-Aug.

Abstract

Background and objective: Detecting complex patterns of association between genetic or environmental risk factors and disease risk has become an important target for epidemiological research. In particular, strategies that provide multifactor interactions or heterogeneous patterns of association can offer new insights into association studies for which traditional analytic tools have had limited success.

Materials and methods: To concurrently examine these phenomena, previous work has successfully considered the application of learning classifier systems (LCSs), a flexible class of evolutionary algorithms that distributes learned associations over a population of rules. Subsequent work dealt with the inherent problems of knowledge discovery and interpretation within these algorithms, allowing for the characterization of heterogeneous patterns of association. Whereas these previous advancements were evaluated using complex simulation studies, this study applied these collective works to a 'real-world' genetic epidemiology study of bladder cancer susceptibility.

Results and discussion: We replicated the identification of previously characterized factors that modify bladder cancer risk, namely single nucleotide polymorphisms from a DNA repair gene, and smoking. Furthermore, we identified potentially heterogeneous groups of subjects characterized by distinct patterns of association. Cox proportional hazard models comparing clinical outcome variables between the cases of the two largest groups yielded a significant, meaningful difference in survival time in years (survivorship). A marginally significant difference in recurrence time was also noted. These results support the hypothesis that an LCS approach can offer greater insight into complex patterns of association.

Conclusions: This methodology appears to be well suited to the dissection of disease heterogeneity, a key component in the advancement of personalized medicine.

Keywords: Bladder cancer; Epistasis; Heterogeneity; Learning Classifier System; Smoking; XPD.

Figures

Figure 1
Outline of the AF-UCS (attribute feedback-sUpervised Classifier System) algorithm. (1) Learning occurs iteratively, focusing on a single training instance from the dataset at a time. (2) Training instance passed to the population of rules (P). (3) Match set (M) is formed, including any rule in (P) that has a condition matching the attribute states of the instance. (4) Correct set (C) is formed, including any rule in (M) which specifies the correct class of the instance. (5) If no rules are found for (C), randomly generate such a rule using the covering mechanism. (6) Update rule parameters in (M) and (C) (eg, rule fitness). (7) Use rules in (C) to update attribute tracking scores for current instance. (8) The genetic algorithm (GA) selects parent rules from (C) based on fitness and generates offspring rules which are added to (P). If attribute feedback is being used, the attribute tracking scores for the current instance are applied as weights to guide the GA. (9) Deletion mechanism removes rules from (P) based on fitness whenever the size of (P) is greater than the user-specified maximum population size.
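Steps (3) to (5) of the caption above can be illustrated with a minimal Python sketch. It is not the authors' AF-UCS implementation: the rule representation (a dict with a `condition` tuple and a `class` label), the function names, and the covering probability `p_wild` are all assumptions made for illustration. Parameter updates, attribute tracking, the GA, and deletion (steps 6 to 9) are omitted.

```python
import random

WILDCARD = "#"  # 'don't care' symbol: the rule generalizes over this attribute


def matches(condition, instance):
    """A rule matches when every specified attribute equals the instance's state."""
    return all(c == WILDCARD or c == x for c, x in zip(condition, instance))


def cover(instance, true_class, p_wild=0.5, rng=random):
    """Hypothetical covering: build a rule from the instance, generalizing
    each attribute with probability p_wild, so the rule is guaranteed to match."""
    condition = tuple(WILDCARD if rng.random() < p_wild else x for x in instance)
    return {"condition": condition, "class": true_class}


def form_sets(population, instance, true_class, rng=random):
    """Form the match set (M) and correct set (C) for one training instance."""
    match_set = [r for r in population if matches(r["condition"], instance)]
    correct_set = [r for r in match_set if r["class"] == true_class]
    if not correct_set:
        # Step (5): no correct rule exists, so generate one by covering.
        new_rule = cover(instance, true_class, rng=rng)
        population.append(new_rule)
        match_set.append(new_rule)
        correct_set.append(new_rule)
    return match_set, correct_set
```

Because a covered rule copies or generalizes the instance's own attribute states, it always matches that instance and always carries the correct class, which is why it can be added to (M) and (C) directly.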
Figure 2
Rule population visualizations. (A) Heat-map visualization of the evolved AF-UCS (attribute feedback-sUpervised Classifier System) rule population. Each row in the heat-map is 1 of 1000 rules comprising the population. Each column is one of the 10 attributes. Yellow indicates specification of a respective attribute within a rule, while blue indicates generalization (ie, ‘#’/‘don't care’). The attribute ‘male’ refers to gender. (B) Illustrates the co-occurrence network, appearing as a fully connected network before any filtering is applied. The diameter of a node is the SpS for that attribute, edges represent co-occurrence, and the thickness of an edge is the respective CoS. (C) The network after filtering out all CoSs that did not meet the significance cut-off point. CoS, co-occurrence statistic; SpS, specificity sum.
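As a rough illustration of the two quantities drawn in panels (B) and (C), the sketch below computes a specificity sum per attribute and a co-occurrence count per attribute pair. It assumes the simplest reading of the caption: SpS counts the rules that specify an attribute, and CoS counts the rules that specify both attributes of a pair. Any weighting the paper may apply (for example by rule numerosity) and the significance cut-off are not reproduced, and the function names are hypothetical.

```python
from itertools import combinations

WILDCARD = "#"


def specificity_sums(conditions, n_attrs):
    """SpS (assumed definition): number of rules specifying each attribute."""
    sps = [0] * n_attrs
    for cond in conditions:
        for i, state in enumerate(cond):
            if state != WILDCARD:
                sps[i] += 1
    return sps


def cooccurrence_counts(conditions, n_attrs):
    """CoS (assumed definition): number of rules specifying both attributes of a pair."""
    cos = {pair: 0 for pair in combinations(range(n_attrs), 2)}
    for cond in conditions:
        specified = [i for i, state in enumerate(cond) if state != WILDCARD]
        for pair in combinations(specified, 2):
            cos[pair] += 1
    return cos
```

In a network drawn from these values, each SpS would set a node's diameter and each CoS an edge's thickness; edges with a CoS of zero simply would not appear.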
Figure 3
Subgroup identification and analysis. (A) Heat-map of normalized AF-UCS (attribute feedback-sUpervised Classifier System) attribute tracking scores for entire bladder cancer dataset (three significant attributes). Each row in the heat-map is one of 914 instances comprising the dataset. Each column is one of three attributes. Yellow indicates higher normalized tracking scores, while blue indicates lower ones. Significant subject clusters are delineated by the blocks on the y axis labeled alphabetically. Owing to their small size, clusters F and G are not labeled, but can be seen between clusters A and B. Cluster G is adjacent to B, while cluster F is adjacent to A. In order to better highlight the attribute patterns underlying these clusters, the normalized attribute tracking scores are further scaled by instance using the scale feature in pvclust. (B–D) Kaplan–Meier plots comparing different clinical variables for clusters B and D. Plus signs in the curve indicate censoring.
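For readers unfamiliar with the Kaplan–Meier plots in panels (B–D), the following is a minimal sketch of the standard product-limit estimator with right-censoring. It uses toy inputs only, not the study's data, and the function name and input layout (`times`, plus `events` with 1 for an observed event and 0 for a censored subject) are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        n_at_risk -= removed
    return curve
```

Censored subjects lower the number at risk without producing a step in the curve, which corresponds to the plus signs marking censoring in the plots.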
