PeerJ. 2019 May 28:7:e7004.
doi: 10.7717/peerj.7004. eCollection 2019.

Fisher linear discriminant analysis for classification and prediction of genomic susceptibility to stomach and colorectal cancers based on six STR loci in a northern Chinese Han population

Shuhong Hao et al. PeerJ.

Abstract

Objective: Gastrointestinal cancer is the leading cause of cancer-related death worldwide. The aim of this study was to verify whether the genotype of six short tandem repeat (STR) loci (AR, Bat-25, D5S346, ER1, ER2, and FGA) is associated with the risk of gastric cancer (GC) and colorectal cancer (CRC), and to develop a model for early diagnosis and prediction of inherited genomic susceptibility to GC and CRC.

Methods: Alleles of six STR loci were determined from the peripheral blood of six colon cancer patients, five rectal cancer patients, eight GC patients, and 30 healthy controls. Fisher linear discriminant analysis (FDA) was used to establish a discriminant formula distinguishing GC and CRC patients from healthy controls. Leave-one-out cross-validation and receiver operating characteristic (ROC) curves were used to validate the accuracy of the formula. The relationship between STR status and immunohistochemical (IHC) markers and tumor markers was analyzed using multiple correspondence analysis.
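As a rough illustration of the validation scheme described in the Methods, the sketch below fits a two-class Fisher discriminant and scores it with leave-one-out cross-validation in pure Python. It is not the authors' code or formula. The two-feature data are synthetic toy values rather than STR genotypes, and the function names (`fit_fisher`, `predict`, `loocv`) are hypothetical.

```python
# Illustrative sketch only: Fisher discriminant + leave-one-out cross-validation.
# The data below are synthetic toy values, NOT the study's STR copy numbers.

def mean_vec(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def fit_fisher(X0, X1):
    """Return (w, c): direction w = Sw^-1 (m1 - m0) and the midpoint cutoff c."""
    m0, m1 = mean_vec(X0), mean_vec(X1)
    # Pooled within-class scatter matrix Sw (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((X0, m0), (X1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff halfway between the projected class means.
    c = (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1])) / 2
    return w, c

def predict(w, c, x):
    """Class 1 (patient) if the discriminant score exceeds the cutoff."""
    return 1 if w[0] * x[0] + w[1] * x[1] > c else 0

def loocv(X, y):
    """Hold out each sample once, refit on the rest, and classify the held-out one."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        X0 = [x for k, x in enumerate(X) if k != i and y[k] == 0]
        X1 = [x for k, x in enumerate(X) if k != i and y[k] == 1]
        w, c = fit_fisher(X0, X1)
        pred = predict(w, c, X[i])
        if y[i] == 1:
            tp += pred == 1
            fn += pred == 0
        else:
            tn += pred == 0
            fp += pred == 1
    return tp / (tp + fn), tn / (tn + fp)  # (sensitivity, specificity)

# Synthetic, well-separated toy groups: label 0 = control, label 1 = patient.
X = [(1, 2), (2, 1), (2, 3), (3, 2), (2, 2),
     (5, 6), (6, 5), (6, 7), (7, 6), (6, 6)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

sensitivity, specificity = loocv(X, y)
print(sensitivity, specificity)
```

Because each held-out sample plays no part in fitting the discriminant that classifies it, the resulting sensitivity and specificity are less optimistic than values computed on the training data, which matters for a sample as small as the 49 individuals in this study.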

Results: D5S346 was confirmed as a GC- and CRC-related STR locus. For the first time, we established a discriminant formula on the basis of the six STR loci, which was used to estimate a risk coefficient for GC and CRC. The model was statistically significant (Wilks' lambda = 0.471, χ² = 30.488, df = 13, and p = 0.004). Leave-one-out cross-validation showed that the sensitivity of the formula was 73.7% and the specificity was 76.7%. The area under the ROC curve (AUC) was 0.926, with a sensitivity of 73.7% and a specificity of 93.3%. STR status was associated with the expression of some IHC markers and the levels of some tumor markers.
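The reported statistics can be sanity-checked with a little arithmetic. The sketch below makes two assumptions that the abstract does not state: that the χ² value comes from Bartlett's approximation for Wilks' lambda, and that with two groups the 13 degrees of freedom correspond to 13 predictor terms. The patient and control counts behind each percentage are likewise inferred from the group sizes (19 patients, 30 controls), not quoted from the paper.

```python
import math

# Values reported in the abstract.
wilks_lambda = 0.471
df = 13
n = 6 + 5 + 8 + 30   # 19 patients + 30 healthy controls = 49 samples
g = 2                # two groups: patients vs. controls

# ASSUMPTION: Bartlett's chi-square approximation,
#   chi2 = -(n - 1 - (p + g) / 2) * ln(lambda),  with df = p * (g - 1).
p = df // (g - 1)    # inferred number of predictor terms
chi2 = -(n - 1 - (p + g) / 2) * math.log(wilks_lambda)
print(round(chi2, 2))

def pct(k, total):
    """Percentage rounded to one decimal place."""
    return round(100 * k / total, 1)

# Inferred whole-number counts that reproduce the reported percentages.
n_patients, n_controls = 19, 30
print(pct(14, n_patients))  # sensitivity (cross-validation and ROC cutoff)
print(pct(23, n_controls))  # specificity, leave-one-out cross-validation
print(pct(28, n_controls))  # specificity at the ROC cutoff
```

Under these assumptions the statistic comes out near 30.49, in line with the reported 30.488 once rounding of lambda is allowed for, and the three percentages correspond to 14 of 19 patients, 23 of 30 controls, and 28 of 30 controls.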

Conclusions: The results of this study complement clinical diagnostic criteria and present markers for early prediction of GC and CRC. This approach will aid in improving risk awareness of susceptible individuals and contribute to reducing the incidence of GC and CRC by prevention and early detection.

Keywords: Fisher linear discriminant analysis; Gastrointestinal cancer; Genomic susceptibility prediction; Molecular diagnosis; STR.

Conflict of interest statement

The authors declare there are no competing interests.

Figures

Figure 1. Diagram of the method of calculation of copy numbers, with the FGA locus as an example.

Figure 2. Example electropherograms of AR (A), Bat-25 (B), D5S346 (C), ER1 (D), ER2 (E), and FGA (F).
The numbers above each peak indicate the allele fragment length. Electropherograms containing one peak (A and B) represent homozygosity, whereas electropherograms containing two peaks (C–F) represent heterozygosity.

Figure 3. Copy number of AR (A), Bat-25 (B), D5S346 (C), ER1 (D), ER2 (E), and FGA (F) between GC and CRC patients and healthy controls.
“-S” represents the short repeat motif of STR alleles; “-L” represents the long repeat motif of STR alleles. GC, gastric cancer; CRC, colorectal cancer. Statistical significance is indicated by asterisks: *, p < 0.05; **, p < 0.01.

Figure 4. ROC curve of Cfunc.

Figure 5. Results of multiple correspondence analysis.
(A) The correlation between the copy numbers of STR loci and IHC markers in GC and CRC. (B) The correlation between the copy numbers of STR loci and tumor markers in GC and CRC. GC, gastric cancer; CRC, colorectal cancer.
