NAR Genom Bioinform. 2020 Aug 18;2(3):lqaa060.
doi: 10.1093/nargab/lqaa060. eCollection 2020 Sep.

Authentication, characterization and contamination detection of cell lines, xenografts and organoids by barcode deep NGS sequencing

Xiaobo Chen et al. NAR Genom Bioinform.

Abstract

Misidentification and contamination of biobank samples (e.g. cell lines) have plagued biomedical research. Short tandem repeat (STR) and single-nucleotide polymorphism (SNP) assays are widely used to authenticate biosamples and detect contamination, but their sensitivity is insufficient, at 5-10% and 3-5%, respectively. Here, we describe a deep NGS-based method with significantly higher sensitivity (≤1%). It can be used to authenticate human and mouse cell lines, xenografts and organoids. It can also reliably identify and quantify contamination of human cell line samples by even a small amount of other cell samples; detect and quantify species-specific components in human-mouse mixed samples (e.g. xenografts) with 0.1% sensitivity; detect mycoplasma contamination; and infer population structure and gender of human samples. By adopting DNA barcoding technology, we are able to profile 100-200 samples in a single run at a per-sample cost comparable to conventional STR assays, providing a truly high-throughput and low-cost assay for building and maintaining high-quality biobanks.

Figures

Figure 1.
Cell line authentication and sample genetic heterogeneity. (A) Genotype similarities for unrelated/mismatch, identical and closely related cell line pairs. Genotype similarities are calculated for both uncontaminated and contaminated cell lines. Therefore, contaminated cell lines have reduced genotype similarities to their uncontaminated counterparts. The lowest genotype similarity is 91.7% between identical pairs, for an uncontaminated A-875 to a contaminated A-875 with 16.7% of JEG-3. (B) Heterogeneity ratios in 118 uncontaminated cell lines, 220 PDX models and 31 PDXO models. (C) Heterogeneity ratio is positively correlated with mouse ratio in PDX models.
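As a rough illustration of the genotype similarity plotted in Figure 1A, the sketch below takes it to be the fraction of shared SNP sites whose genotype calls agree. That definition, the function name `genotype_similarity` and the toy site names are assumptions made for illustration; the abstract does not spell out the authors' formula.

```python
def normalize(genotype):
    """Sort alleles so that 'GA' and 'AG' compare as the same call."""
    return "".join(sorted(genotype))


def genotype_similarity(sample_a, sample_b):
    """Fraction of SNP sites called in both samples with matching genotypes.

    Assumed definition for illustration only, not the paper's formula.
    """
    shared = [site for site in sample_a if site in sample_b]
    if not shared:
        raise ValueError("no SNP sites in common")
    matches = sum(
        1 for site in shared
        if normalize(sample_a[site]) == normalize(sample_b[site])
    )
    return matches / len(shared)


# Hypothetical genotype calls at four made-up SNP sites.
pure = {"snp1": "AA", "snp2": "AG", "snp3": "GG", "snp4": "CC"}
mixed = {"snp1": "AA", "snp2": "GA", "snp3": "AG", "snp4": "CC"}

similarity = genotype_similarity(pure, mixed)
print(f"genotype similarity: {similarity:.2%}")
```

Under this reading, a contaminant shifts some homozygous calls to heterozygous ones, which is one way a contaminated sample could end up with a lower similarity to its pure counterpart.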
Figure 2.
The heterogeneity ratio can be used to detect and quantify contamination. (A–D) Serial mixes of cell lines MV-4-11 (MV411) and LNCaP clone FGC (LNCAPCLONEFGC) with cell ratios of 5%, 2.5%, 1.25% and 0.625% for the latter; (E) pure LNCaP clone FGC cell line; and (F) pure MV-4-11 cell line. Each tick above the horizontal axis represents an informative SNP site with corresponding SNP heterogeneity ratio. Probability density was estimated by assuming a two/three-component Gaussian mixture. Sample serial number is labeled in the top-right box with the major component cell line in parentheses. Sample heterogeneity ratio is shown underneath.
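The Gaussian-mixture density mentioned in this caption can be sketched with a plain expectation-maximization fit. The example below is a minimal two-component version run on synthetic per-SNP heterogeneity ratios; the data, the function names and the initialization are assumptions for illustration and are not taken from the authors' pipeline.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, iterations=200, min_sd=1e-3):
    """EM for a 1-D two-component Gaussian mixture.

    Returns (mean, sd, weight) tuples sorted by mean.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                      # start the components at the extremes
    spread = max((hi - lo) / 4.0, min_sd)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and sd of each component
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), min_sd)  # floor avoids a collapsing component
    return sorted(zip(means, sds, weights))


# Synthetic data (assumed): most sites show only background noise near 0.5%,
# while a minority of informative sites sit near 5%.
rng = random.Random(0)
ratios = [max(0.0, rng.gauss(0.005, 0.002)) for _ in range(80)]
ratios += [max(0.0, rng.gauss(0.05, 0.005)) for _ in range(20)]

components = fit_two_gaussians(ratios)
for mean, sd, weight in components:
    print(f"mean={mean:.4f} sd={sd:.4f} weight={weight:.2f}")
```

In this toy setup the component with the higher mean plays the role of the right-hand peak in the density plots; a three-component fit would follow the same pattern with one more set of parameters.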
Figure 3.
Contamination detection, contaminant inference and contamination ratio estimation. (A) Sample 19R58129 is MV-4-11 mixed with minor contaminating cell line LNCaP clone FGC (LNCAPCLONEFGC). LNCAPCLONEFGC was correctly identified as the contaminant (P-value = 5.01E−17) with a contamination ratio of 1.41%. LNCaP-C4-2 (C42) and LNCAPCLONEFGC were both derived from LNCaP and share high genetic identity (32). In the quantile–quantile plot, each dot is a reference cell line; theoretical and sample quantiles were calculated from a beta distribution fitted to genotype similarities between MV-4-11 and 1055 reference cell lines. The 99% confidence band is shaded. (B) Accuracy of inferring the contaminating second cell line in a cell line under different heterogeneity ratios. A total of 94 cell line samples with known contaminating second cell line were tested; samples were binned by heterogeneity ratio. (C) Cell line ‘G-292 clone A141B1’ had a sample heterogeneity ratio of 7.62% with a distinct right peak in the probability density of SNP heterogeneity ratios, indicating it was contaminated. (D) OCI-AML-2 was inferred as the contaminant (P-value = 1.58E−07) in cell line ‘G-292 clone A141B1’ with a contamination ratio of 6.21%. (E) Near-perfect correlation between estimated and known contamination ratios in simulated cell line mixtures. (F) High correlation between heterogeneity ratios and contamination ratios for cell line samples with known contamination.
Figure 4.
Estimation of mouse ratio in human–mouse mixtures. (A) Accurate estimation of mouse ratio by the deep NGS sequencing in a serial dilution of human–mouse DNA mixtures with mouse ratios of 90%, 80%, 70%, 50%, 30%, 20%, 10%, 7%, 5% and 0%. (B, C) Mouse ratios estimated in 220 PDX and 31 PDXO models by three approaches, assayed on the same sample for each model. (D) A quadratic relationship between mouse ratios estimated by the deep NGS sequencing and WES in 220 PDX models.
Figure 5.
Inferred population structures in PDX models. (A) Four hundred twenty-three PDXs derived from East Asian patients. (B) Six hundred thirty-four PDXs derived from Western patients. The three reference populations from the International HapMap Project are CHB, YRI and CEU (41).

References

    1. Editorial. Identity crisis. Nature. 2009; 457:935–936.
    2. American Type Culture Collection Standards Development Organization Workgroup ASN-0002. Cell line misidentification: the beginning of the end. Nat. Rev. Cancer. 2010; 10:441–448.
    3. Capes-Davis A., Reid Y.A., Kline M.C., Storts D.R., Strauss E., Dirks W.G., Drexler H.G., MacLeod R.A., Sykes G., Kohara A. et al. Match criteria for human cell line authentication: where do we draw the line. Int. J. Cancer. 2013; 132:2510–2519.
    4. Gartler S.M. Apparent HeLa cell contamination of human heteroploid cell lines. Nature. 1968; 217:750–751.
    5. Lacroix M. Persistent use of ‘false’ cell lines. Int. J. Cancer. 2008; 122:1–4.