Proc Natl Acad Sci U S A. 2015 Apr 7;112(14):4322-7.
doi: 10.1073/pnas.1503587112. Epub 2015 Mar 18.

Immunoglobulin transcript sequence and somatic hypermutation computation from unselected RNA-seq reads in chronic lymphocytic leukemia

James S Blachly et al. Proc Natl Acad Sci U S A.

Abstract

Immunoglobulins (Ig) are produced by B lymphocytes as secreted antibodies or as part of the B-cell receptor. The diversity of potential Ig transcripts is tremendous (>1 × 10^12), a result of hundreds of germ-line gene segments, random nucleotide incorporation during joining of gene segments into a complete transcript, and somatic hypermutation at individual nucleotides. This recombination and mutation process takes place in the maturing B cell and is responsible for the diversity of potential epitope recognition. Cancers arising from mature B cells are characterized by clonal production of Ig heavy (IGH@) and light chain transcripts, although whether the sequence has undergone somatic hypermutation depends on the maturation stage at which the neoplastic clone arose. Chronic lymphocytic leukemia (CLL) is the most common leukemia in adults and arises from a mature B cell with either mutated or unmutated IGH@ transcripts; the latter carry a worse prognosis, and mutation status is routinely assessed in the clinic. Currently, IGHV mutation status is assessed by Sanger sequencing the transcript and comparing it to known germ-line genes. In this paper, we demonstrate that complete IGH@ V-D-J sequences can be computed from unselected RNA-seq reads with results equal or superior to the clinical procedure: in the only discordant case, the clinical transcript was out-of-frame. Therefore, a single RNA-seq assay can simultaneously yield gene expression profile, SNP and mutation information, as well as IGHV mutation status, and may one day be performed as a general test to capture multidimensional clinically relevant data in CLL.

Keywords: B cells; CLL; RNA sequencing; immunoglobulin; somatic hypermutation.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Comparison of Ig-ID computational pipeline values and clinical laboratory methods using PCR amplification in the pilot set. Five patients were zero percent mutated by both methods. (A) Scatter plot with identity line, correcting for samples US-1422368 and US-1422309. Dashed lines at 2% represent standardized cutoff for the mutated/unmutated classifier. (B) Bland–Altman plot of the same data with a continuous line of zero difference and dashed lines for the estimated mean difference ± 2 SDs.
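The quantities drawn in panel B can be sketched in a few lines of Python. This is only an illustration of a Bland–Altman summary (mean difference and limits at ± 2 SDs) together with a 2% cutoff classifier; the function names are hypothetical, the input values are made up rather than taken from the paper, and treating a value exactly at the cutoff as mutated is an assumption the caption does not state.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (mean difference, lower limit, upper limit) for paired values.

    Limits are the mean difference -/+ 2 sample standard deviations,
    matching the dashed lines described for panel B.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - 2 * sd, bias + 2 * sd


def classify(percent_mutation, cutoff=2.0):
    """Label a sample by the 2% cutoff.

    Assumption: a value exactly at the cutoff counts as mutated.
    """
    return "mutated" if percent_mutation >= cutoff else "unmutated"


# Made-up percent-mutation values for three samples, for illustration only.
pipeline = [0.0, 5.0, 10.0]
clinical = [0.0, 4.0, 8.0]
bias, lower, upper = bland_altman(pipeline, clinical)
labels = [classify(p) for p in pipeline]
```

Two methods agree on the classifier whenever both values for a sample fall on the same side of the cutoff, even if the Bland–Altman difference for that sample is nonzero.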
Fig. 2.
Comparison of Sanger sequencing and Ig-ID. Percent mutation is ideally calculated over all nucleotides within the highlighted region. Amplification of clonal transcripts often requires framework region primers, so the entire V gene is not amplified and sequenced. As a side effect, the denominator in the identity calculation is smaller, which may inflate the percent mutation compared with analysis of the full-length transcript. In this case, the reported percent mutation was 10.2%, whereas the Ig-ID calculated percent mutation was 8.5%.
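The denominator effect described in this caption can be illustrated with a toy calculation. The sketch below assumes two pre-aligned, gap-free sequences of equal length and counts mismatches over a chosen window; `percent_mutation` is a hypothetical helper, not the Ig-ID implementation, and the sequences are invented for the example.

```python
def percent_mutation(query, germline, start=0, end=None):
    """Percent of mismatched positions over the window [start, end).

    Assumes the two sequences are already aligned with no gaps.
    """
    if len(query) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    end = len(germline) if end is None else end
    if not 0 <= start < end <= len(germline):
        raise ValueError("window must lie within the sequence")
    mismatches = sum(
        1 for q, g in zip(query[start:end], germline[start:end]) if q != g
    )
    return 100.0 * mismatches / (end - start)


# Toy 20-nt "germline" and a query with two substitutions, both located
# in the second half of the sequence (positions 12 and 17, 0-based).
germline = "ACGTACGTACGTACGTACGT"
query = "ACGTACGTACGTGCGTATGT"

full_length = percent_mutation(query, germline)        # 2 / 20 -> 10.0
partial = percent_mutation(query, germline, start=10)  # 2 / 10 -> 20.0
```

The same two substitutions yield a higher percentage when the window omits the unmutated 5′ portion, which is the direction of the discrepancy the caption reports.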
Fig. 3.
Comparison of Ig/BCR determination. (A) Sanger sequencing can span from V to J, but difficulties in amplification occasionally require the use of framework region primers rather than leader primers, leading to 5′ incomplete transcripts (Sanger bar interrupted on left). Sanger sequencing typically uses J region reverse primers, leading to incomplete J sequence recovery. Iglesia et al. inferred BCR V genes from mRNA-seq data, but the reconstruction did not span the N-diversity region, leading to unknown or incomplete CDR3 sequence. In contrast, the Ig-ID computed transcript spans the entire V-D-J sequence, including junctional diversity regions. (B) Representative Sanger sequence wherein the Sanger sequencing reaction terminated in the J region. FW, framework region; CDR, complementarity determining region.
