Front Immunol. 2023 Apr 18:14:1167235.
doi: 10.3389/fimmu.2023.1167235. eCollection 2023.

Complete variable domain sequences of monoclonal antibody light chains identified from untargeted RNA sequencing data

Allison Nau et al. Front Immunol.

Abstract

Introduction: Monoclonal antibody light chain proteins secreted by clonal plasma cells cause tissue damage due to amyloid deposition and other mechanisms. The unique protein sequence associated with each case contributes to the diversity of clinical features observed in patients. Extensive work has characterized many light chains associated with multiple myeloma, light chain amyloidosis and other disorders, which we have collected in the publicly accessible database, AL-Base. However, light chain sequence diversity makes it difficult to determine the contribution of specific amino acid changes to pathology. Sequences of light chains associated with multiple myeloma provide a useful comparison to study mechanisms of light chain aggregation, but relatively few monoclonal sequences have been determined. Therefore, we sought to identify complete light chain sequences from existing high throughput sequencing data.

Methods: We developed a computational approach using the MiXCR suite of tools to extract complete rearranged IGVL-IGJL sequences from untargeted RNA sequencing data. This method was applied to whole-transcriptome RNA sequencing data from 766 newly diagnosed patients in the Multiple Myeloma Research Foundation CoMMpass study.

Results: Monoclonal IGVL-IGJL sequences were defined as those where >50% of assigned IGK or IGL reads from each sample mapped to a unique sequence. Clonal light chain sequences were identified in 705/766 samples from the CoMMpass study. Of these, 685 sequences covered the complete IGVL-IGJL region. The identity of the assigned sequences is consistent with their associated clinical data and with partial sequences previously determined from the same cohort of samples. Sequences have been deposited in AL-Base.
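The clonality criterion stated in the Results can be illustrated with a minimal Python sketch. It assumes that per-clone read counts for one sample are already available as a dictionary; the function name `call_monoclonal` and the input layout are hypothetical and are not part of MiXCR or of the authors' pipeline. Only the >50% threshold comes from the text.

```python
# Hypothetical helper: applies the >50% rule from the Results to one sample.
# The dictionary of clone sequence -> read count is an assumed input format,
# not MiXCR's actual output format.
CLONALITY_THRESHOLD = 0.5  # ">50% of assigned IGK or IGL reads"


def call_monoclonal(clone_counts, threshold=CLONALITY_THRESHOLD):
    """Return (sequence, fraction) for the dominant clone, or None.

    A sample is called monoclonal only if a single sequence accounts for
    strictly more than `threshold` of all assigned light chain reads.
    """
    total = sum(clone_counts.values())
    if total == 0:
        return None  # no reads assigned to IGK or IGL
    sequence, count = max(clone_counts.items(), key=lambda item: item[1])
    fraction = count / total
    if fraction > threshold:
        return sequence, fraction
    return None


if __name__ == "__main__":
    # Toy counts for illustration only; not data from the study.
    sample = {"cloneA": 80, "cloneB": 15, "cloneC": 5}
    print(call_monoclonal(sample))
```

A strict inequality is used so that a sample split exactly evenly between two sequences is not called monoclonal, which matches the ">50%" wording.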

Discussion: Our method allows routine identification of clonal antibody sequences from RNA sequencing data collected for gene expression studies. The sequences identified represent, to our knowledge, the largest collection of multiple myeloma-associated light chains reported to date. This work substantially increases the number of monoclonal light chains known to be associated with non-amyloid plasma cell disorders and will facilitate studies of light chain pathology.

Keywords: AL amyloidosis; MiXCR; antibody light chain; antibody repertoire sequencing; antibody sequence; monoclonal gammopathy; multiple myeloma; plasma cell dyscrasia.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Identification of clonal IGVL-IGJL sequences from untargeted RNAseq data. (A) Schematic depiction of IGVL-IGJL sequence determination methods. Following optional enrichment of CD138+ plasma cells, total mRNA is extracted and cDNA synthesized by reverse transcription. Standard IGVL-IGJL cloning methods (blue boxes) use specific primers to amplify coding regions, followed by Sanger sequencing and validation by PCR, or, more recently, by high throughput sequencing approaches. The method described here (yellow boxes) takes deep sequencing datasets acquired for gene expression studies and uses the MiXCR suite of tools to identify clonal IGVL-IGJL sequences. (B) Computational analysis of RNAseq data to identify complete IGVL-IGJL sequences, using software tools described in the Methods. The steps shown in yellow boxes are automatic and require only the SRA accession as an input; the output from each step is passed to the next program. Downstream analysis and deposition in AL-Base, shown in orange, requires manual oversight.
Figure 2
Accurate recovery of U266 IGVL-IGJL sequence. (A) RNAseq reads aligned by MiXCR to immunoglobulin loci. (B) Alignment between U266 IGLV2-8-IGLJ2-IGLC2 sequences derived from untargeted RNAseq data (54) using MiXCR and standard cloning methods (55). Identical regions are highlighted in grey. The regions of the monoclonal sequences are shown with yellow, green and blue bars.
Figure 3
MiXCR identifies IGVL-IGJL clones from an input of five million randomly sampled reads. For each of three CoMMpass cases, random samples of reads (three replicates each of seven sample sizes) were used as the input to MiXCR and the resulting clonal sequences were analyzed. Dashed lines show the results using 5M reads, which was chosen as the target for down-sampling. (A) Computation time for varying amounts of input reads, using four cores on an Intel Xeon processor. (B) Length of the top output IGVL-IGJL clone for varying amounts of input reads. (For comparison, a typical IGVL-IGJL sequence is approximately 330 nt.) For cases where the output from MiXCR comprised multiple non-contiguous segments for a single clone, only the longest segment was considered. (C) Fraction of counts assigned by MiXCR to the major IGVL-IGJL clone.
Figure 4
Clonal IGVL-IGJL properties among 766 clinical samples are independent of the number of mapped reads. Fraction (A) and length (B) of the top IGVL-IGJL clone are plotted against the total number of reads aligned to the IGK and IGL loci by MiXCR. Samples where the most frequent clone is derived from IGKV or IGLV are shown as circles and crosses, respectively. The category to which each clone was assigned is shown by the color of the symbol. For cases where the output from MiXCR comprised multiple non-contiguous segments for a single clonotype, only the longest segment was considered.
Figure 5
Locus and category assignment for 766 CoMMpass samples. (A) Sequences assigned to each category. Category colors are the same as in Figure 4. (B) Example Category 2 alignment comparing the top two clones identified by MiXCR within a single case (MMRF126178). The two clones are identical over a 254 nt region which includes CDR3, and were therefore collapsed to yield a single sequence. The regions corresponding to the precursor germline genes are shown with green and blue bars.
Figure 6
Sequence coverage for 765 IGVL-IGJL clones. Each horizontal line represents a single clonotype sequence determined by MiXCR. Sequences are aligned according to the start of the IGVL and IGCL regions, which were identified by alignment to IMGT reference sequences. Gene regions are indicated with dashed lines. The first nucleotides of the IGKV/IGLV and IGKC/IGLC genes are used as reference points. Gaps represent regions of missing sequence; the differences in IGVL-IGJL length between different germline genes are not shown. Colors represent the category to which each sequence is assigned, as for Figures 4 and 5.
Figure 7
Diverse IGVL-IGJL sequences despite identical CDR3 regions. Protein sequence alignments of four groups of clonal sequences where the CDR3 sequence is identical between clones. The IGVL and IGJL genes identified for all sequences within each group were identical, and the inferred protein sequences for these germline precursors are shown beneath the clonal sequences. Shaded residues highlight differences between sequences. Orange boxes indicate CDR regions, according to the IMGT classification. Numbers represent local position within the alignments.
Figure 8
IGVL-IGJL clones identified by MiXCR are consistent with clinical free LC ratios and previously determined sequences. (A) Workflow showing the number of samples available for comparison. Matching sequences are those where the IGK or IGL locus assigned by MiXCR is the same as that indicated by the clinical data. Only the identity of LC determined by immunofixation is shown for the M-protein data; of 313 samples with available M-protein results, 253 had complete immunoglobulin (data not shown). (B) Comparison of the κ to λ serum FLC ratio calculated from the CoMMpass clinical data with the identity of the clonal LC identified by MiXCR. IGKV and IGLV clones are shown as blue circles and pink crosses, respectively. Dashed lines demarcate the boundaries of normal κ to λ serum ratios. If the κ to λ serum ratio is >1.65 or <0.26, we would expect the most frequent MiXCR clone to be derived from IGKV or IGLV, respectively. (C) CDR3 sequences from Categories 1-3 determined in this work are identical to those previously determined by Rustad et al. (23). (D) IGVL-IGJL sequences from Categories 1-3 determined in this work are identical to those previously determined by Langerhorst et al. (24). Only a representative example of sequences determined by Langerhorst et al. was available for comparison.
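The consistency check described for Figure 8B can be sketched in Python as follows. The thresholds 1.65 and 0.26 are taken from the caption; the function names, the locus labels "IGK" and "IGL", and the choice to return `None` for ratios inside the normal range are assumptions made for illustration and do not reproduce the authors' code.

```python
# Boundaries of the normal kappa-to-lambda serum FLC ratio, from the caption.
NORMAL_RATIO_LOW = 0.26
NORMAL_RATIO_HIGH = 1.65


def expected_locus(kappa_lambda_ratio):
    """Locus expected for the dominant clone, or None if the ratio is normal."""
    if kappa_lambda_ratio > NORMAL_RATIO_HIGH:
        return "IGK"  # kappa excess: expect an IGKV-derived clone
    if kappa_lambda_ratio < NORMAL_RATIO_LOW:
        return "IGL"  # lambda excess: expect an IGLV-derived clone
    return None  # within the normal range: no expectation either way


def is_consistent(kappa_lambda_ratio, assigned_locus):
    """True/False if the ratio is informative, None if it is uninformative."""
    expected = expected_locus(kappa_lambda_ratio)
    if expected is None:
        return None
    return expected == assigned_locus


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    print(is_consistent(12.0, "IGK"))
    print(is_consistent(0.05, "IGK"))
    print(is_consistent(1.0, "IGL"))
```

Returning `None` for a normal ratio keeps uninformative samples separate from genuine mismatches, so they can be counted separately in a comparison like the one in panel A.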

