Hum Mol Genet. 2025 Jan 29;34(2):148-160.
doi: 10.1093/hmg/ddae164.

Uplift of genetic diagnosis of rare respiratory disease using airway epithelium transcriptome analysis

Jelmer Legebeke et al. Hum Mol Genet.

Abstract

Rare genetic respiratory disease has an incidence rate of more than 1:2500 live births in Northern Europe and carries a significant disease burden. Early diagnosis improves outcomes, but many individuals remain without a confident genetic diagnosis. Improved and expanded molecular testing methods are required to raise genetic diagnosis rates and thereby improve clinical outcomes. Using primary ciliary dyskinesia (PCD) as an exemplar rare genetic respiratory disease, we developed a standardized method to identify pathogenic variants using whole transcriptome RNA-sequencing (RNA-seq) of nasal epithelial cells cultured at air-liquid interface (ALI). The method was optimized using cells from healthy volunteers and from people with rhino-pulmonary disease but no diagnostic indication of PCD. We validated the method using nasal epithelial cells from PCD patients with a known genetic cause. We then assessed the ability of RNA-seq to identify pathogenic variants and the disease mechanism in PCD-likely patients in whom DNA genetic testing was inconclusive. The majority of 49 targeted PCD genes were optimally identified in RNA-seq data from nasal epithelial cells grown for 21 days in ALI culture. Four PCD-likely patients without a previous genetic diagnosis received a confirmed genetic diagnosis from the RNA-seq findings. We demonstrate the clinical potential of RNA-seq of nasal epithelial cells to identify variants in individuals with genetically unsolved PCD. This uplifted genetic diagnosis should improve genetic counselling, enable family cascade screening, and open the door to potential personalised treatment and care approaches. This methodology could be implemented in other rare lung diseases such as cystic fibrosis.

Keywords: RNA-seq; air-liquid-interface culture; primary ciliary dyskinesia; splicing; transcriptome.

Conflict of interest statement

GW declares employment and shares in Illumina. All other authors declare no relevant financial or non-financial interests.

Figures

Figure 1
Gene expression profiles of 49 motile cilia genes. (A) The combined expression of 49 motile cilia genes in nasal epithelial cell samples obtained from healthy volunteers (HV) and cultured in vitro at air-liquid interface (ALI), with RNA extracted at different time-points. The combined gene expression increases strongly and significantly (P-value < 2.2 × 10⁻¹⁶) between ALI-culture day 4 (median 0.3 TPM) and day 8 (median 4 TPM). No significant difference (P-value 1.4 × 10⁻¹) was detected between ALI-culture day 14 (median 12 TPM) and day 21 (median 16 TPM). The combined expression slowly decreases over the remaining ALI-culture time-points. This decrease is not significant (ns, P-value > 5.0 × 10⁻¹) between the individual time-points, but it is significant (P-value 3.9 × 10⁻²) between ALI-culture day 21 (median 16 TPM) and day 63 (median 12 TPM). (B) The combined expression of 49 motile cilia genes in nasal epithelial cell samples obtained from healthy volunteers (HV) and non-PCD patients (non-PCD) at in vitro ALI-culture days 14, 21 and 28. In the non-PCD patients, the combined gene expression increases significantly (P-value 8.5 × 10⁻⁴) between ALI-culture day 14 (median 14 TPM) and day 21 (median 20 TPM). No significant difference (P-value 2.4 × 10⁻¹) was detected between ALI-culture day 21 (median 20 TPM) and day 28 (median 18 TPM). Comparing the combined expression between healthy volunteers (median 12 TPM) and non-PCD patients (median 14 TPM) revealed no significant difference (P-value 8.1 × 10⁻²) on ALI-culture day 14. On day 21, however, the combined expression was significantly (P-value 1.5 × 10⁻²) higher in the non-PCD patients (median 20 TPM) than in the healthy volunteers (median 16 TPM). Similarly, on ALI-culture day 28 the combined expression was significantly (P-value 5.3 × 10⁻³) higher in the non-PCD patients (median 18 TPM) than in the healthy volunteers (median 14 TPM).
(A and B) Statistical testing was done with an unpaired Wilcoxon test. Three healthy volunteers and eight non-PCD patients were used for each time-point.
Figure 2
Aberrant splicing events identified in patients E and F. (A) Patient E: in the non-PCD patient, reads align to DNAH11 exon 6 and splice junctions with 452 and 250 reads support its inclusion; in the patient (arrow), exon 6 reads are absent and 183 reads connect exons 5 and 7, supporting the skipped exon event. The reported homozygous DNAH11 c.983-1G>T splice acceptor variant is located adjacent to exon 6. (B) Patient F: 49 splice junction reads support the skipping of CCDC39 exon 6, while 31 and 80 splice junction reads support the inclusion of exon 6, indicating skipping of exon 6 on one allele. This is supported by read coverage for exon 6 that is about half that of the neighboring exons. The reported heterozygous CCDC39 c.664G>T variant was present in exon 6. Furthermore, a mutually exclusive exon usage occurs between exon 6 and exon 7. In the control exon 6 is always included, while the opposite is true for the patient. (C) Patient F: 137 splice junction reads support canonical splicing and 22 splice junction reads support splicing between an exonic splice donor site and the canonical splice acceptor between exons 3 and 4 of CCDC39. This partial skipping of the canonical splice donor site indicates that this occurs on one allele. (D) The reported heterozygous CCDC39 c.357+1G>C splice donor variant was present adjacent to exon 3 (red top arrow; 11 reads, 82% G and 18% C). Due to the loss of the canonical donor site, the splicing machinery switches to an exonic splice donor site (black bottom arrow) supported by 22 reads.
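The junction read counts quoted in panels A and B can be turned into a rough exon-inclusion estimate. The sketch below is illustrative only: the `percent_spliced_in` helper is a hypothetical name, and the simple junction-count formula (mean of the two inclusion junctions against the skipping junction) is an assumption, not necessarily the calculation used by rMATS or by the authors. The zero inclusion counts for patient E are inferred from the legend's statement that exon 6 reads are absent.

```python
def percent_spliced_in(inclusion_junction_reads, skipping_reads):
    """Rough exon-inclusion fraction from splice junction read counts.

    Assumption: inclusion support is the mean of the upstream and
    downstream inclusion junction counts; skipping support is the
    count of the junction that bypasses the exon.
    """
    inclusion = sum(inclusion_junction_reads) / len(inclusion_junction_reads)
    total = inclusion + skipping_reads
    if total == 0:
        raise ValueError("no junction reads support either isoform")
    return inclusion / total


# Patient F, CCDC39 exon 6 (panel B): 31 and 80 inclusion reads, 49 skipping reads.
psi_patient_f = percent_spliced_in([31, 80], 49)

# Patient E, DNAH11 exon 6 (panel A): no exon 6 reads, 183 reads joining exons 5 and 7.
psi_patient_e = percent_spliced_in([0, 0], 183)

print(f"patient F PSI: {psi_patient_f:.2f}")
print(f"patient E PSI: {psi_patient_e:.2f}")
```

Under these assumptions, patient F gives an inclusion fraction of about 0.53, in line with skipping on one allele, while patient E gives 0, in line with the homozygous splice acceptor variant.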
Figure 3
Aberrant splicing and downregulation of DNAH11 in patient 1. (A) Differential gene expression results between the patient and nine non-PCD controls were filtered with a gene panel consisting of 49 motile cilia genes. DNAH11 was found to be downregulated in the patient (FDR P-value 1.50 × 10⁻⁴, log fold change −1.21). Filtering thresholds used were FDR P-value < 0.05 and |log fold change| > 1. (B) Comparing the mean transcripts per million (TPM) for DNAH11 in the patient against the control group revealed a 3-fold lower abundance of DNAH11 transcripts in the patient. (C) The alternative splicing events identified by rMATS in the patient versus nine non-PCD patient controls were visually assessed in IGV. The sashimi plot shows the splicing pattern in the patient (top track) versus a non-PCD patient control (bottom track). In contrast to the control, exon 12 in the patient is both included (191 reads) and skipped (27 reads), suggesting that skipping occurs on one DNAH11 allele. Skipping of the exon causes a shift in the reading frame and a subsequent premature stop codon (p.Leu658LeufsTer2). A splice acceptor variant (c.1974-3C>T) was found adjacent to exon 12. The sashimi plot also shows the skipping (52 reads) and inclusion (66 reads) of exon 10 in the patient, again suggesting that this occurs on one DNAH11 allele. A nonsense variant (c.1741A>T) was found within exon 10, which introduces a premature stop in the amino acid chain (p.Lys581Ter). Finally, in the patient exon 9 is always included and exon 10 is either included or skipped, whereas in the control exon 9 is either skipped or included and exon 10 is always included.
Figure 4
Aberrant splicing and downregulation of HYDIN identified in patient 2. (A) Differential gene expression results between the patient and nine non-PCD controls were filtered with a gene panel consisting of 49 motile cilia genes. HYDIN expression was found to be significantly different in the patient (FDR P-value 8.08 × 10⁻³); however, the log fold change of −0.68 between the patient and the controls did not exceed the filtering threshold. Filtering thresholds used were FDR P-value < 0.05 and |log fold change| > 1. (B) Comparing the mean transcripts per million (TPM) for HYDIN in the patient against the control group revealed a 2.5-fold lower abundance of HYDIN transcripts in the patient. (C) The skipped exon (SE) event in HYDIN identified by rMATS was visually assessed in IGV. The sashimi plot shows the splicing pattern in the patient (top track) versus a non-PCD patient control (bottom track). In contrast to the control, exon 18 in the patient is both included and skipped, suggesting that skipping occurs on one HYDIN allele. Skipping of the exon causes no shift in the reading frame (p.Val793_Met843del). No genetic variant was found adjacent to or within exon 18. Furthermore, the sashimi plot shows the inclusion of a pseudoexon in the patient between exon 18 and exon 19. An intronic cryptic splice acceptor site and polypyrimidine tract were visually detected in front of this pseudoexon. Finally, skipping of exon 18 combined with inclusion of the pseudoexon also occurs in the patient.
Figure 5
Aberrant splicing and downregulation of HYDIN identified in patient 3. (A) Differential gene expression results between the patient and nine non-PCD controls were filtered with a gene panel consisting of 49 motile cilia genes. RSPH1, CCDC114, and HYDIN expression was found to be significantly different in the patient (FDR P-values 6.05 × 10⁻⁴, 1.96 × 10⁻², and 2.36 × 10⁻², respectively); however, the log fold changes (−0.74, −0.56, and −0.57, respectively) did not exceed the filtering threshold. Filtering thresholds used were FDR P-value < 0.05 and |log fold change| > 1. (B) Comparing the mean transcripts per million (TPM) for RSPH1, CCDC114, and HYDIN in the patient against the control group revealed 2.3-fold, 2.6-fold, and 2.3-fold lower abundances of these transcripts, respectively, in the patient. (C) The SE event identified by rMATS in the patient was visually assessed in IGV. The sashimi plot shows the splicing pattern in the patient (top track) versus a non-PCD patient control (bottom track). In contrast to the control, exon 25 in the patient is both included and skipped, suggesting that skipping occurs on one HYDIN allele, and this SE event was only identified by rMATS. Skipping of the exon causes a shift in the reading frame and a premature stop codon (p.Lys1262LysfsTer3). A G-to-T substitution involving the splice acceptor site (c.3786-1G>T) was found adjacent to exon 25. Furthermore, the sashimi plot shows that in the patient exon 27 is both included and skipped. Skipping of this exon does not result in a reading frame shift (p.Val1329_Gln1398del). No genetic variant was initially detected within or adjacent to exon 27; however, subsequent whole genome sequencing identified a deletion (chr16:g.70987855_70987987del) spanning the splice site of exon 27.
Figure 6
Aberrant splicing and downregulation of CCDC40 identified in patient 4. (A) Differential gene expression results between the patient and nine non-PCD controls were filtered with a gene panel consisting of 49 motile cilia genes. CCDC40 expression was found to be significantly different in the patient (FDR P-value 2.06 × 10⁻⁴³ and log fold change −3.05). Filtering thresholds used were FDR P-value < 0.05 and |log fold change| > 1. (B) Comparing the mean transcripts per million (TPM) for CCDC40 in the patient against the control group revealed a 13-fold lower abundance of CCDC40 transcripts in the patient. (C) The SE event identified by rMATS in the patient was visually assessed in IGV. The sashimi plot shows the splicing pattern in the patient (top track) versus a non-PCD patient control (bottom track). In contrast to the control, both the inclusion of a pseudoexon between exon 9 and exon 10 and a normal splicing pattern between exon 9 and exon 10 were observed in the patient, suggesting that pseudoexon inclusion occurs on one CCDC40 allele. Inclusion of this pseudoexon causes a shift in the reading frame and a premature stop codon (p.Ser252ArgfsTer43). A cryptic intronic splice acceptor site (5′-ttttagGTT-3′) and an intronic splice donor site (5′-CAGgtgag-3′, bold font indicating a patient-specific SNP) were detected adjacent to the pseudoexon. The intronic splice donor site is created by a patient-specific nucleotide change, c.1441-919G>A (rs1037010068).
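The filtering rule stated in the legends of Figures 3 to 6 (FDR P-value < 0.05 and absolute log fold change > 1) can be illustrated with the values reported there. This is a minimal sketch, not the authors' pipeline: the `passes_filter` helper and the tuple layout are hypothetical, and each patient was analysed separately against the controls, so pooling the reported values in one list is purely for illustration.

```python
FDR_MAX = 0.05        # FDR P-value threshold stated in the figure legends
ABS_LOG_FC_MIN = 1.0  # absolute log fold change threshold stated in the legends

# (patient, gene, FDR P-value, log fold change) as reported in Figures 3-6.
reported = [
    ("patient 1", "DNAH11", 1.50e-04, -1.21),
    ("patient 2", "HYDIN", 8.08e-03, -0.68),
    ("patient 3", "RSPH1", 6.05e-04, -0.74),
    ("patient 3", "CCDC114", 1.96e-02, -0.56),
    ("patient 3", "HYDIN", 2.36e-02, -0.57),
    ("patient 4", "CCDC40", 2.06e-43, -3.05),
]


def passes_filter(fdr, log_fc):
    """Return True when a gene meets both thresholds (hypothetical helper)."""
    return fdr < FDR_MAX and abs(log_fc) > ABS_LOG_FC_MIN


flagged = [(patient, gene) for patient, gene, fdr, log_fc in reported
           if passes_filter(fdr, log_fc)]
significant_only = [(patient, gene) for patient, gene, fdr, log_fc in reported
                    if fdr < FDR_MAX and not passes_filter(fdr, log_fc)]

print("pass both thresholds:", flagged)
print("significant but below fold-change cutoff:", significant_only)
```

With these values only DNAH11 (patient 1) and CCDC40 (patient 4) pass both thresholds. The genes reported for patients 2 and 3 are significant by FDR but fall below the fold-change cutoff.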

