Mol Med. 2025 May 30;31(1):215.
doi: 10.1186/s10020-025-01238-x.

Molecular detection of hrHPV-induced high-grade squamous intraepithelial lesions of the cervix through a targeted RNA next generation sequencing assay

Julia Faillace Thiesen et al. Mol Med.

Abstract

Background: Cervical cancer screening programs increasingly rely on sensitive molecular approaches as primary tests to detect high-risk human papillomaviruses (hrHPV), the causative agents of cervical cancer. Although hrHPV infection is a prerequisite for the development of most precancerous lesions, the mere detection of viral nucleic acids, which are also present in transient infections, is not specific to the underlying cellular state, resulting in poor positive predictive values (PPV) for lesional states. There is a need to increase the specificity of molecular tests in order to better stratify individuals at risk of cancer and to adapt follow-up strategies.

Methods: HPV-RNA-SEQ, a targeted RNA next generation sequencing assay allowing the detection of up to 16 hrHPV splice events and key human transcripts, has previously shown encouraging PPV for the detection of precancerous lesions. Herein, in 302 patients with normal cytology (NILM, n = 118), low-grade (LSIL, n = 104) or high-grade squamous intraepithelial lesions (HSIL, n = 80), machine learning-based model improvement was applied to obtain 2-class (NILM vs HSIL) or 3-class (NILM, LSIL, HSIL) predictive models.

Results: Linear (elastic net) and nonlinear (random forest) approaches resulted in five 2-class models that detect HSIL vs NILM in a validation set with specificity up to 0.87, well within the range of PPV of other competing RNA-based tests in a screening population.

Conclusions: HPV-RNA-SEQ improves the detection of HSIL and has the potential to complement and eventually replace current molecular approaches as a first-line test. Further performance evaluation remains to be done on larger, prospective cohorts.

Keywords: Human Papillomavirus (HPV); Molecular test; Next-Generation Sequencing (NGS); Precancerous lesions; Screening; Transcriptome.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Cervical smears were collected in the context of routine health care by the Biobanque de Picardie (BRIF N BB-0033–00017) at the CHU Amiens-Picardie. Leftover samples were used secondarily for research purposes. No additional samples were taken for this study. The biobank guarantees that the people from whom the biological samples and data came have been informed of the research and on their right of opposition, access or rectification, and have not expressed their opposition to the reuse of their biological samples and their personal data. The processing of personal data follows the rules of the European General Data Protection Regulation (GDPR). A Data Protection Officer (DPO) has been designated for this research. Information about the study has been published on the Health Data Hub ( https://www.health-data-hub.fr/ ) under reference F20220616163142. Consent for publication: Participants from whom the biological samples and data came have been informed of the research and on their right of opposition, access or rectification, and have not expressed their opposition to the reuse of their biological samples and their personal data. The processing of personal data follows the rules of the European General Data Protection Regulation (GDPR, Regulation (EU) 2016/679). Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Study flow chart. Available clinical results from study cohort. Distribution of clinical outcome according to (1) cytology results, (2) histological results from a biopsy, and (3) histological results based on a conization procedure. Legend: NILM: Negative for Intraepithelial Lesion or Malignancy; ASC-US: Atypical Squamous Cells of Undetermined Significance; LSIL: Low Grade Squamous Intraepithelial Lesion; HSIL: High Grade Squamous Intraepithelial Lesion; SCC: Squamous Cell Carcinoma (Invasive or Microinvasive); ADK: Adenocarcinoma (glandular cell); NC: Non-contributive
Fig. 2
Overall model performance on the validation set. Accuracy of the 3-class and 2-class models is shown for all trained models. Accuracy was computed with its 95% confidence interval from predictions on the validation set. The red dotted line represents the average accuracy of a random classifier, computed by simulation (1000 random shuffles of the predictions for the validation set). Models whose accuracy is significantly higher than the no-information rate are marked with their significance level: * for p-value < 0.05; ** for p-value < 0.01. Set of variables: S: “Spliced”, uS: “Unspliced”, H: “Human”, S + uS: “Spliced + Unspliced”, S + H: “Spliced + Human”, uS + H: “Unspliced + Human”, S + uS + H: “Spliced + Unspliced + Human”, P: “Presence of HPVs”, T: “Total HPV sequence count”
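
The random-classifier baseline described in this caption can be sketched as follows. This is only an illustration of the shuffling idea, not the authors' code: the function names, the seed, and the toy labels are assumptions made for the example.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def shuffled_baseline_accuracy(y_true, y_pred, n_shuffles=1000, seed=0):
    """Average accuracy obtained after randomly permuting the predictions.

    Shuffling keeps the predicted class frequencies but breaks any link
    with the true labels, which approximates a random classifier.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    shuffled = list(y_pred)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += accuracy(y_true, shuffled)
    return total / n_shuffles


# Hypothetical toy validation set, not data from the study.
y_true = ["NILM"] * 6 + ["HSIL"] * 4
y_pred = ["NILM"] * 5 + ["HSIL"] * 5

print("model accuracy:", accuracy(y_true, y_pred))
print("shuffled baseline:", shuffled_baseline_accuracy(y_true, y_pred))
```

A model is only informative if its accuracy sits clearly above this baseline, which is what the significance markers in the figure express.
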
Fig. 3
Specificity and sensitivity of classification models. Sensitivity (Se) and specificity (Sp) were computed on the validation set for all 2-class models, along with 95% confidence intervals. Thresholds at 0.5 for both metrics are shown in red, and the 5 models showing the best performance and compromise between Sp and Se are highlighted (full dots). Set of variables: S: “Spliced”, uS: “Unspliced”, H: “Human”, S + uS: “Spliced + Unspliced”, S + H: “Spliced + Human”, uS + H: “Unspliced + Human”, S + uS + H: “Spliced + Unspliced + Human”, P: “Presence of HPVs”, T: “Total HPV sequence count”
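
As a rough illustration of how sensitivity and specificity with 95% confidence intervals can be obtained for a 2-class model, the sketch below uses the Wilson score interval. The caption does not say which interval method was used, so the Wilson choice, the function names, and the toy labels are all assumptions.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    if n == 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def sensitivity_specificity(y_true, y_pred, positive="HSIL"):
    """Return (Se, CI) and (Sp, CI), treating `positive` as the target class."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return (se, wilson_interval(tp, tp + fn)), (sp, wilson_interval(tn, tn + fp))


# Hypothetical toy validation set, not data from the study.
y_true = ["HSIL"] * 4 + ["NILM"] * 6
y_pred = ["HSIL", "HSIL", "HSIL", "NILM"] + ["NILM"] * 5 + ["HSIL"]

(se, se_ci), (sp, sp_ci) = sensitivity_specificity(y_true, y_pred)
print("Se:", se, se_ci)
print("Sp:", sp, sp_ci)
```
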
Fig. 4
Predictive value of features across all 2-class models. Heatmap representing the importance (%) of each feature (rows) used for training models. Three sets (unspliced, spliced and oncogenes) were evaluated through the different trained models: the highest importances in prediction are represented in dark blue, whereas features that were less decisive in predicting model outcome are shown in off-white. White features were either excluded from the set or removed during feature selection. Methods (columns) are ordered by hierarchical clustering, according to the Ward D2 criterion. The elastic net method is represented in violet and random forest in green. Set of variables: S: “Spliced”, uS: “Unspliced”, H: “Human”, S + uS: “Spliced + Unspliced”, S + H: “Spliced + Human”, uS + H: “Unspliced + Human”, S + uS + H: “Spliced + Unspliced + Human”
Fig. 5
Positive predictive value estimates as a function of HSIL prevalence. Positive predictive value for the five best models is represented along with some other references from the literature. PPV was computed as a function of assumed HSIL prevalence in the population (x-axis). In addition, uncertainty related to the ratio of LSIL relative to HSIL was considered (bands around the line), and this ratio was assumed to lie between 1 and 4 (Supplementary Data 5). Set of variables: S: “Spliced”, S + uS: “Spliced + Unspliced”, S + H: “Spliced + Human”, uS + H: “Unspliced + Human”, S + uS + H: “Spliced + Unspliced + Human”. Statistical methods: rf: “random forest”, en: “elastic net”
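
The dependence of PPV on prevalence follows from Bayes' rule and can be sketched for the simple 2-class case. This is a simplified illustration, not the authors' computation: it ignores the LSIL fraction that the figure accounts for through the uncertainty bands, the function name is hypothetical, and the sensitivity of 0.80 is an assumed value (only the specificity of 0.87 is taken from the abstract).

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule for a 2-class test.

    PPV = Se * prev / (Se * prev + (1 - Sp) * (1 - prev))
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0:
        return float("nan")  # no positive calls at all
    return true_pos / (true_pos + false_pos)


# Specificity 0.87 is the upper value reported in the abstract;
# sensitivity 0.80 and the prevalence grid are assumptions for illustration.
for prevalence in (0.01, 0.05, 0.10):
    print(prevalence, round(ppv(0.80, 0.87, prevalence), 3))
```

With sensitivity and specificity held fixed, PPV rises with prevalence, which is why the figure plots it against the assumed HSIL prevalence rather than reporting a single number.
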
