Acta Pharm Sin B. 2022 Jun;12(6):2950-2962.
doi: 10.1016/j.apsb.2021.11.021. Epub 2021 Dec 2.

Prediction of lipid nanoparticles for mRNA vaccines by the machine learning algorithm

Wei Wang et al. Acta Pharm Sin B. 2022 Jun.

Abstract

Lipid nanoparticles (LNPs) are commonly used to deliver mRNA vaccines. Currently, LNP optimization relies primarily on screening ionizable lipids through traditional experiments, which is costly and time-consuming. The current study applies computational methods to accelerate LNP development for mRNA vaccines. First, 325 data samples of mRNA vaccine LNP formulations with IgG titer were collected. The machine learning algorithm lightGBM was used to build a prediction model with good performance (R² > 0.87). More importantly, the algorithm identified the critical substructures of ionizable lipids in LNPs, which agreed well with published results. Animal experiments showed that LNPs using DLin-MC3-DMA (MC3) as the ionizable lipid at an N/P ratio of 6:1 induced higher efficiency in mice than LNPs with SM-102, consistent with the model prediction. Molecular dynamics modeling was then used to investigate the molecular mechanism of the LNPs used in the experiment. The results showed that the lipid molecules aggregated to form LNPs and that mRNA molecules twined around the LNPs. In summary, a machine learning predictive model for LNP-based mRNA vaccines was developed for the first time, validated by experiments, and further integrated with molecular modeling. The prediction model can be used for virtual screening of LNP formulations in the future.
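The performance figure quoted in the abstract is a coefficient of determination. As a minimal sketch of how such a score is computed from measured and predicted log10(IgG titer) values, the function below implements the standard definition R² = 1 - SS_res / SS_tot. The function name and the sample numbers are illustrative assumptions, not the authors' code or data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical log10(IgG titer) values, for illustration only.
measured = [3.1, 4.0, 4.6, 5.2]
predicted = [3.3, 3.9, 4.5, 5.0]
score = r_squared(measured, predicted)
```

A score of 1 means the predictions match the measurements exactly, and a score of 0 means the model does no better than predicting the mean titer for every formulation.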

Keywords: Formulation prediction; Ionizable lipid; LightGBM; Lipid nanoparticle; Machine learning; Molecular modeling; Vaccine; mRNA.

Figures

Graphical abstract
Figure 1
Data collecting and cleaning process for machine learning (ML) work. (A) Data collecting and cleaning process. (B) The final dataset contained lipid nanoparticles (LNPs) with seven ionizable lipids: DLin-MC3-DMA (MC3), DLinDMA, L319, Lipid M, Lipid N, Lipid Q, and SM-102.
Figure 2
Three-dimensional structures of MC3, SM-102, DSPC, cholesterol, and PEG2000-DMG.
Figure 3
Data distribution of the 325 formulation data samples. Counts in the final dataset by disease and protein (A), subject type (B), population or strain (C), injection route (D), and ionizable lipid type (E). In (A), H1N1 Cal and PR8 refer to strains A/California/07/2009 and A/Puerto Rico/8/1934, respectively; SARS-CoV-2 S-2P and RBD refer to the S protein with two proline substitutions at amino acid positions 986 and 987 and to the receptor binding domain (RBD), respectively; and RSV mDS-Cav-1 refers to the full-length F protein of respiratory syncytial virus (RSV) with four-point mutations.
Figure 4
Data distribution of the 325 formulation data samples. Counts in the final dataset by N/P ratio (A), log10(dose) (B), second vaccination time (C), IgG titer test time (D), and log10(IgG titer) results (E).
Figure 5
Feature ranking and important substructures of ionizable lipids. (A) The top 25 important features related to the formulation. Feature importance was ranked using the information gain (IG) values from the lightGBM model. (B) The top 18 important IL-ECFP and their corresponding specific substructures of ionizable lipids. The center atom, recognition length, and environmental information of each ECFP are indicated by the highlighted blue area, black bonds, and grey bonds, respectively.
Figure 6
Comparison between ML prediction and in vivo expression level. (A) Predicted log10(IgG titer) versus time profile of BALB/c mice induced by mRNA-LNP encoding the S-2P protein of SARS-CoV-2 at a dose of 20 μg by i.m. administration on Days 0 and 21. The LNP consists of ionizable lipid, DSPC, cholesterol, and PEG-lipid at a molar ratio of 50:10:38.5:1.5. Ionizable lipids included MC3 and SM-102. The N/P ratio is 6:1 or 3:1. (B) Relative light unit (RLU) of HiBit tag versus time profiles in C57BL/6JGPt mice induced by mRNA-LNP encoding angiotensin-converting enzyme 2 (ACE2) following i.v. administration. The LNP formulations were the same as in the prediction task. The differences in the maximum RLU at 8 h (C) and the AUC at 168 h (D) after administration were tested. Data are presented as mean ± SD (n = 4). ∗∗P ≤ 0.005. ns, not significant.
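The N/P ratio and the 50:10:38.5:1.5 molar ratio in this caption together fix the amount of each lipid for a given mRNA mass. The sketch below works through that arithmetic under stated assumptions: N/P is taken as the molar ratio of ionizable-lipid amine nitrogens to mRNA phosphate groups, each nucleotide contributes one phosphate, the average nucleotide mass is approximated as 330 g/mol, and each ionizable lipid carries one ionizable nitrogen. The function name and these constants are illustrative and do not come from the paper.

```python
# Assumption: approximate average mass of one RNA nucleotide (g/mol).
AVG_NUCLEOTIDE_MASS = 330.0


def lnp_lipid_amounts_nmol(mrna_ug, np_ratio,
                           molar_ratio=(50.0, 10.0, 38.5, 1.5),
                           nitrogens_per_lipid=1):
    """Return nmol of each lipid for a given mRNA mass and N/P ratio.

    molar_ratio is ionizable lipid : DSPC : cholesterol : PEG-lipid.
    """
    # ug -> nmol of phosphate (one phosphate per nucleotide, assumed).
    phosphate_nmol = mrna_ug * 1000.0 / AVG_NUCLEOTIDE_MASS
    # N/P fixes the amount of ionizable nitrogen, hence ionizable lipid.
    ionizable_nmol = phosphate_nmol * np_ratio / nitrogens_per_lipid
    ion, dspc, chol, peg = molar_ratio
    # Scale the helper lipids from the ionizable lipid amount.
    per_part = ionizable_nmol / ion
    return {
        "ionizable": ionizable_nmol,
        "DSPC": dspc * per_part,
        "cholesterol": chol * per_part,
        "PEG-lipid": peg * per_part,
    }


# 20 ug of mRNA at the two N/P ratios named in the caption.
at_6_to_1 = lnp_lipid_amounts_nmol(20.0, 6.0)
at_3_to_1 = lnp_lipid_amounts_nmol(20.0, 3.0)
```

Under these assumptions, halving the N/P ratio from 6:1 to 3:1 halves every lipid amount for the same mRNA dose, because the molar ratio between the four lipids is held constant.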
Figure 7
The snapshots of mRNA structure at the initial time (A) and 100 ns of simulation (B).
Figure 8
The snapshots of four lipid systems for 200 ns MD simulation: (A) SM102-3:1; (B) SM102-6:1; (C) MC3-3:1; (D) MC3-6:1; water molecules were not displayed in the figure. Red represents mRNA; purple represents SM-102 ionizable lipid; blue represents MC3 ionizable lipid; yellow represents cholesterol; cyan represents DSPC; green represents PEG2000-DMG.
Figure 9
The snapshots of four lipid systems for 200 ns MD simulation: (A) SM102-3:1; (B) SM102-6:1; (C) MC3-3:1; (D) MC3-6:1; water molecules were not displayed in the figure. Yellow: mRNA sequence. Blue: nitrogen on the ionizable lipids.
Figure 10
Quantitative analysis of four lipid systems during 100 ns MD simulation. (A) Root mean square displacement (RMSD) vs. time. (B) Solvent accessible surface area of the mRNA sequence vs. time. (C) Mass-weighted radius of gyration (Rg) vs. time. (D) Density profile of a system as a function of the distance from the geometric center of the system.
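Two of the quantities in this caption have simple closed forms. As a minimal sketch, the functions below compute a mass-weighted radius of gyration, Rg = sqrt(sum(m_i * |r_i - r_com|^2) / sum(m_i)), and an RMSD between a frame and a reference frame. The function names and toy coordinates are assumptions for illustration. The RMSD here skips the structural superposition step that trajectory analysis tools usually apply first, so it is not a reproduction of the authors' analysis.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of 3D points."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total
        for k in range(3)
    )


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the center of mass."""
    com = center_of_mass(coords, masses)
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / sum(masses))


def rmsd(coords, reference):
    """Root mean square distance between matched atoms (no alignment)."""
    if len(coords) != len(reference) or not coords:
        raise ValueError("frames must be non-empty and of equal length")
    sq = sum(
        sum((a[k] - b[k]) ** 2 for k in range(3))
        for a, b in zip(coords, reference)
    )
    return math.sqrt(sq / len(coords))


# Toy system: two equal-mass atoms 2 units apart, then shifted by (0, 3, 4).
frame0 = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame1 = [(-1.0, 3.0, 4.0), (1.0, 3.0, 4.0)]
rg0 = radius_of_gyration(frame0, [12.0, 12.0])
shift = rmsd(frame1, frame0)
```

A decreasing Rg over time indicates that the system is becoming more compact, which is the kind of trend panel (C) is designed to show.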
Figure 11
The evolution of lipid fusion and the theoretical structure of the mRNA LNP system. (A) At the initial mixing stage, lipids form many small clusters and attach along the mRNA sequence through electrostatic interaction. (B) Clusters that come close together tend to fuse into a bigger cluster to decrease the surface energy. The lipid tails in these clusters are reduced for clarity. Then, more clusters participate in the fusion to form a long lipid particle (C) or a liposome-like particle (D). If the fusion primarily results in long lipid particles, they should form tube structures in the core of the LNP (E). Otherwise, lipid fusion leading to liposome-like particles produces an LNP containing a large chamber filled with aqueous phase (F). The DSPC and PEG should be located at the exterior of the LNP, while cholesterol molecules insert into the gaps between lipids.
