J Cheminform. 2025 Apr 17;17(1):53.
doi: 10.1186/s13321-025-00977-7.

The pucke.rs toolkit to facilitate sampling the conformational space of biomolecular monomers

Jérôme Rihon et al. J Cheminform. 2025.

Abstract

Understanding the structural and dynamic behaviour of molecules is a major objective in molecular modeling research. Sampling through the torsional space is an efficient way to map this behaviour. However, generating a landscape of possible conformations relies on multiple formalisms whose mathematics are often difficult to convert to code. Here we present a command line tool and a scripting module that generate such landscapes with different axes, according to the various formalisms exploited for conformational sampling. In addition to this toolkit, we present a benchmarking study in which a DNA nucleoside is subjected to a diverse set of quantum mechanical levels of theory for geometry optimisations and potential energy calculations. The potential of the tool is demonstrated on examples including amino acids and synthetic nucleosides having five-membered or six-membered sugar moieties.

Keywords: Computational chemistry; Linear algebra; Puckering formalisms; Python; Rust.

Conflict of interest statement

Declarations. Competing interests: The research group declares no conflicts of interest. The pucke.rs (https://github.com/jrihon/puckers) and pucke.py (https://github.com/jrihon/puckepy) repositories are available on GitHub, where their documentation can be found as well. Installation procedures are given in the respective repositories and cover all major operating systems. Both tools are written in Rust, with pucke.py available as a Python library, and are released under the MIT license. The Rust toolchain (cargo) resolves dependencies; no Python dependencies are required.

Figures

Fig. 1
Explanation of conformational sampling experiments. A. Sampling over peptide-like systems, akin to a Ramachandran plot. B. Sampling the five-membered ring space, employing the Altona-Sundaralingam/Sato formalism (Equation 1) [6]. C. Sampling over the six-membered ring space, converting spherical coordinates back to local elevations. The local elevation is obtained through Equation 5, after which the atomic positions are computed and the SP improper dihedrals are calculated
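To make the five-membered ring sampling in panel B more concrete, the following is a minimal sketch of the classic Altona-Sundaralingam pseudorotation relation, in which the five endocyclic torsions are generated from a phase angle P and an amplitude. It is written with the Python standard library only and does not use the pucke.py API. The function names are hypothetical, and the exact form of Equation 1 in the paper (including the Sato variant) is not reproduced on this page and may differ.

```python
import math

# Angular offset between successive endocyclic torsions (144 degrees).
STEP = math.radians(144.0)
# Normalisation constant of the classic phase formula: sin(36) + sin(72).
S = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))


def torsions_from_pseudorotation(phase_deg, amplitude_deg):
    """Return the torsions nu0..nu4 (degrees) for a phase P and amplitude nu_max.

    Hypothetical helper, assuming nu_j = nu_max * cos(P + 144 * (j - 2)).
    """
    p = math.radians(phase_deg)
    return [amplitude_deg * math.cos(p + STEP * (j - 2)) for j in range(5)]


def pseudorotation_from_torsions(nu):
    """Recover (P, nu_max) in degrees from the five torsions nu0..nu4."""
    num = (nu[4] + nu[1]) - (nu[3] + nu[0])
    den = 2.0 * nu[2] * S
    # atan2 keeps the correct quadrant; the result is wrapped to [0, 360).
    phase = math.degrees(math.atan2(num, den)) % 360.0
    # hypot avoids dividing by cos(P), which vanishes at P = 90 and 270.
    amplitude = math.hypot(num, den) / (2.0 * S)
    return phase, amplitude


if __name__ == "__main__":
    # Example values chosen for illustration only.
    nu = torsions_from_pseudorotation(162.0, 38.0)
    print([round(x, 2) for x in nu])
    print(pseudorotation_from_torsions(nu))
```

Sampling the ring space then amounts to looping over a grid of phase and amplitude values and using the resulting torsions as constraints.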
Fig. 2
Examples of conformations produced by querying for specific Cremer-Pople coordinates, for five- and six-membered ring systems. This employs the inversion algorithm detailed by Cremer [38]. The library produces only the base ring; hydrogens and the nitrogen (representing the nucleobase) were added with UCSF Chimera [42]. A. Various conformations of the DNA nucleoside, produced by inverting CP-coordinates for five-membered rings (CP5). B. Various conformations of the Hexitol NA nucleoside, produced by inverting CP-coordinates for six-membered rings (CP6). Figure 12 details how the inversion is computed
Fig. 3
Conformational sampling of the DNA adenosine nucleoside, carried out at the various levels of theory (LoT). Geometry optimisations (GO; brown; columns) were carried out at the HF-3c, PBEOQ, HFQ and MP2Q levels and were each subjected to four additional single point evaluations (SPE; mauve; rows). A 4x4 matrix shows the result of all the samplings, visualising the respective combinations of GO and SPE. Combinations involving the same LoT did not require an additional SPE (shown on the diagonal of the matrix). The GSQ has highlighted borders (lower right)
Fig. 4
Comparison of the conformers optimised at the HF-3c, PBEOQ and HFQ levels with the GSQ output. A. Difference in relative energy (ΔE) between the GSQ (MP2Q) CS experiments and the other GO samplings that have been subjected to an SPE at the MP2Q level. Ranges from [-0.5, 0.5] kcal/mol. B. RMSD between the GSQ (MP2Q) GO procedures of the experiment and the other GO samplings. Ranges from [0, 0.10] Å
Fig. 5
Differences between the AS and CP formalisms on five-membered rings. A. MP2Q-optimised PES, shown as a contour plot (upper), in contrast with the scatter plot (lower) from which the data were interpolated. B. Altona-Sundaralingam representation on a polar coordinate plot of the scatter plot in (A). C. Cremer-Pople representation on a polar coordinate plot of the scatter plot in (A). D. The deoxyribose adenosine conformations from Cornell et al. [8], highlighted according to the conformation denoted in that manuscript: (2E) in turquoise, (EO) in green, (OE) in lilac and (3E) in orange. The conformations from the CS experiment itself (Figure 3) that lie nearest to the respective conformations from the Cornell paper are also highlighted in the base colour. The conformations are depicted in Figure 13, generated with the inversion methods in pucke.py
Fig. 6
Potential energy surface of A. peptide-like systems, B. six-membered ring systems. The six-membered ring HNA molecule is depicted in Figure 2. The geographical versus CP conventions for the axis ranges involved are explained in Figure 14
Fig. 7
Frame (A) contains the command line query from the pucke.rs tool. Frames (B-E) are examples of how the pucke.py module can be used in code. A. Command line query through the pucke.rs tool. B. Calculating the puckering coordinates of the desired formalism, by providing atom names or indices from the molecular data. C. An example of inverting specific Cremer-Pople coordinates into a molecular structure. D. Query of the fivering constraints and fivering axes through the pucke.py module. The module also allows sampling of peptide and sixring space (not shown here). E. An example of using the geometry module, here specifically the dihedral function, to calculate structural data of interest on molecules. The pucke.py module focuses on manipulating data on monomers, but also allows manipulating data on full polymeric structures (not shown, see online documentation)
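As an illustration of the kind of calculation panel E refers to, the following is a minimal, standalone sketch of a signed dihedral angle computed from four atomic positions. It uses only the Python standard library and is not the pucke.py geometry module; the function name and signature are assumptions made for this example, so the online documentation remains the reference for the actual API.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Normalise the central bond so the projections below are well scaled.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Illustrative coordinates, not taken from the paper.
    print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))
```

Using atan2 rather than acos keeps the sign of the angle and avoids numerical trouble near 0° and 180°.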
Fig. 8
A. The RAM usage and B. Disk Space usage during the geometry optimisation protocol for the set of LoTs mentioned in the Methods section. The MP2Q far surpasses the computational requirements of the second most demanding LoT, PBEOQ, requiring almost tenfold more time and fourfold more virtual resources to complete. C. The RAM usage and D. Disk Space usage during the single point evaluation protocol. The RAM usage of HFQ is on par with that of MP2Q, while the Disk Space usage of HFQ and PBEOQ is relatively similar. The Disk Space usage of MP2Q, however, fluctuates strongly. This is likely because 35 SPE calculations run concurrently and finish at roughly the same pace. As a result, tmp-files are constantly being written to, read from and deleted, which gives rise to a high fluctuation in the storage on disk. On all graphs, the HF-3c is barely visible, as it finishes quickly and requires barely any resources to run successfully
Fig. 9
A. RMSD between optimised conformers at the HFQ vs. formula image level. Ranges from [0, 0.0010] Å. B. Energy difference between the PESs generated at the HFQ vs. formula image level. Ranges from [-0.02, 0.02] kcal/mol. C. RAM consumption required to produce a set of optimised structures at the HFQ vs. formula image level. Whilst both peak at around the same order of magnitude, the RIJK approximation of the Coulomb and exchange integrals is remarkably effective, speeding up the wall-clock time by up to 800%. D. Disk Space required to produce a set of optimised structures at the HFQ vs. formula image level. The explanation is analogous to that of panel C. E. RMSD between optimised conformers at the MP2Q vs. MP2T level. Ranges from [0, 0.0010] Å. F. Energy difference between the PESs generated at the MP2Q vs. MP2T level. Ranges from [-0.02, 0.02] kcal/mol. G. RAM consumption required to produce a set of optimised structures at the MP2Q vs. MP2T level. There is little overall difference in both the time spent calculating (approximately 550 h vs. 500 h) and the RAM required to run the calculation. H. Disk Space required to produce a set of optimised structures at the MP2Q vs. MP2T level. Though there is a noticeable difference in the space taken by stored tmp-files, this does not seem sufficient reason to opt for the TZVP basis set
Fig. 10
All the PESs (Fig. 3) are compared against the GSQ, returning a ΔΔE PES. Ranges from [0, 3.0] kcal/mol. The bottom row is rather monochromatic because the scale of the plot has been magnified with respect to the scale in Fig. 4A. The bottom right square is the GSQ, which shows a flat off-white colour equal to the value of zero throughout
Fig. 11
Comparison, as in Fig. 4B, of all sets of optimised structures at the respective levels with each other. Fig. 4B is present here as the rightmost column. Ranges from [0, 0.10] Å
Fig. 12
A visualisation of the inversion protocol, as detailed by Cremer et al. [38] and Sega et al. [39]. This stage follows the calculation of the local elevation from the CP coordinates. A., B. Derive the projected bond lengths and bond angles and divide them into their respective segments. C., D. Assign coordinates of the projected segments onto a Cartesian plane and rotate the segments onto the virtual OPQ triangle. This assembles the segments together. E., F. Rotate the aligned segments onto the mean plane. By adding the local elevation as the z-coordinate, the molecular structure is finalised
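The step that precedes this protocol, obtaining the local elevations from the CP coordinates, can be sketched for a six-membered ring as follows. This is a minimal illustration of the textbook Cremer-Pople relation between (Q, θ, φ2) and the out-of-plane displacements z_j, using the Python standard library only. The function name is hypothetical, and whether the notation matches the paper's Equation 5 term by term is an assumption, since that equation is not reproduced on this page.

```python
import math


def local_elevations_six_ring(q_total, theta_deg, phi2_deg):
    """Out-of-plane displacements z_1..z_6 for CP coordinates (Q, theta, phi2).

    Assumes the standard decomposition q2 = Q sin(theta), q3 = Q cos(theta).
    """
    theta = math.radians(theta_deg)
    phi2 = math.radians(phi2_deg)
    q2 = q_total * math.sin(theta)
    q3 = q_total * math.cos(theta)
    elevations = []
    for j in range(6):  # j = 0..5 corresponds to ring atoms 1..6
        ring_term = math.sqrt(1.0 / 3.0) * q2 * math.cos(
            phi2 + 2.0 * math.pi * 2.0 * j / 6.0
        )
        alternating_term = math.sqrt(1.0 / 6.0) * q3 * (-1) ** j
        elevations.append(ring_term + alternating_term)
    return elevations


if __name__ == "__main__":
    # theta = 0 corresponds to a chair: the elevations alternate in sign.
    # The amplitude 0.6 is an illustrative value, not taken from the paper.
    print([round(z, 3) for z in local_elevations_six_ring(0.6, 0.0, 0.0)])
```

Two properties serve as quick sanity checks on any such implementation: the elevations sum to zero, and the sum of their squares equals Q squared.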
Fig. 13
The conformations reported, for ϵ=1, according to the Altona-Sundaralingam formalism [19] from Cornell et al. [8]. Puckering coordinates were converted to Cremer-Pople and inverted. The depicted hydrogens and nitrogen, representing the nucleobase, were added using UCSF Chimera [42]. This shows that the conformations reported as [2E, 3E, O4E] are more akin to twists than to envelopes, as also noted on the pseudorotational wheel in Fig. 5
Fig. 14
The relationship between the Cremer-Pople sphere and the Mollweide geographical projection. To avoid confusion, we note that the Cremer-Pople coordinates employ the physics convention and notation of the spherical coordinates (θ ∈ [0, π], ϕ2 ∈ [0, 2π]), while the cartopy library utilises the mathematical convention (θ ∈ [-π, π], ϕ ∈ [-π/2, π/2]). The CP sphere represents all the puckering conformations a pyranose ring can adopt. The CP sphere uses the physics convention of the spherical coordinates (r, θ, ϕ) with 0 ≤ θ ≤ 2π. In the CP sphere, ϕ2 is the latitude and θ is the longitude. The r coordinate does not vary much for conformations of interest, and is kept constant for the sake of visualisation. Note that in this context, the 3 atom in the standard pyranose ring is the same as the nitrogen atom of the morpholino ring. For the purpose of generalisability, the conformers shown here are simple pyranose sugars. Some notable conformations have been explicitly depicted. The locations may not be highly accurate and serve mostly as a visual representation. More extensive CP spheres can be found in literature [44, 18, 1, 39]. Conformers are denoted by their abbreviation and specified with the out-of-plane atoms. The IUPAC notation for pyranose rings [45, 46] defines a total of thirty-eight distinct conformers, with chair (C), boat (B), skew (S), halfchair/twist (H/T) and envelope (E) for six-membered rings. A representation of the conformers on the CP sphere can be found in literature [41, 1]
Fig. 15
A. Five-membered ring systems: examples of Twist (T) and Envelope (E) conformations. B. The Altona-Sundaralingam pseudorotation wheel. For reference, the regions where puckering configurations typical of DNA and RNA respectively occur are highlighted. C. Six-membered ring systems: examples of Twist (T), Envelope (E), Chair (C), Boat (B) and Skew (S) conformations. Literature on these systems makes mention of the halfboat conformer, which is the same as a skew (three consecutive atoms are in the same plane, with the fourth in-plane atom located in between two out-of-plane atoms, and both latter atoms are on either side of the plane). Similarly, twists are often referred to as halfchairs (a conformation where only two consecutive atoms lie out-of-plane, one on either side of the plane). D. The Cremer-Pople sphere with the 38 distinct puckering modes labelled on the surface of the sphere

References

    1. Lescrinier E, Froeyen M, Herdewijn P (2003) Difference in conformational diversity between nucleic acids with a six-membered ‘sugar’ unit and natural ‘furanose’ nucleic acids. Nucleic Acids Res 31(12):2975–2989
    2. Babin V, Roland C, Darden TA et al (2006) The free energy landscape of small peptides as obtained from metadynamics with umbrella sampling corrections. J Chem Phys 125:20
    3. Pérez A, Marchán I, Svozil D et al (2007) Refinement of the AMBER force field for nucleic acids: improving the description of α/γ conformers. Biophys J 92(11):3817–3829. 10.1529/biophysj.106.097782
    4. Zgarbová M, Otyepka M, Šponer J et al (2011) Refinement of the Cornell et al. nucleic acids force field based on reference quantum chemical calculations of glycosidic torsion profiles. J Chem Theor Comput 7(9):2886–2902. 10.1021/ct200162x
    5. Zgarbová M, Šponer J, Otyepka M et al (2015) Refinement of the sugar-phosphate backbone torsion beta for AMBER force fields improves the description of Z- and B-DNA. J Chem Theor Comput 11(12):5723–5736. 10.1021/acs.jctc.5b00716
