RSC Adv. 2018 Jan 11;8(5):2678-2707. doi: 10.1039/c7ra11829e. eCollection 2018 Jan 9.

Introducing DDEC6 atomic population analysis: part 4. Efficient parallel computation of net atomic charges, atomic spin moments, bond orders, and more

Nidia Gabaldon Limas et al. RSC Adv. 2018.

Abstract

The DDEC6 method is one of the most accurate and broadly applicable atomic population analysis methods. It works for a broad range of periodic and non-periodic materials with no magnetism, collinear magnetism, or non-collinear magnetism, irrespective of the basis set type. First, we show that DDEC6 charge partitioning to assign net atomic charges corresponds to solving a series of 14 Lagrangians in order. Then, we provide flow diagrams for overall DDEC6 analysis, spin partitioning, and bond order calculations. We wrote an OpenMP-parallelized Fortran code to provide efficient computations. We show that by storing large arrays as shared variables in cache-line-friendly order, memory requirements are independent of the number of parallel computing cores and false sharing is minimized. We show that both the total memory required and the computational time scale linearly with increasing numbers of atoms in the unit cell. Using the presently chosen uniform grids, computational times of ∼9 to 94 seconds per atom were required to perform DDEC6 analysis on a single computing core in an Intel Xeon E5 multi-processor unit. Parallelization efficiencies were usually >50% for computations performed on 2 to 16 cores of a cache-coherent node. As examples we study a B-DNA decamer, nickel metal, supercells of hexagonal ice crystals, six X@C60 endohedral fullerene complexes, a water dimer, a Mn12-acetate single molecule magnet exhibiting collinear magnetism, a Fe4O12N4C40H52 single molecule magnet exhibiting non-collinear magnetism, and several spin states of an ozone molecule. Efficient parallel computation was achieved for systems containing as few as one and as many as >8000 atoms in a unit cell. We varied many calculation factors (e.g., grid spacing, code design, thread arrangement, etc.) and report their effects on calculation speed and precision. We make recommendations for excellent performance.
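As context for the shared-memory strategy described in the abstract, the following Fortran/OpenMP fragment is a minimal sketch only; it is not code from the CHARGEMOL program, and the array and variable names (rho, weight, npts) are hypothetical. It shows how a large grid array can be allocated once, shared by all threads, and processed in contiguous chunks so that memory use is independent of the core count and false sharing is limited:

    PROGRAM shared_grid_sketch
       IMPLICIT NONE
       INTEGER, PARAMETER :: npts = 1000000
       INTEGER :: pt
       REAL(8), ALLOCATABLE :: rho(:), weight(:)
       ! The large grid arrays are allocated once and shared by every thread,
       ! so the memory footprint does not grow with the number of cores.
       ALLOCATE(rho(npts), weight(npts))
       rho = 1.0d0
       weight = 0.5d0
       ! SCHEDULE(STATIC) assigns each thread a contiguous block of grid points,
       ! which keeps accesses cache-line friendly and limits false sharing.
       !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(pt) SCHEDULE(STATIC)
       DO pt = 1, npts
          rho(pt) = rho(pt)*weight(pt)
       END DO
       !$OMP END PARALLEL DO
       PRINT *, 'sum(rho) =', SUM(rho)
    END PROGRAM shared_grid_sketch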


Conflict of interest statement

There are no conflicts of interest to declare.

Figures

Fig. 1
Fig. 1. The triple role of AIM methods. AIM methods provide atomistic descriptors that can be used (i) to understand the chemical properties of materials, (ii) to parameterize force fields used in classical atomistic simulations, and (iii) to provide dispersion interactions in some DFT + dispersion methods or to produce localized electron distributions for use in QC methods.
Fig. 2
Fig. 2. Six pillars of great performance for an AIM method.
Fig. 3
Fig. 3. Flow diagram of the DDEC6 method as implemented in the CHARGEMOL program. The dotted line indicates the path that would be followed if using atom-centered integration grids.
Fig. 4
Fig. 4. Flow diagram of DDEC6 bond order analysis.
Fig. 5
Fig. 5. Example OpenMP directives that create threads and divide the work. The loop is parallelized over j.
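For readers without access to the figure, a generic Fortran/OpenMP sketch of this directive pattern is given below; it is not the figure's actual code, and the array names are hypothetical. The PARALLEL directive creates the team of threads, and the DO directive divides the j iterations among them:

    PROGRAM parallel_do_sketch
       IMPLICIT NONE
       INTEGER, PARAMETER :: npts = 100000
       INTEGER :: j
       REAL(8), ALLOCATABLE :: rho(:), delta_rho(:)
       ALLOCATE(rho(npts), delta_rho(npts))
       rho = 0.0d0
       delta_rho = 1.0d0
       !$OMP PARALLEL DEFAULT(SHARED) PRIVATE(j)   ! creates the team of threads
       !$OMP DO SCHEDULE(STATIC)                   ! divides the j iterations among the threads
       DO j = 1, npts
          rho(j) = rho(j) + delta_rho(j)
       END DO
       !$OMP END DO
       !$OMP END PARALLEL
       PRINT *, 'sum(rho) =', SUM(rho)
    END PROGRAM parallel_do_sketch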
Fig. 6
Fig. 6. Elements for achieving an efficient shared memory parallelization of the DDEC6 method.
Fig. 7
Fig. 7. Illustration of the maximum number of large arrays that exist at the same time in each module. The green bars are for all systems. The yellow bars indicate the additional memory required for systems with collinear magnetism. The memory required for non-collinear magnetism is the sum of the green, yellow, and red bars.
Fig. 8
Fig. 8. Example of the arrangement of loop and matrix indices to maximize cache efficiency. This structure was used throughout the program.
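The general principle behind such an arrangement is that Fortran stores arrays in column-major order, so the innermost loop should run over the leftmost array index to keep memory accesses contiguous. The following is a generic sketch of that principle, not the figure's actual code, with hypothetical array names:

    PROGRAM loop_order_sketch
       IMPLICIT NONE
       INTEGER, PARAMETER :: npts = 10000, natoms = 20
       INTEGER :: pt, atom
       REAL(8), ALLOCATABLE :: partial_rho(:, :), rho(:)
       ALLOCATE(partial_rho(npts, natoms), rho(npts))
       rho = 1.0d0
       ! Cache friendly: the leftmost index (pt) varies fastest in the inner loop,
       ! so partial_rho(pt, atom) is filled in contiguous memory order
       ! (Fortran arrays are stored column-major).
       DO atom = 1, natoms
          DO pt = 1, npts
             partial_rho(pt, atom) = rho(pt)/DBLE(natoms)
          END DO
       END DO
       ! Reversing the nesting (atom innermost) would jump npts elements between
       ! successive writes and run much slower on large grids.
       PRINT *, 'check =', SUM(partial_rho)
    END PROGRAM loop_order_sketch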
Fig. 9
Fig. 9. Example parallelization code using (top) inefficient placement of the REDUCTION clause and (bottom) efficient placement of the REDUCTION clause. The bottom configuration is preferred because its REDUCTION clause is executed just once, while the top code executes the REDUCTION clause natoms times.
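As a generic illustration of this distinction (hypothetical array names; not the actual figure code), the two placements differ in how many times the parallel region, and hence its REDUCTION clause, is entered:

    PROGRAM reduction_placement_sketch
       IMPLICIT NONE
       INTEGER, PARAMETER :: npts = 1000, natoms = 20
       INTEGER :: pt, atom
       REAL(8), ALLOCATABLE :: partial_rho(:, :)
       REAL(8) :: total
       ALLOCATE(partial_rho(npts, natoms))
       partial_rho = 1.0d-3

       ! Inefficient: the parallel region (and its REDUCTION clause) is entered
       ! natoms times, paying the fork/join and reduction overhead on every pass.
       total = 0.0d0
       DO atom = 1, natoms
          !$OMP PARALLEL DO REDUCTION(+:total) PRIVATE(pt)
          DO pt = 1, npts
             total = total + partial_rho(pt, atom)
          END DO
          !$OMP END PARALLEL DO
       END DO
       PRINT *, 'inefficient placement, total =', total

       ! Efficient: the parallel region is entered once and the REDUCTION clause
       ! is executed only once; the atom loop iterations are divided among threads.
       total = 0.0d0
       !$OMP PARALLEL DO REDUCTION(+:total) PRIVATE(atom, pt)
       DO atom = 1, natoms
          DO pt = 1, npts
             total = total + partial_rho(pt, atom)
          END DO
       END DO
       !$OMP END PARALLEL DO
       PRINT *, 'efficient placement, total =', total
    END PROGRAM reduction_placement_sketch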
Fig. 10
Fig. 10. Parallelization timing and efficiency results for Ni bulk metal (1 atom per unit cell, PBE/planewave method). Three trials were performed to compute error bars for the computational times and parallelization efficiencies.
Fig. 11
Fig. 11. Parallelization timing and efficiency results for the B-DNA decamer (733 atoms per unit cell, PBE/planewave method). The lines mark the unit cell boundaries.
Fig. 12
Fig. 12. Computed DDEC6 bond orders in the guanine-cytosine and adenine-thymine base pairs. The hydrogen bonds are shown as dotted red lines.
Fig. 13
Fig. 13. Parallelization timing and efficiency results for the Mn12-acetate single molecule magnet (148 atoms per unit cell) that exhibits collinear magnetism. Left: PBE/planewave results. Middle: PBE/LANL2DZ results. Right: Chemical structure with Mn atoms colored by type: Mn type 1 (blue), Mn type 2 (red), Mn type 3 (yellow). The PBE/LANL2DZ computed ASMs were −2.56 (Mn type 1), 3.63 (Mn type 2), 3.57 (Mn type 3), and ≤0.077 in magnitude on all other atoms.
Fig. 14
Fig. 14. Computed bond orders (blue) and NACs (black) for the Mn12-acetate single molecule magnet using the PBE/LANL2DZ electron and spin densities. The atoms are colored as follows: Mn type 1 (blue), Mn type 2 (green), Mn type 3 (yellow), O (red), C (grey), H (pink). All of the atoms (i.e., the full chemical structure) were included in the DDEC6 calculation, but for display purposes only a portion of the atoms are shown here. The fragments shown here were chosen so that together they include all of the symmetry-unique atoms and bonds.
Fig. 15
Fig. 15. Parallelization timing and efficiency results for the Fe4O12N4C40H52 single molecule magnet (112 atoms per unit cell, PW91/planewave method) that exhibits non-collinear magnetism. Left: Chemical structure reproduced with permission of ref. 16 (© The Royal Society of Chemistry 2017). The atoms and spin magnetization vectors are colored by: Fe (orange), O (red), N (blue), C (gray), H (white). The Fe atoms exhibited DDEC6 ASM magnitudes of 2.33, and the ASM magnitudes were negligible on the other atoms. Right: Parallelization timing and efficiency results.
Fig. 16
Fig. 16. Computed bond orders (blue) and NACs (black) for the Fe4O12N4C40H52 single molecule magnet that exhibits non-collinear magnetism. The atoms are colored by element: Fe (yellow), O (red), C (grey), N (blue), H (pink). The distorted cuboidal Fe4O4 core is shown together with one adsorbed methanol molecule and one of the organic ligands. The dashed blue line illustrates the interaction between the methanol lone pair and the adjacent Fe atom (i.e., a Lewis acid–base interaction). The other three adsorbed methanol molecules and three organic ligands were included in the calculation but are omitted here for clarity of display.
Fig. 17
Fig. 17. The convergence rate of spin partitioning is highly predictable. A plot of the logarithm of the max_ASM_change versus the iteration number is linear, with the same slope for both collinear and non-collinear magnetism.
Fig. 18
Fig. 18. DDEC6 results for ice crystal supercells containing 12, 96, 324, 768, 1500, 2592, 4116, 6144, and 8748 atoms per unit cell (PBE/planewave method). These results show that both the computational time and the memory required scale linearly with increasing system size. Left: the computational time scales linearly with the number of atoms and decreases with increasing number of processors. Right: the total RAM required is almost independent of the number of processors. The predicted memory requirement is from eqn (17).
Fig. 19
Fig. 19. Time and memory required to complete DDEC6 analysis for a 324 atom ice unit cell with 12 700 800, 35 123 200, and 81 285 120 grid points. These calculations were run in serial mode on a single processor. Both the computational time and memory required scaled linearly with increasing number of grid points and were independent of the planewave cutoff energy.
Fig. 20
Fig. 20. Parallelization timing and efficiency results for ozone singlet, +1 doublet, and triplet states at different levels of theory: B3LYP/6-311+G*, CASSCF/AUG-cc-pVTZ, CCSD/AUG-cc-pVTZ, PW91/6-311+G*, and SAC-CI/AUG-cc-pVTZ. Serial calculations are compared to calculations run on 16 parallel processors.
Fig. 21
Fig. 21. Logarithm of bond order versus bond length for ozone molecule. Colors of data points indicate singlet (red), cation doublet (green), or triplet (blue) spin states. Shapes of data points indicate the exchange–correlation theory. Left: Bonds between the middle and outer atoms. Right: Bond between the two outer atoms.
Fig. 22
Fig. 22. Parallelization timing and efficiency results for: [Am@C60]+1, Cs@C60 (46 frozen Cs core electrons), Cs@C60 (54 simulated frozen Cs core electrons), [Eu@C60]+1, Li@C60, N@C60, and Xe@C60. These calculations have 61 atoms per unit cell and were computed using PBE/planewave.
Fig. 23
Fig. 23. Left: NACs (black) and bond orders (blue) of water dimer. Right: Parallelization timing and efficiency results for water dimer (6 atoms per unit cell, PBE/planewave method). The solid blue line is the overall parallelization efficiency. The dashed blue line is the parallelization efficiency excluding setting up density grids.

References

    1. Yang Q., Liu D., Zhong C., Li J.-R. Chem. Rev. 2013;113:8261–8323. doi: 10.1021/cr400005f.
    2. Erucar I., Manz T. A., Keskin S. Mol. Simul. 2014;40:557–570. doi: 10.1080/08927022.2013.829219.
    3. Murtola T., Bunker A., Vattulainen I., Deserno M., Karttunen M. Phys. Chem. Chem. Phys. 2009;11:1869–1892. doi: 10.1039/B818051B.
    4. Gates T. S., Odegard G. M., Frankland S. J. V., Clancy T. C. Compos. Sci. Technol. 2005;65:2416–2434. doi: 10.1016/j.compscitech.2005.06.009.
    5. Spivey J. J., Krishna K. S., Kumar C. S. S. R., Dooley K. M., Flake J. C., Haber L. H., Xu Y., Janik M. J., Sinnott S. B., Cheng Y. T., Liang T., Sholl D. S., Manz T. A., Diebold U., Parkinson G. S., Bruce D. A., de Jongh P. J. Phys. Chem. C. 2014;118:20043–20069.