Sci Rep. 2019 Jan 30;9(1):967.
doi: 10.1038/s41598-018-37253-8.

Molecular Complexity Calculated by Fractal Dimension

Modest von Korff et al. Sci Rep.

Abstract

Molecular complexity is an important characteristic of organic molecules in drug discovery, and how to calculate it has been discussed in the scientific literature for decades. It has long been known that the number of substructures that can be cut out of a molecular graph is important for this task. What had not been recognized is that the cut-out substructures are self-similar to the parent structure. Successively removing one bond and one atom yields a series of fragments of decreasing size, and such a series shows self-similarity like that of fractal objects. Here we used the number of distinct fragments to calculate the fractal dimension of the molecule. The fractal dimension of a molecule is a new matter constant that incorporates all features currently known to be important for describing molecular complexity. Furthermore, this is the first work to reveal the fractal nature of organic molecules.
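
The fragment counting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a molecule is given as a heavy-atom graph (element labels plus bonds), ignores bond orders and hydrogen saturation, and uses brute-force canonicalization that is only practical for very small molecules. The function names (`connected_bond_subsets`, `canonical_key`, `distinct_subgraph_counts`) are hypothetical. The sketch stops at counting distinct connected subgraphs per bond count, because the abstract does not state how the fractal dimension is derived from those counts.

```python
from itertools import permutations


def connected_bond_subsets(bonds):
    """Return every non-empty connected set of bond indices.

    Subsets are grown one bond at a time; a bond may be added when it
    shares an atom with the subset built so far.
    """
    found = {frozenset([i]) for i in range(len(bonds))}
    frontier = set(found)
    while frontier:
        grown = set()
        for subset in frontier:
            atoms = {a for i in subset for a in bonds[i]}
            for j, (a, b) in enumerate(bonds):
                if j not in subset and (a in atoms or b in atoms):
                    grown.add(subset | {j})
        frontier = grown - found
        found |= frontier
    return found


def canonical_key(elements, bonds, subset):
    """Brute-force canonical form: the smallest (labels, edges) pair
    over all orderings of the subgraph's atoms. Isomorphic labeled
    subgraphs get the same key. Assumption: small subgraphs only.
    """
    atoms = sorted({a for i in subset for a in bonds[i]})
    best = None
    for order in permutations(atoms):
        position = {atom: k for k, atom in enumerate(order)}
        labels = tuple(elements[atom] for atom in order)
        edges = tuple(sorted(
            tuple(sorted((position[a], position[b])))
            for a, b in (bonds[i] for i in subset)
        ))
        key = (labels, edges)
        if best is None or key < best:
            best = key
    return best


def distinct_subgraph_counts(elements, bonds):
    """Map bond count -> number of distinct connected subgraphs."""
    distinct = {}
    for subset in connected_bond_subsets(bonds):
        key = canonical_key(elements, bonds, subset)
        distinct.setdefault(len(subset), set()).add(key)
    return {k: len(v) for k, v in sorted(distinct.items())}


# n-Hexane as a chain of six carbons (hydrogens omitted).
hexane = (["C"] * 6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)])
# 2-Methylbutane: a four-carbon chain with a methyl branch on atom 1.
isopentane = (["C"] * 5, [(0, 1), (1, 2), (2, 3), (1, 4)])
# Ethanol: C-C-O.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])

for name, (elements, bonds) in [("n-hexane", hexane),
                                ("2-methylbutane", isopentane),
                                ("ethanol", ethanol)]:
    print(name, distinct_subgraph_counts(elements, bonds))
```

For n-hexane the sketch finds exactly one distinct subgraph per bond count, which matches the series of shrinking alkanes in Figure 1. Branching and heteroatoms raise the counts: 2-methylbutane has two distinct three-bond subgraphs (a chain and a star), and ethanol has two distinct one-bond subgraphs (C-C and C-O).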

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. n-Hexane and four distinct alkanes obtained by successively removing one bond and one atom, followed by saturation with hydrogen.
Figure 2. Heptanoic acid and its distinct substructures, grouped by bond count.
Figure 3. Glucose and some of its distinct subgraphs. Subgraphs A and B are each shown twice, in bent and elongated forms.
Figure 4. Koch curve.
Figure 5. n-Hexane in line notation, covered by boxes for the box counting algorithm.
Figure 6. The number of distinct subgraphs (N) for five example organic molecules versus the bond count of the subgraphs (γ). The y-axis (N) is on a logarithmic scale.
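
For orientation, the box counting referred to in Figures 4 and 5 is usually tied to the standard box-counting dimension. The definition below is the textbook one, not a formula quoted from the paper, and how the authors adapt it to molecular graphs is not stated in this record. Here $N(\varepsilon)$ is the number of boxes of side length $\varepsilon$ needed to cover the object.

```latex
D = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log (1/\varepsilon)}
% Koch curve: each refinement replaces a segment by 4 segments of 1/3 the length,
% so N(3^{-k}) = 4^{k} and
D_{\text{Koch}} = \frac{\log 4}{\log 3} \approx 1.2619
```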
