Proc Natl Acad Sci U S A. 2019 Jun 25;116(26):12660-12665.
doi: 10.1073/pnas.1816039116. Epub 2019 Jun 11.

Exploration of the chemical space and its three historical regimes

Eugenio J Llanos et al. Proc Natl Acad Sci U S A.

Erratum in

Abstract

Chemical research unveils the structure of chemical space, spanned by all chemical species, as documented in more than 200 y of scientific literature, now available in electronic databases. Very little is known, however, about the large-scale patterns of this exploration. Here we show, by analyzing millions of reactions stored in the Reaxys database, that chemists have reported new compounds in an exponential fashion from 1800 to 2015 with a stable 4.4% annual growth rate, in the long run neither affected by World Wars nor affected by the introduction of new theories. Contrary to general belief, synthesis has been the means to provide new compounds since the early 19th century, well before Wöhler's synthesis of urea. The exploration of chemical space has followed three statistically distinguishable regimes. The first one included uncertain year-to-year output of organic and inorganic compounds and ended about 1860, when structural theory gave way to a century of more regular and guided production, the organic regime. The current organometallic regime is the most regular one. Analyzing the details of the synthesis process, we found that chemists have had preferences in the selection of substrates and we identified the workings of such a selection. Regarding reaction products, the discovery of new compounds has been dominated by very few elemental compositions. We anticipate that the present work serves as a starting point for more sophisticated and detailed studies of the history of chemistry.
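To give a sense of scale for the reported 4.4% annual growth rate, the sketch below computes the doubling time that a constant rate implies (roughly 16 years) and shows one simple way such a rate could be estimated from yearly counts. It is an illustration only: the log-linear least-squares fit and the function names `doubling_time` and `fit_annual_growth` are assumptions made here, not the authors' fitting procedure, and the series is synthetic, not Reaxys data.

```python
import math


def doubling_time(annual_rate):
    """Years needed for output to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


def fit_annual_growth(years, counts):
    """Estimate a constant annual growth rate by least squares on log(counts).

    Assumed approach for illustration; not the fitting method of the paper.
    """
    logs = [math.log(c) for c in counts]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx  # growth of log(count) per year
    return math.exp(slope) - 1


# Synthetic series growing at exactly 4.4% per year (hypothetical data).
years = list(range(1800, 1811))
counts = [100 * 1.044 ** (y - 1800) for y in years]

rate = fit_annual_growth(years, counts)
print(f"fitted annual growth rate: {rate:.4f}")
print(f"doubling time at 4.4%: {doubling_time(0.044):.1f} years")
```

Fitting on the logarithm of the counts turns exponential growth into a straight line, so the slope gives the growth rate directly.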

Keywords: World War; chemical reactions; chemical space; history of chemistry; structural theory.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Growth of compounds. (A) Annual number of new compounds (black) and plot of the fitted equation (left axis [l.a.]; Materials and Methods and Eq. 2). WW1 and WW2 indicate the World War periods and the vertical dotted lines the change of production regime. Annual number of new compositions (l.a., red) and fraction of new synthesized compounds to the total of new ones (right axis [r.a.], blue) are shown. (B) Distance among compositions of successive years (l.a.; Materials and Methods; and SI Appendix, Fig. S3), box plots of the compositions every 10 y (SI Appendix, Fig. S4) with interwhisker distance accounting for 99.9% of the data, and most popular combinations of elements (compositions) reported in new compounds. Relative frequency of compositions is shown in the r.a. (C) Annual fraction of new compounds containing C, H, N, O, halogens, and platinum metals (PMs). Distributions are convoluted using the moving-average method with a 5-y window. (D) Annual number of new compounds reported in some specialized journals.
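The 5-y moving-average smoothing mentioned for panel C can be sketched as follows. The caption does not say how the window is aligned or how the ends of the series are handled, so the centered window and the shortened windows at the edges are assumptions made here, and `moving_average` is a hypothetical helper name.

```python
def moving_average(values, window=5):
    """Centered moving average; windows are shortened at the series edges.

    Centering and edge handling are assumptions, not taken from the paper.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical annual fractions, for illustration only.
annual_fraction = [1, 2, 3, 4, 5, 6, 7]
smoothed = moving_average(annual_fraction, window=5)
print(smoothed)
```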
Fig. 2.
Use and production of compounds. (A and B) Frequency distributions of participation of compounds in R different reactions as (A) substrates and (B) products. The left-hand side of the distributions corresponds to the many compounds appearing in few reactions, whereas the right-hand side corresponds to the few compounds appearing in many reactions. (C) Frequency distribution of uses of substrates that have participated in a one-substrate reaction. The following frequently used substrates are pinpointed: acetic anhydride (Ac₂O), methyl iodide (MeI), methanol (MeOH), ethanol (EtOH), water (H₂O), and formaldehyde (CH₂O). (D) Distribution of uses of substrates i and j that have participated in a two-substrate reaction. The size of each point is proportional to the frequency of use of the pair {i,j} in reactions. (D, Inset) Frequency distribution of use of j in two-substrate reactions whose use(i)103 is fixed, where some frequently used substrates are shown (C).
