Nat Commun. 2020 Aug 13;11(1):4068.
doi: 10.1038/s41467-020-17755-8.

Understanding the diversity of the metal-organic framework ecosystem

Seyed Mohamad Moosavi et al. Nat Commun. 2020.

Abstract

Millions of distinct metal-organic frameworks (MOFs) can be made by combining metal nodes and organic linkers. At present, over 90,000 MOFs have been synthesized and over 500,000 predicted. This raises the question of whether a new experimental or predicted structure adds new information. For MOF chemists, the chemical design space is a combination of pore geometry, metal nodes, organic linkers, and functional groups, but at present we do not have a formalism to quantify optimal coverage of this design space. In this work, we develop a machine learning method that quantifies similarities between MOFs in order to analyse their chemical diversity. This diversity analysis identifies biases in the databases, and we show that such bias can lead to incorrect conclusions. The formalism developed in this study provides a simple and practical guideline for judging whether new structures have the potential to offer new insights or constitute a relatively small variation of existing structures.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Description of the three domains of MOF chemistry.
Metal-centre revised autocorrelations (RACs) are computed on the crystal graph. Linker and functional-group RACs are computed on the corresponding linker molecular graph. Linker chemistry includes two types of RACs, namely full-linker RACs and linker-connecting-atom RACs. The graphs show the start atom (in green) and the nearby atom (in orange) used to define the RAC descriptors (see the “Methods” section).
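As a rough illustration of how a start atom and its nearby atoms define an RAC descriptor, the sketch below computes product and difference autocorrelations on a toy graph. It assumes the commonly used definitions (for each start atom i and each atom j at bond-path distance d, sum P_i·P_j or P_i − P_j); the function names, the three-atom Zn-O-C chain and the choice of nuclear charge as the atomic property are illustrative assumptions, not the authors' implementation.

```python
from collections import deque


def graph_distances(adjacency, start):
    """Breadth-first search: bond-path distance from `start` to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist


def racs(adjacency, prop, start_atoms, max_depth):
    """Return (product, difference) autocorrelations for depths 0..max_depth.

    Assumed definitions: for every start atom i and every atom j at graph
    distance d, add prop[i] * prop[j] to product[d] and prop[i] - prop[j]
    to difference[d].
    """
    product = [0.0] * (max_depth + 1)
    difference = [0.0] * (max_depth + 1)
    for i in start_atoms:
        for j, d in graph_distances(adjacency, i).items():
            if d <= max_depth:
                product[d] += prop[i] * prop[j]
                difference[d] += prop[i] - prop[j]
    return product, difference


# Hypothetical toy graph: a Zn-O-C chain, with nuclear charge as the property.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
nuclear_charge = {0: 30, 1: 8, 2: 6}

# Metal-centred descriptor: the only start atom is the metal (atom 0).
product, difference = racs(adjacency, nuclear_charge, start_atoms=[0], max_depth=2)
```

Restricting `start_atoms` to the metal, to the linker connecting atoms, or to all linker atoms is one way to obtain the different descriptor families named in the caption.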
Fig. 2. Map of the pore geometry of MOFs.
To project the geometric descriptor space of MOFs onto a 2D map, we use the t-distributed stochastic neighbour embedding (t-SNE) method (see Supplementary Note 6 for principal component analysis (PCA)). The t-SNE method preserves local similarity, so that similar structures are mapped close to each other in two dimensions. (a) The current design space, colour-coded by the largest included sphere. In (b), (c) and (d), the green, blue and red dots represent the materials in the CoRE-2019, BW-DB and ToBaCCo databases, respectively, overlaid on the design space shown in grey. PCA plots show a similar distribution of databases (see Supplementary Note 6).
Fig. 3. Diversity metrics and maps of different domains of MOF structures.
The t-SNE method was used to project the (a) pore geometry, (b) metal chemistry, (c) linker chemistry and (d) functional-group descriptor spaces onto 2D maps. Only descriptors up to the second coordination shell were included for metal chemistry, to emphasize the local metal chemistry environment. In each panel, the structures from the hypothetical databases are coloured and overlaid on the entire known design space, shown in grey. The radar charts show the three diversity metrics, variety (V), balance (B) and disparity (D), for the three databases. For this analysis, we first discretize the space into a fixed number of bins. Variety measures the number of bins that are sampled, balance measures the evenness of the distribution of materials among the sampled bins, and disparity measures the spread of the sampled bins (see the “Methods” section for more details).
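The three metrics can be made concrete with a small sketch. The caption defines them only qualitatively, so the formulas here are assumptions chosen for illustration: variety as the fraction of bins sampled, balance as Shannon entropy normalised by its maximum over the sampled bins, and disparity as the mean pairwise distance between sampled bin centres relative to that of all bins. The “Methods” section of the paper may define them differently; `diversity_metrics` and its arguments are hypothetical names.

```python
import math
from collections import Counter
from itertools import combinations


def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs (0.0 if fewer than 2 points)."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def diversity_metrics(bin_labels, bin_centres):
    """Return (variety, balance, disparity) for structures already assigned to bins.

    bin_labels  : one bin index per structure
    bin_centres : coordinates of every bin, indexed by bin index
    """
    counts = Counter(bin_labels)
    sampled = sorted(counts)

    # Variety: fraction of all bins that contain at least one structure.
    variety = len(sampled) / len(bin_centres)

    # Balance (assumed form): entropy of the occupancy, normalised to [0, 1].
    total = sum(counts.values())
    if len(sampled) > 1:
        entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
        balance = entropy / math.log(len(sampled))
    else:
        balance = 1.0  # a single occupied bin is trivially even

    # Disparity (assumed proxy): spread of sampled bins relative to all bins.
    reference = mean_pairwise_distance(bin_centres)
    spread = mean_pairwise_distance([bin_centres[b] for b in sampled])
    disparity = spread / reference if reference > 0 else 0.0

    return variety, balance, disparity


# Hypothetical example: four bins on a unit square, structures in only two of them.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
variety, balance, disparity = diversity_metrics([0, 0, 1, 1], centres)
```

In this toy case half of the bins are sampled and the structures are spread evenly over them, so the set scores low on variety but high on balance.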
Fig. 4. Database dependence of the importance of material characteristics.
Pie charts showing the SHapley Additive exPlanations (SHAP) values (importance of variables) for (a) low-pressure CO2 adsorption and (b) CH4 deliverable capacity. SHAP values were computed for the random forest regression models using a training set of CoRE-2019 and BW-20K, and all structures in ARABG-DB. For the chemical features, the importance of variables was summed over all RAC depths for each of the heuristic atomic properties. See the “Methods” section for the meaning of the labels. Similar values for the importance of variables were obtained using other techniques (see Supplementary Note 5).
Fig. 5. Impact of diversity in training data on transferability of models.
Parity plots of random forest models trained on the full feature set; rows and columns correspond to the training and test sets, respectively. The dashed lines represent parity. The training sets are of equal size in all cases, and the same structures were used as the test set in each column. The diverse set was selected with the MaxMin algorithm applied to all geometric and chemical descriptors. The colour bars show the number of structures in each cell of the histograms.
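For readers unfamiliar with MaxMin selection, the following is a minimal sketch of the standard greedy form: start from one structure, then repeatedly add the structure whose distance to its nearest already-selected neighbour is largest. The function name, the Euclidean metric, the choice of starting point and the toy descriptors are assumptions for illustration; the caption does not specify these details.

```python
import math


def maxmin_select(points, n_select, first=0):
    """Greedy MaxMin (farthest-point) selection; returns indices in selection order."""
    selected = [first]
    chosen = {first}
    # Distance from every point to its nearest selected point so far.
    nearest = [math.dist(p, points[first]) for p in points]

    while len(selected) < min(n_select, len(points)):
        # Pick the unselected point that is farthest from the current selection.
        candidate = max(
            (i for i in range(len(points)) if i not in chosen),
            key=lambda i: nearest[i],
        )
        selected.append(candidate)
        chosen.add(candidate)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[candidate]))

    return selected


# Hypothetical 2D descriptors: two near-duplicates and two distant structures.
descriptors = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (5.0, 0.0)]
diverse = maxmin_select(descriptors, n_select=3)
```

Here the near-duplicate at index 1 is left out, which is the behaviour that makes MaxMin useful for building a training set that covers the descriptor space.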
Fig. 6. Timeline of evolution of MOF geometry.
For each year, the red line shows the average relative distance, in the geometry descriptor space, to the MOFs reported in the Cambridge Structural Database (CSD) in the preceding years. The MOFs with the largest distance for some of the peaks are shown in the inset. The years on the timeline correspond to the year in which a structure was deposited in the CSD. The grey line shows the coordination polymers reported in the CSD before MOF chemistry emerged as a separate field of research, which is shown in red.
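One possible reading of this timeline can be sketched as follows. The caption does not say how the distance to earlier structures is aggregated or normalised, so this sketch assumes that each structure is scored by its Euclidean distance to the nearest structure deposited in any earlier year, and that the yearly value is the plain mean of those scores; no "relative" normalisation is attempted. `yearly_novelty` and the toy records are hypothetical.

```python
import math
from collections import defaultdict


def yearly_novelty(records):
    """Map each year to the mean nearest-neighbour distance to all earlier structures.

    records : iterable of (year, descriptor) pairs.
    The first year is omitted because it has no preceding structures.
    """
    by_year = defaultdict(list)
    for year, descriptor in records:
        by_year[year].append(descriptor)

    earlier = []   # descriptors deposited in all preceding years
    novelty = {}
    for year in sorted(by_year):
        if earlier:
            distances = [
                min(math.dist(x, y) for y in earlier) for x in by_year[year]
            ]
            novelty[year] = sum(distances) / len(distances)
        # Structures from this year only count as "preceding" for later years.
        earlier.extend(by_year[year])
    return novelty


# Hypothetical deposition records: (year, 2D geometry descriptor).
records = [
    (2000, (0.0, 0.0)),
    (2001, (3.0, 4.0)),
    (2001, (0.0, 1.0)),
    (2002, (3.0, 5.0)),
]
timeline = yearly_novelty(records)
```

A peak in such a timeline would indicate a year in which the deposited structures were, on average, unlike anything reported before.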
