Sci Data. 2022 Mar 30;9(1):128.
doi: 10.1038/s41597-022-01215-7.

Aluminum alloy compositions and properties extracted from a corpus of scientific manuscripts and US patents

Olivia P Pfeiffer et al. Sci Data.

Abstract

Researchers continue to explore and develop aluminum alloys with new compositions and improved performance characteristics. An understanding of the current design space can help accelerate the discovery of new alloys. We present two datasets: 1) chemical composition, and 2) mechanical properties for predominantly wrought aluminum alloys. The first dataset contains 14,884 entries on aluminum alloy compositions extracted from academic literature and US patents using text processing techniques, including 550 wrought aluminum alloys which are already registered with the Aluminum Association. The second dataset contains 1,278 entries on mechanical properties for aluminum alloys, where each entry is associated with a particular wrought series designation, extracted from tables in academic literature.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Overview of the methodology used to extract information from available literature and create useful visualizations of the aluminum alloy compositional and property spaces. The Aluminum Association is abbreviated as AA.
Fig. 2
Validation of composition information via dimension reduction. This scatter plot shows a 2D projection of the high-dimensional composition space for aluminum alloys, obtained via t-distributed stochastic neighbor embedding (t-SNE). The shape of each point indicates the source type of the alloy composition: alloys registered with AA are diamonds, alloys from Journal Texts are vertical line segments, alloys from Journal Tables are horizontal line segments, and alloys from Patents are dots. The color of each point indicates key alloy composition information: for Registered Alloys, color corresponds to the alloy series (1000 is black, 2000 is red, 3000 is orange, 4000 is green, 5000 is purple, 6000 is pink, 7000 is brown, 8000 is yellow); for all other source types, color corresponds to the principal alloying element (Cu is red, Mn is orange, Si is green, Mg is purple, Zn is brown, Cr is blue, Fe is turquoise, Ti is grey). Coloring is consistent with the definitions of the series (e.g., the 2000 series is primarily alloyed with Cu, thus both are red).
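The color rule for non-registered alloys in the caption above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that the "principal alloying element" is the non-Al element with the largest weight percent, and the function names, the fallback label, and the example composition are all hypothetical.

```python
# Colors for the principal alloying element, as listed in the Fig. 2 caption.
ELEMENT_COLORS = {
    "Cu": "red", "Mn": "orange", "Si": "green", "Mg": "purple",
    "Zn": "brown", "Cr": "blue", "Fe": "turquoise", "Ti": "grey",
}


def principal_alloying_element(composition):
    """Return the non-Al element with the largest weight percent, or None.

    Assumption: "principal" means largest wt% among non-Al elements.
    """
    alloying = {el: wt for el, wt in composition.items() if el != "Al" and wt > 0}
    if not alloying:
        return None
    return max(alloying, key=alloying.get)


def point_color(composition, default="unassigned"):
    """Map a composition to its caption color; `default` is a hypothetical fallback."""
    return ELEMENT_COLORS.get(principal_alloying_element(composition), default)


# Hypothetical composition in wt%, for illustration only (not from the dataset).
example = {"Al": 93.5, "Cu": 4.4, "Mg": 1.5, "Mn": 0.6}
print(principal_alloying_element(example), point_color(example))
```

Under this assumption, a Cu-dominated composition receives the same red as the 2000 series, which is the consistency the caption describes.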
Fig. 3
Verification of yield strength values. The swarm plot shows the alloy yield strengths extracted from journal article tables, grouped by the alloy’s series. The shaded regions define upper and lower yield strength bounds for each series (not available for the 4000 series), as provided by the educational software tool Ansys Granta Edupack, and they serve as validation for the points extracted from the literature.
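The bounds check described in the caption above amounts to testing each extracted value against a per-series interval. The sketch below is one way to express that, under stated assumptions: the record layout and function name are hypothetical, and the bounds are invented placeholders, not values from Ansys Granta Edupack.

```python
def check_against_bounds(records, bounds):
    """Split (series, yield_strength) records into within, outside, unchecked.

    A record is "unchecked" when no bounds exist for its series,
    as is the case for the 4000 series in the caption.
    """
    within, outside, unchecked = [], [], []
    for series, yield_strength in records:
        if series not in bounds:
            unchecked.append((series, yield_strength))
            continue
        lower, upper = bounds[series]
        if lower <= yield_strength <= upper:
            within.append((series, yield_strength))
        else:
            outside.append((series, yield_strength))
    return within, outside, unchecked


# Placeholder bounds in MPa, invented for illustration only.
PLACEHOLDER_BOUNDS = {"2000": (100.0, 500.0), "6000": (50.0, 400.0)}

# Hypothetical extracted records: (series, yield strength in MPa).
records = [("2000", 320.0), ("6000", 450.0), ("4000", 200.0)]

within, outside, unchecked = check_against_bounds(records, PLACEHOLDER_BOUNDS)
print(within, outside, unchecked)
```

Points that land in `outside` would be candidates for manual review rather than automatic rejection, since a value outside the reference range may reflect either an extraction error or an unusual alloy.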
Fig. 4
Verification of elongation and yield strength values for 5000, 6000, and 7000 series alloys. The shaded regions define bounding ellipses for each series, as provided by the educational software tool Ansys Granta Edupack, and serve as validation for the points extracted from the literature.
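Checking whether an extracted (yield strength, elongation) pair falls inside a bounding ellipse can be sketched as follows. This is an illustrative geometry check only: it assumes a rotated ellipse in linear coordinates, and the function name and all ellipse parameters are hypothetical rather than taken from Ansys Granta Edupack.

```python
import math


def inside_ellipse(x, y, cx, cy, a, b, angle_deg=0.0):
    """Return True if (x, y) lies in or on an ellipse.

    The ellipse is centered at (cx, cy) with semi-axes a and b,
    rotated counterclockwise by angle_deg.
    """
    theta = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    # Rotate the offset into the ellipse's own axis frame.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0


# Hypothetical ellipse: center (250 MPa, 12 %), semi-axes 100 MPa and 8 %.
print(inside_ellipse(300.0, 15.0, 250.0, 12.0, 100.0, 8.0))
print(inside_ellipse(400.0, 12.0, 250.0, 12.0, 100.0, 8.0))
```

If the reference chart uses logarithmic axes, the same test would be applied to log-transformed coordinates instead; that choice is left open here.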
