Sci Data. 2023 May 18;10(1):288.
doi: 10.1038/s41597-023-02174-3.

M100 ExaData: a data collection campaign on the CINECA's Marconi100 Tier-0 supercomputer

Andrea Borghesi et al. Sci Data.

Abstract

Supercomputers are the most powerful computing machines available to society. They play a central role in economic, industrial, and societal development. While they are used by scientists, engineers, decision-makers, and data analysts to computationally solve complex problems, supercomputers and their hosting datacenters are themselves complex, power-hungry systems. Improving their efficiency, availability, and resiliency is vital and the subject of many research and engineering efforts. Still, a major roadblock hinders researchers: a dearth of reliable data describing the behavior of production supercomputers. In this paper, we present the result of a ten-year-long project to design a monitoring framework (EXAMON) deployed on the Italian supercomputers at the CINECA datacenter. We disclose the first holistic dataset of a tier-0 Top10 supercomputer. It includes the management, workload, facility, and infrastructure data of the Marconi100 supercomputer for two and a half years of operation. The dataset (published via Zenodo) is the largest ever made public, with a size of 49.9 TB before compression. We also provide open-source software modules to simplify access to the data and provide direct usage examples.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Loading times comparison across compression configurations (Parquet). The data is 4 months of gpu0_core_temp (IPMI), from May to August 2022 inclusive, retrieving the “timestamp”, “value” and “node” columns. Memory usage of the resulting pandas DataFrame is 6.55 GB in all cases. The PyArrow io_threads and cpu_count were both set to 8 (Intel Xeon Gold 5220 CPU). The data was loaded using the PyArrow Dataset API.
Fig. 2
Number of metrics over time (per plugin). Yellow indicates the maximum number of metrics relative to that plugin, and black is the minimum.
Fig. 3
Number of samples collected on a daily basis (per plugin). Yellow indicates the maximum number of samples relative to that plugin, and black is the minimum.