Data Augmentation of a Corrosion Dataset for Defect Growth Prediction of Pipelines Using Conditional Tabular Generative Adversarial Networks
- PMID: 38473613
- PMCID: PMC10934152
- DOI: 10.3390/ma17051142
Abstract
Because of the nature of corrosion, corrosion datasets are often small and unevenly distributed, and collecting high-quality data is time-consuming and sometimes difficult. This work therefore introduces a novel data augmentation strategy that uses a conditional tabular generative adversarial network (CTGAN) to enhance pipeline corrosion datasets. First, the corrosion dataset undergoes data cleaning and variable correlation analysis. The CTGAN then generates external environmental factors, which serve as input variables for corrosion growth prediction, and a hybrid machine learning model generates the corresponding corrosion depth as the output variable. The generated (fake) data are merged with the original data to form the synthetic dataset. Finally, the strategy is verified by analyzing the synthetic dataset with several visualization methods and evaluation indicators. The results show that the synthetic and original datasets have similar distributions: the strategy learns the distribution of the real corrosion data and samples fake data that closely resemble it. Predictive models trained on the synthetic dataset perform better than models trained on the original dataset alone. In comparative tests, the proposed strategy outperformed other data generation methods.
Keywords: CTGAN; corroded pipeline; corrosion depth; data augmentation; machine learning.
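The data flow described in the abstract (generate input variables, predict depth for them, merge with the original rows) can be illustrated with a minimal sketch. This is not the authors' implementation: a bootstrap-with-noise sampler stands in for the CTGAN, an ordinary least-squares line stands in for the hybrid machine learning model, and the single input column, the toy values, and all function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (stand-in for the hybrid ML model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def sample_inputs(real_inputs, n, rng, jitter=0.05):
    """Stand-in for CTGAN sampling: resample real values and add small Gaussian noise."""
    spread = statistics.pstdev(real_inputs)
    return [rng.choice(real_inputs) + rng.gauss(0.0, jitter * spread) for _ in range(n)]


def augment(real_rows, n_fake, seed=0):
    """Return the original rows followed by n_fake generated rows."""
    rng = random.Random(seed)
    xs = [x for x, _ in real_rows]
    ys = [y for _, y in real_rows]
    a, b = fit_line(xs, ys)                      # depth model fitted on real data
    fake_xs = sample_inputs(xs, n_fake, rng)     # generated input variable
    fake_rows = [(x, a + b * x) for x in fake_xs]  # predicted depth for each fake input
    return real_rows + fake_rows                 # merged "synthetic" dataset


# Hypothetical toy data: (exposure_years, corrosion_depth_mm)
real = [(1.0, 0.2), (2.0, 0.35), (3.0, 0.6), (5.0, 0.9), (8.0, 1.6)]
synthetic = augment(real, n_fake=10)
print(len(synthetic))  # 15 rows: 5 original + 10 generated
```

In a real setting the sampler would be replaced by a trained tabular generator and the line fit by the chosen depth predictor; the merge step stays the same.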
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.