Sci Rep. 2025 Sep 24;15(1):32757.
doi: 10.1038/s41598-025-07403-w.

Machine learning model optimization for flood susceptibility zonation over the Kosi megafan, Himalayan foreland basin, India

Aman Arora et al. Sci Rep.

Abstract

The Kosi Megafan, located in the Himalayan foreland basin, is highly susceptible to devastating floods that threaten lives and livelihoods. Accurate flood susceptibility mapping is crucial for effective flood risk management in this dynamic environment. This study evaluates and optimizes five machine learning algorithms for flood susceptibility zonation within the Kosi Megafan: Random Subspace (RSP), J48, Maximum Entropy (MaxEnt), Artificial Neural Network (ANN-MLP), and Biogeography-Based Optimization (BBO). A dataset of 19 conditioning factors, derived from ALOS PALSAR DEM, Sentinel-2A, Landsat 5 TM, ENVISAT-1 Advanced Synthetic Aperture Radar (ASAR), and other ancillary data sources, was used to train and validate the models. Model performance was assessed using accuracy, true skill statistic (TSS), sensitivity, specificity, Kappa, AUC, and the Seed Cell Area Index (SCAI). The ANN-MLP model performed best on the validation dataset, achieving an accuracy of 0.982, a TSS of 0.964, and a Kappa of 0.964. MaxEnt also performed strongly, confirming its robustness in environmental modeling. The analysis of variable importance revealed that normalized difference vegetation index (NDVI), altitude, distance to road, rainfall, and distance to river were the most influential factors governing flood susceptibility in the region. The generated flood susceptibility maps, particularly those derived from the ANN-MLP and MaxEnt models, provide valuable tools for identifying high-risk areas and informing flood mitigation strategies.
This study highlights the potential of advanced machine learning techniques, especially ANN-MLP, to improve the accuracy and reliability of flood susceptibility assessments in complex and dynamic environments like the Kosi Megafan, supporting more effective flood risk management and disaster preparedness.
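
The threshold-based metrics named in the abstract (accuracy, sensitivity, specificity, TSS, and Kappa) all follow from a binary confusion matrix. The sketch below uses their standard textbook definitions; it is not the authors' code. The counts in the usage example are hypothetical, not the study's validation data, and were chosen only to illustrate how an accuracy of 0.982 can coincide with TSS and Kappa both equal to 0.964 when the flood and non-flood classes are balanced and the two error types are equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary flood / non-flood classification (hypothetical helper)."""
    tp: int  # flood points predicted as flood
    fn: int  # flood points predicted as non-flood
    tn: int  # non-flood points predicted as non-flood
    fp: int  # non-flood points predicted as flood


def classification_metrics(cm: ConfusionMatrix) -> dict:
    """Return accuracy, sensitivity, specificity, TSS, and Cohen's kappa."""
    n = cm.tp + cm.fn + cm.tn + cm.fp
    accuracy = (cm.tp + cm.tn) / n
    sensitivity = cm.tp / (cm.tp + cm.fn)  # true positive rate
    specificity = cm.tn / (cm.tn + cm.fp)  # true negative rate
    tss = sensitivity + specificity - 1.0  # true skill statistic

    # Agreement expected by chance, from the row and column marginals.
    p_flood = ((cm.tp + cm.fn) / n) * ((cm.tp + cm.fp) / n)
    p_non_flood = ((cm.tn + cm.fp) / n) * ((cm.tn + cm.fn) / n)
    expected = p_flood + p_non_flood
    kappa = (accuracy - expected) / (1.0 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": tss,
        "kappa": kappa,
    }


# Hypothetical balanced validation set: 500 flood and 500 non-flood points.
example = ConfusionMatrix(tp=491, fn=9, tn=491, fp=9)
for name, value in classification_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

With balanced classes and symmetric errors, chance agreement is 0.5, so Kappa reduces to 2 × accuracy − 1, which is also the value of TSS in that case. With imbalanced classes the two metrics generally diverge.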

Keywords: Climate change; Disaster management; Flood modelling; Flood risk; Himalayan foothill zone; Land use changes; Natural hazards; Urbanization.

Conflict of interest statement

Declaration. Competing interests: The authors confirm that they have no known financial conflicts of interest or personal relationships that could have influenced the work presented in this paper.

Figures

Fig. 1
(A) Location map of Central Middle Ganga Plain (CMGP); (B) Mean monthly discharge during the monsoon period (June to October) at Gandhighat station for 2008. (NOTE: This map was composed using ArcMap 10.8 in July 2024. The corresponding author (MP) thanks UCRD, Chandigarh University, Mohali, Punjab, India, for providing the lab facilities, e.g. a licensed version of ArcGIS 10.8.)
Fig. 2
Flowchart of methodology.
Fig. 3
Flood inventory preparation.
Fig. 4
Panels A to I show: Altitude, Distance to Lineament, Distance to River, Distance to Road, Geomorphology, Longitudinal Curvature, Land Use Land Cover, Normalized Difference Vegetation Index, Plan Curvature. Panels J to S show: Profile Curvature, Average Annual Rainfall, Slope (Degree), Slope Aspect, Soil Type, Stream Density, Stream Potential Index, Topographical Potential Index, Topographical Ruggedness Index, Topographical Wetness Index. (NOTE: These conditioning factor maps were generated by the corresponding author (MP) while he was working at UCRD, Chandigarh University, Mohali, Punjab, India, and he thanks the organisation (CU) for providing the lab facilities, e.g. a licensed version of ArcGIS 10.8.)
Fig. 5
Variable importance plot based on Mean Decrease Gini from Random Forest.
Fig. 6
Flood susceptibility maps generated by the (A) ANN-MLP, (B) BBO, (C) J48, (D) MaxEnt, and (E) RSP models. (NOTE: These maps were generated by the corresponding author (MP) while he was working at UCRD, Chandigarh University, Mohali, Punjab, India, and he thanks the organisation (CU) for providing the lab facilities, e.g. a licensed version of ArcGIS 10.8.)
Fig. 7
Percentage of the study area covered by each flood susceptibility class for each model.
Fig. 8
AUC curves for (A) Training and (B) Validation datasets.
Fig. 9
(A) Frequency Ratio and (B) SCAI diagrams for model validation.
