J Multidiscip Healthc. 2025 Jul 31:18:4411-4430.
doi: 10.2147/JMDH.S523137. eCollection 2025.

Big Data Analytics for Uncovering Voxel Connectivity Patterns in Attention Deficit Hyperactivity Disorder

Rezzy Eko Caraka et al. J Multidiscip Healthc.

Abstract

Introduction: Attention Deficit Hyperactivity Disorder (ADHD) is a complex neurodevelopmental condition characterized by heterogeneous brain activity patterns. Identifying key brain regions associated with ADHD remains a challenge due to the high dimensionality and complexity of neuroimaging data. This study aims to apply advanced machine learning techniques to uncover critical features and improve classification performance in ADHD diagnosis.

Methods: We analyzed 5937 brain voxels aggregated from neuroimaging records of patients diagnosed with ADHD. Feature selection was performed using Boruta, Random Forest in combination with DALEX explainability tools, and Neural Networks. Dimensionality reduction and clustering techniques, including Principal Component Analysis (PCA), KMeans, and MCLUST, were used to explore underlying voxel patterns. The performance of three activation functions (ReLU, Sigmoid, and Tanh) was evaluated within deep neural networks.
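
As an illustration of the three activation functions compared in the Methods, the following is a minimal Python sketch using only the standard library. The `dense_layer` helper and its weights are hypothetical and are not taken from the study; the sketch only shows how the choice of activation changes the output of one fully connected layer.

```python
import math

def relu(x: float) -> float:
    # ReLU: passes positive values through, clips negatives to zero.
    return max(0.0, x)

def sigmoid(x: float) -> float:
    # Logistic sigmoid, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh(x: float) -> float:
    # Hyperbolic tangent: squashes values into (-1, 1).
    return math.tanh(x)

def dense_layer(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit.

    Hypothetical helper for illustration; `weights` holds one row per unit.
    """
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Toy input and weights (assumed values, not from the paper).
inputs = [1.0, -1.0]
weights = [[1.0, 2.0]]
biases = [0.0]

outputs = {
    name: dense_layer(inputs, weights, biases, fn)
    for name, fn in (("relu", relu), ("sigmoid", sigmoid), ("tanh", tanh))
}
```

With these toy values the pre-activation is -1, so ReLU outputs zero while Sigmoid and Tanh return small non-zero values. This difference in how negative inputs are handled is the property that separates the three functions.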

Results: Several key brain regions, including the Fusiform Gyrus, Thalamus, and Superior Temporal Gyrus, were identified as significant predictors for ADHD. The integration of machine learning models demonstrated improved classification accuracy, with ReLU-based neural networks outperforming others in most evaluation metrics.

Discussion: The study demonstrates the potential of a robust, integrated machine learning framework to analyze high-dimensional neuroimaging data and identify biologically relevant markers of ADHD. These findings contribute to the growing body of evidence supporting data-driven approaches in neuropsychiatric diagnosis and may inform future clinical decision-making and personalized interventions.

Keywords: ADHD; activation function; brain voxels; deep learning; feature selection; machine learning; neuroimaging.

Conflict of interest statement

The authors have no competing interests to declare that are relevant to the content of this article.

Figures

Figure 1
Step construction. Alt text: Flowchart illustrating the process of model construction using a brain dataset. The process begins with K-fold cross-validation, where the dataset is split into K folds. A model is trained on the training data and evaluated on the validation data, with this process repeated for all folds. The average evaluation result is then calculated. Based on this result, feature selection may be applied using methods such as Random Forest Importance, Backward Selection, Forward Selection, or Stepwise Selection. If no feature selection is applied, all variables are used. Different models, including Random Forest, XGBoost, Boruta, Lasso, SVM, and Neural Networks, are trained using the selected features. If all variables are used, a Deep Neural Network is trained. The final step involves using the trained models for prediction. Diamonds represent decision points, and rectangles represent process steps, connected by arrows indicating the flow of data.
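
The K-fold loop described in this flowchart (split, train, validate, repeat, average) can be sketched in plain Python as follows. This is an assumption-laden illustration: the majority-class "model", the `accuracy` scorer, and the toy data are hypothetical stand-ins, not the models or data used in the study.

```python
import random
from statistics import mean

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(X, y, k, fit, score, seed=0):
    """Train on k-1 folds, evaluate on the held-out fold, average the scores."""
    folds = k_fold_indices(len(X), k, seed)
    results = []
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        results.append(score(model, [X[j] for j in val_idx], [y[j] for j in val_idx]))
    return mean(results)

# Hypothetical stand-in model: always predict the most frequent training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)

def accuracy(model, X_val, y_val):
    return sum(1 for label in y_val if label == model) / len(y_val)

# Toy data (assumed): 10 samples, 7 of class 0 and 3 of class 1.
X = [[float(i)] for i in range(10)]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
avg_accuracy = cross_validate(X, y, k=5, fit=fit_majority, score=accuracy)
```

Any model with a fit step and a scoring step could be passed in place of the stand-ins; the averaging over folds is the part that corresponds to the "average evaluation result" in the flowchart.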
Figure 2
Brain network connectivity visualizations from selected individuals. Each panel shows functional connectivity patterns derived from neuroimaging data, illustrating inter-individual variability across the sample. The functional brain network connectivity of ten individuals is visualized using circular plots, arranged in two rows of five, each representing the inter-regional correlations across labeled brain regions. Connection strength and directionality are encoded through color: blue for positive correlations and red for negative ones. Brain regions are denoted using standardized abbreviations such as “PCL_R_1_3” and “THA_L_3”, and connections are depicted as lines bridging these regions. Below each plot, demographic information is provided: sex (M or F) and age group (ranging from 22–25 to 31–35 years). Visual inspection reveals considerable variability in the density and distribution of network connectivity across individuals, despite similarities in age or sex. Some individuals display predominantly positive connectivity (e.g., Individuals 1, 3, 7, and 9), while others exhibit more balanced or even negative connectivity patterns (e.g., Individuals 5, 6, and 10). This variability underscores the heterogeneity of functional brain networks across individuals and highlights the complex interplay between demographic and possibly intrinsic neurological factors. A color scale adjacent to each plot provides a reference for interpreting the range of correlation values.
Figure 3
Best feature age (a) and sex (b) using PCA. Alt Text: Two sets of PCA scatterplots comparing data with and without scaling. (a) shows PCA results highlighting features related to age, while (b) shows PCA results highlighting features related to sex. Each panel has two plots: the left without scaling, showing data points tightly clustered with a few outliers, and the right with scaling, displaying a more dispersed distribution of points. Red arrows indicate key features.
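
To illustrate why PCA results can differ with and without scaling, the sketch below computes the first principal component of two toy features on very different scales. It uses only the Python standard library, with power iteration standing in for a full eigendecomposition; the data and helper names are assumptions for illustration and do not reproduce the study's analysis.

```python
from statistics import mean, pstdev

def standardize(columns):
    """Z-score each feature column (zero mean, unit standard deviation)."""
    scaled = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        scaled.append([(v - m) / s for v in col])
    return scaled

def covariance_matrix(columns):
    """Sample covariance matrix of the feature columns."""
    n = len(columns[0])
    centered = [[v - mean(col) for v in col] for col in columns]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
        for ci in centered
    ]

def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    vec = [1.0] * len(cov)
    for _ in range(iterations):
        vec = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec

# Toy features (assumed): one on a small scale, one on a large scale.
feature_small = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_large = [100.0, 300.0, 200.0, 500.0, 400.0]
columns = [feature_small, feature_large]

pc1_unscaled = first_component(covariance_matrix(columns))
pc1_scaled = first_component(covariance_matrix(standardize(columns)))
```

Without scaling, the first component is dominated almost entirely by the large-scale feature; after standardization both features load equally. This is one reason unscaled and scaled PCA plots of the same data can look very different.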
Figure 4
Best KMeans cluster age (a) and sex (b). Alt text: PCA biplot with two outliers (“2487_2” and “PrG_89”). Right: Scaled PCA with dense clusters and pink labels and PCA biplot with three outliers (“Vermis X”, “LOcC_R_4_2”, “IFG_161”).
Figure 5
Best XGBoost after scaling.
Figure 6
Best feature selection using Boruta. Alt text: Feature importance plot from the Boruta algorithm. Green points represent important features, black circles are shadow features, and yellow points are tentative features.
Figure 7
Best simulation using various machine learning. Alt text: This figure presents the performance and interpretability of neural networks using four different methods. (a) shows the SHAP plot, which illustrates the contribution of each feature to the model’s predictions, highlighting both the magnitude and direction of their impact. (b) displays the LIME plot, offering local interpretability by showing how individual features influence specific predictions. (c) presents the DALEX plot, which visualizes feature importance and model performance across various feature values. Lastly, (d) shows the Partial Dependence Plot (PDP), illustrating the marginal effect of selected features on the predicted outcome, thereby revealing the relationships between input variables and model predictions.
Figure 8
Deep neural network evaluation metrics: computation time (a) and balanced accuracy (b). Alt text: This figure compares two key performance metrics of the deep neural network. (a) illustrates the time required for model computation, providing insight into the efficiency and scalability of the neural network. (b) presents the balanced accuracy, which accounts for class imbalance, offering a more reliable measure of model performance across categories. Together, these metrics highlight the trade-off between computational speed and predictive accuracy.
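
Balanced accuracy, as referred to in panel (b), is commonly defined as the mean of per-class recall. The following minimal Python sketch assumes that definition; the labels are hypothetical toy values, not results from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(1 for i in idx if y_pred[i] == cls) / len(idx))
    return sum(recalls) / len(recalls)

def plain_accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Toy imbalanced labels (assumed): 8 positives, 2 negatives,
# and a classifier that always predicts the majority class.
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 10

acc = plain_accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
```

In this toy case plain accuracy is 0.8 while balanced accuracy is 0.5, which shows why the balanced metric is less flattering to a model that ignores the minority class.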