Micromachines (Basel). 2021 Nov 30;12(12):1504. doi: 10.3390/mi12121504.

Nonlinear Hyperparameter Optimization of a Neural Network in Image Processing for Micromachines

Mingming Shen et al.

Abstract

Deep neural networks are widely used in image processing for micromachines, for example in 3D shape detection for microelectronic high-speed dispensing and in object detection for microrobots. It is well known that hyperparameters and their interactions affect neural network model performance. Exploiting the mathematical correlations between hyperparameters and the corresponding deep learning model to adjust hyperparameters intelligently is the key to obtaining an optimal solution from a deep neural network model. Leveraging these correlations also helps to open the "black box" of deep learning by revealing the mechanisms behind its mathematical principles. However, there is no complete framework that combines mathematical derivation with experimental verification to quantify the impact of hyperparameters on the performance of deep learning models. Therefore, in this paper, the authors analyzed the mathematical relationships among four hyperparameters: the learning rate, batch size, dropout rate, and convolution kernel size. A generalized multiparameter mathematical correlation model was also established, which showed that the interactions among these hyperparameters play an important role in the neural network's performance. The proposal was validated through a series of convolutional neural network experiments on the MNIST dataset. Notably, this research can help establish a universal multiparameter mathematical correlation model to guide the deep learning hyperparameter tuning process.
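As a rough illustration of the experimental setup summarized above (a minimal sketch, not the authors' code; the paper does not specify a framework, so PyTorch and the architecture details are assumptions), the snippet below trains a small convolutional network on MNIST while exposing the four hyperparameters studied: learning rate (lr), batch size (m), dropout rate (q), and convolution kernel size (ke).

```python
# Minimal sketch: a small CNN on MNIST parameterized by the four hyperparameters
# discussed in the paper. Architecture, optimizer, and defaults are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def build_model(q: float, ke: int) -> nn.Module:
    """One conv block plus a linear classifier; q = dropout rate
    (interpreted as the probability of dropping a unit), ke = kernel size."""
    return nn.Sequential(
        nn.Conv2d(1, 32, kernel_size=ke, padding=ke // 2),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Dropout(q),
        nn.Flatten(),
        nn.Linear(32 * 14 * 14, 10),
    )

def train(lr: float = 1e-3, m: int = 64, q: float = 0.5, ke: int = 3,
          epochs: int = 1) -> nn.Module:
    data = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=m, shuffle=True)
    model = build_model(q, ke)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)  # training cross-entropy loss
            loss.backward()
            opt.step()
    return model

if __name__ == "__main__":
    # Example of varying the hyperparameters, as in the paper's experiments.
    train(lr=0.01, m=128, q=0.25, ke=5)
```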

Keywords: deep neural network; hyperparameters; image processing; multiparameter mathematical correlation model.


Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1. Training accuracies obtained under different q (%).
Figure 2. Model convergence under different q: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 3. Training accuracies obtained under different m (%).
Figure 4. Model convergence under different m: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 5. Training accuracies obtained under different lr (%).
Figure 6. Model convergence under different lr: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 7. Training accuracies obtained under different ke (%).
Figure 8. Model convergence under different ke: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 9. Training accuracies and test accuracies obtained for the confirmatory experiment (%).
Figure 10. Model convergence for the confirmatory experiment: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 11. Statistical box plots of the LRelu, Relu, and Selu models at different steps over 10 repeated experiments: (a) test accuracy of LRelu; (b) training loss of LRelu; (c) test accuracy of Relu; (d) training loss of Relu; (e) test accuracy of Selu; (f) training loss of Selu.

