Micromachines (Basel). 2021 Nov 30;12(12):1504. doi: 10.3390/mi12121504.

Nonlinear Hyperparameter Optimization of a Neural Network in Image Processing for Micromachines

Mingming Shen et al.

Abstract

Deep neural networks are widely used in image processing for micromachines, for example in 3D shape detection for microelectronic high-speed dispensing and in object detection for microrobots. It is well known that hyperparameters and their interactions affect neural network model performance. Exploiting the mathematical correlations between hyperparameters and the corresponding deep learning model to adjust hyperparameters intelligently is the key to obtaining an optimal solution from a deep neural network model. Leveraging these correlations also helps to open the "black box" of deep learning by revealing the mechanisms behind its mathematical principles. However, there is no complete framework that combines mathematical derivation with experimental verification to quantify the impact of hyperparameters on the performance of deep learning models. Therefore, in this paper, the authors analyzed the mathematical relationships among four hyperparameters: the learning rate, batch size, dropout rate, and convolution kernel size. A generalized multiparameter mathematical correlation model was also established, which showed that the interactions among these hyperparameters play an important role in the neural network's performance. The proposal was validated through a series of convolutional neural network experiments on the MNIST dataset. Notably, this research can help establish a universal multiparameter mathematical correlation model to guide the deep learning hyperparameter tuning process.
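As a rough illustration of the experimental setup summarized above (a minimal sketch, not the authors' code; the paper does not specify a framework, so PyTorch and the architecture details are assumptions), the snippet below trains a small convolutional network on MNIST while exposing the four hyperparameters studied: learning rate (lr), batch size (m), dropout rate (q), and convolution kernel size (ke).

```python
# Minimal sketch: a small CNN on MNIST parameterized by the four hyperparameters
# discussed in the paper. Architecture, optimizer, and defaults are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def build_model(q: float, ke: int) -> nn.Module:
    """One conv block plus a linear classifier; q = dropout rate
    (interpreted as the probability of dropping a unit), ke = kernel size."""
    return nn.Sequential(
        nn.Conv2d(1, 32, kernel_size=ke, padding=ke // 2),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Dropout(q),
        nn.Flatten(),
        nn.Linear(32 * 14 * 14, 10),
    )

def train(lr: float = 1e-3, m: int = 64, q: float = 0.5, ke: int = 3,
          epochs: int = 1) -> nn.Module:
    data = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=m, shuffle=True)
    model = build_model(q, ke)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)  # training cross-entropy loss
            loss.backward()
            opt.step()
    return model

if __name__ == "__main__":
    # Example of varying the hyperparameters, as in the paper's experiments.
    train(lr=0.01, m=128, q=0.25, ke=5)
```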

Keywords: deep neural network; hyperparameters; image processing; multiparameter mathematical correlation model.


Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1. Training accuracies obtained under different q (%).
Figure 2. Model convergence under different q: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 3. Training accuracies obtained under different m (%).
Figure 4. Model convergence under different m: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 5. Training accuracies obtained under different lr (%).
Figure 6. Model convergence under different lr: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 7. Training accuracies obtained under different ke (%).
Figure 8. Model convergence under different ke: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 9. Training accuracies and test accuracies obtained for the confirmatory experiment (%).
Figure 10. Model convergence for the confirmatory experiment: (a) convergence of training cross-entropy loss; (b) convergence of test accuracy.
Figure 11. Statistical box plots of the LRelu, Relu, and Selu models at different steps over 10 repeated experiments: (a) test accuracy of LRelu; (b) training loss of LRelu; (c) test accuracy of Relu; (d) training loss of Relu; (e) test accuracy of Selu; (f) training loss of Selu.

