Alleviation of Vanishing Gradient Problem Using Parametric Activation Functions


KIPS Transactions on Software and Data Engineering, Vol. 10, No. 10, pp. 407-420, Oct. 2021
https://doi.org/10.3745/KTSDE.2021.10.10.407
Keywords: Deep Neural Network, Vanishing Gradient Problem, Parametric Activation Function, Backpropagation, Learning
Abstract

Deep neural networks are widely used to solve a variety of problems. However, a network with many hidden layers frequently suffers from vanishing or exploding gradients, which are a major obstacle to training it. In this paper, we propose a parametric activation function to alleviate the vanishing gradient problem that nonlinear activation functions can cause. The proposed function is obtained by applying parameters that transform the scale and location of the activation function according to the characteristics of the input data, so that the loss function can be minimized through backpropagation without limiting the derivative of the activation function. We compared the original nonlinear activation functions with their parametric counterparts on an XOR problem with 10 hidden layers and an MNIST classification problem with 8 hidden layers, and confirmed that the proposed parametric activation function performs better at alleviating the vanishing gradient.
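The abstract does not state the functional form of the parametric activation function, so the following is only a minimal sketch under an assumed form: a sigmoid whose input is shifted by a location parameter and multiplied by a scale parameter, f(x) = sigmoid(scale * (x - location)). The names parametric_sigmoid, scale, and location are hypothetical and are not taken from the paper. The sketch illustrates why such parameters could help: the derivative of the plain sigmoid never exceeds 0.25, whereas a trainable scale changes that bound, and both parameters receive their own gradients during backpropagation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def parametric_sigmoid(x, scale, location):
    """Assumed form (not from the paper): sigmoid(scale * (x - location))."""
    return sigmoid(scale * (x - location))


def parametric_sigmoid_grads(x, scale, location):
    """Partial derivatives of the output w.r.t. x, scale, and location."""
    s = parametric_sigmoid(x, scale, location)
    ds = s * (1.0 - s)  # sigmoid'(z) at z = scale * (x - location)
    return {
        "x": scale * ds,               # gradient passed back to earlier layers
        "scale": (x - location) * ds,  # gradient used to update the scale
        "location": -scale * ds,       # gradient used to update the location
    }


# Plain sigmoid (scale=1, location=0): the derivative peaks at 0.25.
plain = parametric_sigmoid_grads(0.0, 1.0, 0.0)["x"]

# With scale=4 the peak derivative w.r.t. x becomes 1.0.
scaled = parametric_sigmoid_grads(0.0, 4.0, 0.0)["x"]

print(plain, scaled)
```

In a deep stack, the backpropagated gradient is a product of one such factor per layer, so a per-layer bound of 0.25 shrinks the product quickly; under the assumed form, a learned scale lets the network adjust that factor rather than leaving it fixed.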




Cite this article
[IEEE Style]
Y. M. Ko and S. W. Ko, "Alleviation of Vanishing Gradient Problem Using Parametric Activation Functions," KIPS Transactions on Software and Data Engineering, vol. 10, no. 10, pp. 407-420, 2021. DOI: https://doi.org/10.3745/KTSDE.2021.10.10.407.

[ACM Style]
Young Min Ko and Sun Woo Ko. 2021. Alleviation of Vanishing Gradient Problem Using Parametric Activation Functions. KIPS Transactions on Software and Data Engineering, 10, 10, (2021), 407-420. DOI: https://doi.org/10.3745/KTSDE.2021.10.10.407.