Analysis of momentum term in back-propagation

Masafumi Hagiwara, Akira Sato

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

The back-propagation algorithm has been applied to many fields and has demonstrated the large capability of neural networks. Many people use the back-propagation algorithm together with a momentum term to accelerate its convergence. However, despite its importance for theoretical studies, the theoretical background of the momentum term has so far been unknown. First, this paper clearly explains the theoretical origin of the momentum term in the back-propagation algorithm, for both batch-mode learning and pattern-by-pattern learning. We prove that the back-propagation algorithm with a momentum term can be derived from the following two assumptions: 1) the cost function is $E_n = \sum_{\mu=1}^{n} \alpha^{n-\mu} E_\mu$, where $E_\mu$ is the sum of squared errors at the output layer at the $\mu$th learning step and $\alpha$ is the momentum coefficient; 2) the latest weights are used in calculating the cost function $E_n$. Next, we derive a simple relationship among the momentum coefficient, the learning rate, and the learning speed, and then develop the discussion further with computer simulations.
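To make the update rule under analysis concrete, below is a minimal sketch (not the authors' code) of gradient descent with a momentum term, $\Delta w_n = -\eta \, \partial E_n / \partial w + \alpha \, \Delta w_{n-1}$, applied to a toy least-squares problem; the data, learning rate eta, and momentum coefficient alpha are illustrative choices, not values from the paper:

```python
# Sketch of gradient descent with a momentum term:
#     dw_n = -eta * grad(E_n) + alpha * dw_{n-1}
# on a toy least-squares cost. All constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # toy inputs
w_true = np.array([1.0, -2.0, 0.5])     # weights to recover
y = X @ w_true                          # toy targets

eta, alpha = 0.01, 0.9                  # learning rate, momentum coefficient
w = np.zeros(3)                         # current weights
dw = np.zeros(3)                        # previous weight change

for n in range(200):
    err = X @ w - y
    grad = X.T @ err / len(y)           # gradient of the squared error E_n
    dw = -eta * grad + alpha * dw       # momentum update
    w += dw

print("estimated weights:", w)
```

When successive gradients are nearly constant, this update behaves like plain gradient descent with an effective learning rate of roughly eta / (1 - alpha), which is the kind of momentum/learning-rate/learning-speed relationship the abstract alludes to.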

Original language: English
Pages (from-to): 1080-1086
Number of pages: 7
Journal: IEICE Transactions on Information and Systems
Volume: E78-D
Issue number: 8
Publication status: Published - 1995 Aug 1

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence
