Analysis of momentum term in back-propagation

Masafumi Hagiwara, Akira Sato

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

The back-propagation algorithm has been applied to many fields and has demonstrated the large capability of neural networks. Many people use the back-propagation algorithm together with a momentum term to accelerate its convergence. However, despite its importance for theoretical studies, the theoretical background of the momentum term has so far been unknown. First, this paper clearly explains the theoretical origin of the momentum term in the back-propagation algorithm, for both batch-mode learning and pattern-by-pattern learning. We prove that the back-propagation algorithm with a momentum term can be derived from the following two assumptions: 1) the cost function is E_n = Σ_{μ=1}^{n} α^{n−μ} E_μ, where E_μ is the sum of squared errors at the output layer at the μth learning step and α is the momentum coefficient; 2) the latest weights are assumed in calculating the cost function E_n. Next, we derive a simple relationship among the momentum coefficient, the learning rate, and the learning speed, and give further discussion with computer simulation.
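The core identity behind the abstract is that minimizing the exponentially weighted cost E_n = Σ_{μ=1}^{n} α^{n−μ} E_μ (with gradients taken at the latest weights) reproduces the familiar momentum update Δw_n = −η g_n + α Δw_{n−1}. A minimal numerical sketch (not from the paper; η, α, and the random gradients are arbitrary choices for illustration) checks that the recursion and the weighted sum agree:

```python
# Illustrative check: the momentum recursion  dw[n] = -eta*g[n] + alpha*dw[n-1]
# unrolls to the weighted gradient sum  -eta * sum(alpha**(n-mu) * g[mu]),
# i.e. the gradient of E_n = sum(alpha**(n-mu) * E_mu) under the
# latest-weights assumption.
import random

eta, alpha = 0.1, 0.9                          # learning rate, momentum coefficient
gs = [random.gauss(0, 1) for _ in range(50)]   # per-step gradients g_mu

# Momentum form: accumulate the recursion step by step.
dw, updates = 0.0, []
for g in gs:
    dw = -eta * g + alpha * dw
    updates.append(dw)

# Weighted-sum form: evaluate the unrolled sum directly at each step n.
for n in range(len(gs)):
    s = -eta * sum(alpha ** (n - mu) * gs[mu] for mu in range(n + 1))
    assert abs(s - updates[n]) < 1e-9
```

Both forms produce identical weight updates, which is why the momentum term can be read as an exponentially decaying memory of past error gradients.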

Original language: English
Pages (from-to): 1080-1086
Number of pages: 7
Journal: IEICE Transactions on Information and Systems
Volume: E78-D
Issue number: 8
Publication status: Published - Aug 1995


ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems
  • Software

Cite this

Analysis of momentum term in back-propagation. / Hagiwara, Masafumi; Sato, Akira.

In: IEICE Transactions on Information and Systems, Vol. E78-D, No. 8, 08.1995, p. 1080-1086.

Research output: Contribution to journal › Article

@article{1a82f21ab8c5467aa951243cefe87020,
title = "Analysis of momentum term in back-propagation",
abstract = "The back-propagation algorithm has been applied to many fields and has demonstrated the large capability of neural networks. Many people use the back-propagation algorithm together with a momentum term to accelerate its convergence. However, despite its importance for theoretical studies, the theoretical background of the momentum term has so far been unknown. First, this paper clearly explains the theoretical origin of the momentum term in the back-propagation algorithm, for both batch-mode learning and pattern-by-pattern learning. We prove that the back-propagation algorithm with a momentum term can be derived from the following two assumptions: 1) the cost function is $E_n = \sum_{{\mu=1}}^{{n}} \alpha^{{n-\mu}} E_\mu$, where $E_\mu$ is the sum of squared errors at the output layer at the $\mu$th learning step and $\alpha$ is the momentum coefficient; 2) the latest weights are assumed in calculating the cost function $E_n$. Next, we derive a simple relationship among the momentum coefficient, the learning rate, and the learning speed, and give further discussion with computer simulation.",
author = "Masafumi Hagiwara and Akira Sato",
year = "1995",
month = aug,
language = "English",
volume = "E78-D",
pages = "1080--1086",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "8",

}

TY - JOUR

T1 - Analysis of momentum term in back-propagation

AU - Hagiwara, Masafumi

AU - Sato, Akira

PY - 1995/8

Y1 - 1995/8

N2 - The back-propagation algorithm has been applied to many fields and has demonstrated the large capability of neural networks. Many people use the back-propagation algorithm together with a momentum term to accelerate its convergence. However, despite its importance for theoretical studies, the theoretical background of the momentum term has so far been unknown. First, this paper clearly explains the theoretical origin of the momentum term in the back-propagation algorithm, for both batch-mode learning and pattern-by-pattern learning. We prove that the back-propagation algorithm with a momentum term can be derived from the following two assumptions: 1) the cost function is E_n = Σ_{μ=1}^{n} α^{n−μ} E_μ, where E_μ is the sum of squared errors at the output layer at the μth learning step and α is the momentum coefficient; 2) the latest weights are assumed in calculating the cost function E_n. Next, we derive a simple relationship among the momentum coefficient, the learning rate, and the learning speed, and give further discussion with computer simulation.

AB - The back-propagation algorithm has been applied to many fields and has demonstrated the large capability of neural networks. Many people use the back-propagation algorithm together with a momentum term to accelerate its convergence. However, despite its importance for theoretical studies, the theoretical background of the momentum term has so far been unknown. First, this paper clearly explains the theoretical origin of the momentum term in the back-propagation algorithm, for both batch-mode learning and pattern-by-pattern learning. We prove that the back-propagation algorithm with a momentum term can be derived from the following two assumptions: 1) the cost function is E_n = Σ_{μ=1}^{n} α^{n−μ} E_μ, where E_μ is the sum of squared errors at the output layer at the μth learning step and α is the momentum coefficient; 2) the latest weights are assumed in calculating the cost function E_n. Next, we derive a simple relationship among the momentum coefficient, the learning rate, and the learning speed, and give further discussion with computer simulation.

UR - http://www.scopus.com/inward/record.url?scp=0029354921&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0029354921&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0029354921

VL - E78-D

SP - 1080

EP - 1086

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 8

ER -