In Jacobs (1988), the following heuristics are proposed to accelerate the convergence of on-line back-propagation learning:
(i) Every adjustable network parameter of the cost function should have its own learn ingrate parameter.
(ii) Every learning-rate parameter should be allowed to vary from one iteration to the next.
(iii) When the derivative of the cost function with respect to a synaptic weight has the same algebraic sign for several consecutive iterations of the algorithm, the learning-rate parameter for that particular weight should be increased.
(iv) When the algebraic sign of the derivative of the cost function with respect to a particular synaptic weight alternates for several consecutive iterations of the algorithm, the learning-rate parameter for that weight should be decreased.
These four heuristics satisfy the locality constraint of the back-propagation algorithm.
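As an illustration of how heuristics (i) through (iv) can be realized, the following sketch maintains one learning rate per weight and adapts it from the sign history of the gradient. It is a minimal NumPy sketch in the spirit of Jacobs' delta-bar-delta scheme; the additive increase `kappa`, multiplicative decrease `beta`, and the `patience` threshold are illustrative constants, not values fixed by Jacobs (1988).

```python
import numpy as np

def adaptive_step(w, grad, eta, prev_sign, agree_count,
                  kappa=0.01, beta=0.5, patience=3):
    """One on-line update with a per-weight learning rate (heuristics i-ii).

    eta, prev_sign, agree_count hold one entry per weight.
    kappa, beta, patience are illustrative, not from Jacobs (1988).
    """
    sign = np.sign(grad)
    same = (sign == prev_sign) & (sign != 0)

    # Count consecutive iterations over which the gradient sign is unchanged.
    agree_count = np.where(same, agree_count + 1, 0)

    # Heuristic (iii): a steady sign increases that weight's rate (additively).
    eta = np.where(agree_count >= patience, eta + kappa, eta)
    # Heuristic (iv): an alternating sign decreases that weight's rate
    # (multiplicatively).
    eta = np.where(~same & (prev_sign != 0), eta * beta, eta)

    w = w - eta * grad  # gradient step with per-weight rates
    return w, eta, sign, agree_count

# Example on a separable quadratic cost E(w) = 0.5 * sum(a * w**2),
# whose curvature differs strongly between the two weights.
a = np.array([1.0, 100.0])
w = np.array([1.0, 1.0])
eta = np.full(2, 0.01)
prev_sign = np.zeros(2)
count = np.zeros(2)
for _ in range(100):
    grad = a * w
    w, eta, prev_sign, count = adaptive_step(w, grad, eta, prev_sign, count)
```

The additive-increase/multiplicative-decrease asymmetry is a deliberate design choice: growth of a learning rate is gradual, while a single sign alternation cuts it back sharply, so a runaway rate cannot persist.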
(a) Use intuitive arguments to justify these four heuristics.
(b) The inclusion of a momentum term in the weight update of the back-propagation algorithm may be viewed as a mechanism for satisfying heuristics (iii) and (iv). Demonstrate the validity of this statement.
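One way to carry out the demonstration asked for in (b) is to unroll the momentum recursion; the notation below (momentum constant $\alpha$, learning rate $\eta$, cost $\mathcal{E}$) follows the usual back-propagation convention rather than symbols fixed by the exercise:

$$
\Delta w(n) = \alpha\,\Delta w(n-1) - \eta\,\frac{\partial \mathcal{E}(n)}{\partial w(n)}
\quad\Longrightarrow\quad
\Delta w(n) = -\eta \sum_{t=0}^{n} \alpha^{\,n-t}\,\frac{\partial \mathcal{E}(t)}{\partial w(t)},
\qquad 0 \le \alpha < 1 .
$$

When the partial derivative keeps the same algebraic sign over consecutive iterations, the terms of the exponentially weighted sum reinforce one another, so the effective step grows in magnitude, which is heuristic (iii). When the sign alternates, consecutive terms cancel, the sum shrinks, and the effective step is damped, which is heuristic (iv).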