<translate>

'''Almeida-Pineda recurrent backpropagation''' is an error-driven supervised learning algorithm for neural networks (see also the [https://en.wikipedia.org/wiki/Almeida%E2%80%93Pineda_recurrent_backpropagation Wikipedia article]). It was developed in 1987 by Luis B. Almeida<ref>Almeida, Luis B. (June 1987). "A learning rule for asynchronous perceptrons with feedback in a combinatorial environment". <em>Proceedings of the IEEE First International Conference on Neural Networks</em>.</ref> and Fernando J. Pineda.<ref>"Generalization of backpropagation to recurrent neural networks". In Anderson, Dana Z. <em>Neural Information Processing Systems</em>. Springer (1988). pp. 602-611. ISBN 978-0883185698.</ref><ref>Pineda, Fernando J. (1989). [http://authors.library.caltech.edu/13658/1/PINnc89.pdf "Recurrent backpropagation and the dynamical approach to adaptive neural computation"]. <em>Neural Computation</em> <strong>1</strong>: 161-172.</ref> Being ''supervised'' means that the desired outputs are known beforehand, and the task of the network is to learn to generate the desired outputs from the inputs.

As opposed to a [[Feedforward backpropagation|feedforward network]], a recurrent network is allowed to have connections from any neuron to any neuron in any direction.

== Model ==

[[File:ArtificialNeuronModel english.png|thumb|right|400px|Model of a neuron. <i>j</i> is the index of the neuron when there is more than one neuron. The activation function for backpropagation is sigmoidal. [https://www.researchgate.net/figure/A-back-propagation-neural-network-with-the-sigmoid-function-used-as-activation-function_fig5_44858457 Source]]]
[[File:Artificial_neural_network.jpg|thumb|right|250px|A feedforward network. In the Almeida-Pineda model, connections may go from any neuron to any neuron, backwards or forwards. [https://slideplayer.com/slide/13853243/ Source]]]

Given a set of k-dimensional inputs with values between 0 and 1, represented as a column vector:
  
<center><math>\vec{x} = [x_1, x_2, \cdots, x_k]^T</math></center>
  
 
and a nonlinear neuron with (initially random, uniformly distributed between -1 and 1) synaptic weights from the inputs:

<center><math>\vec{w} = [w_1, w_2, \cdots, w_k]^T</math></center>
  
then the output <math>y</math> of the neuron is defined as follows:

<center><math>y = \varphi \left ( n \right )</math></center>
  
where <math>\varphi \left ( \cdot \right )</math> is a sigmoidal function such as that used in ordinary [[feedforward backpropagation]] (we will use the logistic function from that page), and <math>n</math> is the net input of the neuron, calculated as follows. Assume <math>N</math> neurons, <math>k</math> of which are simple inputs to the network, and let <math>w_{ij}</math> be the weight of the connection from neuron <math>i</math> to neuron <math>j</math>. The net <math>n_j</math> of a non-input neuron <math>j</math> is computed using a discrete-time approximation to the following equation, applied iteratively to all neurons until the nets settle to an equilibrium state. Initially, <math>n_j</math> is set to 0 for all non-input neurons.
  
<center><math>\frac{\mathrm{d} n_j}{\mathrm{d} t} = -n_j + \sum_{i=1}^N w_{ij} \begin{cases}
y_i & \text{ if } i \text{ is not an input } \\
x_i & \text{ if } i \text{ is an input }
\end{cases}</math></center>
  

Note that if the weights between pairs of neurons are symmetric, that is, if <math>w_{ij} = w_{ji}</math>, then the network is guaranteed to settle to an equilibrium state.<ref>Hopfield, J. J. (May 1984). [http://www.pnas.org/content/81/10/3088.full.pdf "Neurons with graded response have collective computational properties like those of two-state neurons"]. <em>Proceedings of the National Academy of Sciences of the United States of America</em> <strong>81</strong>: 3088-3092.</ref> If the weights are not symmetric, the network will still often settle.<ref>"Deterministic Boltzmann learning in networks with asymmetric connectivity". In Touretzky, D. S.; Elman, J. L.; Sejnowski, T. J.; Hinton, G. E. <em>Connectionist Models: Proceedings of the 1990 Summer School</em>. Morgan Kaufmann Publishers (1991). pp. 3-9. ISBN 978-1558601567.</ref> Of course, if <math>i</math> is an input neuron, then <math>w_{ji}</math> does not exist.
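
In code, the settling phase amounts to a short relaxation loop. The following sketch is illustrative only: the function names, the step size <code>dt</code>, the tolerance, and the NumPy-based representation are our own choices, with the assumed convention that neurons <code>0..k-1</code> are the inputs and <code>w[i, j]</code> is the weight from neuron <code>i</code> to neuron <code>j</code>.

<syntaxhighlight lang="python">
import numpy as np

def logistic(n):
    """Sigmoidal activation function (the logistic function)."""
    return 1.0 / (1.0 + np.exp(-n))

def settle_activations(w, x, dt=0.1, tol=1e-6, max_steps=10000):
    """Relax the nets n_j of the non-input neurons to equilibrium.

    w -- N x N weight matrix, w[i, j] = weight from neuron i to neuron j
    x -- length-k vector of input values in [0, 1]; neurons 0..k-1 are the inputs
    Returns the settled nets and the corresponding activities (x for input neurons).
    """
    N, k = w.shape[0], len(x)
    nets = np.zeros(N)                   # initially n_j = 0 for all non-input neurons
    for _ in range(max_steps):
        activity = logistic(nets)
        activity[:k] = x                 # input neurons simply hold their input values
        # discrete-time step of  dn_j/dt = -n_j + sum_i w_ij * (y_i or x_i)
        dn = -nets + w.T @ activity
        dn[:k] = 0.0                     # inputs have no net of their own
        nets += dt * dn
        if np.max(np.abs(dn)) < tol:     # stop once the nets have settled
            break
    activity = logistic(nets)
    activity[:k] = x
    return nets, activity
</syntaxhighlight>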
  
Once the nets of the neurons are determined, an error phase is run to determine error terms for all neurons ''solely for the purpose of weight modification''. As above, these weight-modification error terms are computed using a discrete-time approximation to the following equation, applied iteratively to all neurons until the error terms settle to an equilibrium state. Initially, <math>e_j = 0</math> for all neurons.
  
<center><math>\begin{align}
\frac{\mathrm{d} e_j}{\mathrm{d} t} &= -e_j + \frac{\mathrm{d} \varphi \left ( n_j \right ) }{\mathrm{d} n_j} \sum_{i=1}^N w_{ij} e_i + J_j\\
&= -e_j + \varphi \left ( n_j \right ) \left ( 1 - \varphi \left ( n_j \right ) \right ) \sum_{i=1}^N w_{ij} e_i + J_j\\
&= -e_j + y_j \left ( 1 - y_j \right ) \sum_{i=1}^N w_{ij} e_i + J_j
\end{align}</math></center>
  
where <math>J_j</math> is an error term that is nonzero only for output neurons, which have targets <math>t_j</math>:
  
<center><math>J_j = t_j - y_j</math></center>
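
The error phase can be relaxed in the same way. The sketch below follows the equations above, under the same assumed conventions as the earlier code; <code>output_idx</code> and <code>targets</code> are illustrative arguments, not notation from this article.

<syntaxhighlight lang="python">
def settle_errors(w, activity, output_idx, targets, dt=0.1, tol=1e-6, max_steps=10000):
    """Relax the error terms e_j to equilibrium (used only for weight modification).

    activity   -- settled activities y_j (x_i for the input neurons)
    output_idx -- indices of the output neurons
    targets    -- desired outputs t_j for those neurons
    """
    y = activity
    J = np.zeros(len(y))
    J[output_idx] = targets - y[output_idx]   # J_j = t_j - y_j for outputs, 0 otherwise
    e = np.zeros(len(y))                      # initially e_j = 0 for all neurons
    for _ in range(max_steps):
        # discrete-time step of  de_j/dt = -e_j + y_j (1 - y_j) sum_i w_ij e_i + J_j
        de = -e + y * (1.0 - y) * (w.T @ e) + J
        e += dt * de
        if np.max(np.abs(de)) < tol:          # stop once the error terms have settled
            break
    return e
</syntaxhighlight>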
  
 
The weights are then updated according to the following equation:
  
<center><math>\Delta w_{ij} = \eta e_j y_i</math></center>
  
where <math>\eta</math> is some small learning rate.
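
Putting the pieces together, one full training step might look like the sketch below. The network size, the learning-rate value, and the indexing of inputs and outputs are assumptions made only for illustration.

<syntaxhighlight lang="python">
def train_step(w, x, output_idx, targets, eta=0.1):
    """One Almeida-Pineda step: settle the nets, settle the errors, update the weights."""
    k = len(x)
    nets, activity = settle_activations(w, x)
    e = settle_errors(w, activity, output_idx, targets)
    dw = eta * np.outer(activity, e)          # Delta w_ij = eta * e_j * y_i
    dw[:, :k] = 0.0                           # connections into input neurons do not exist
    return w + dw, activity

# Example: 2 input neurons, 2 hidden neurons, 1 output neuron (N = 5)
rng = np.random.default_rng(0)
N, k = 5, 2
w = rng.uniform(-1.0, 1.0, size=(N, N))      # weights initially uniform in [-1, 1]
w[:, :k] = 0.0                                # no connections into the input neurons
for _ in range(500):
    w, activity = train_step(w, x=np.array([0.0, 1.0]),
                             output_idx=np.array([N - 1]), targets=np.array([1.0]))
</syntaxhighlight>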
  
 
==Derivation==
 
  
The error terms <math>e_j</math> are considered estimates of <math>-\mathrm{d} E / \mathrm{d} n_j</math>, as they are in the derivation of the equations for [[feedforward backpropagation]]. At equilibrium the net is <math>n_j = \sum_i w_{ij} y_i</math> (with <math>x_i</math> in place of <math>y_i</math> for input neurons), so the partial derivative of the net with respect to one of its incoming weights is simply the presynaptic output, <math>\partial n_j / \partial w_{ij} = y_i</math>. The weight update then follows from gradient descent on the error:
  
<center><math>\begin{align}
\frac{\partial E }{\partial w_{ij}} &= \frac{\mathrm{d} E }{\mathrm{d} n_j} \frac{\partial n_j}{\partial w_{ij}} \\
&= - e_j y_i \\
\Delta w_{ij} &= - \eta \frac{\partial E}{\partial w_{ij}} \\
&= \eta e_j y_i
\end{align}</math></center>
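
For reference, setting <math>\mathrm{d} e_j / \mathrm{d} t = 0</math> in the error-phase equation above gives the fixed point that the relaxed error terms satisfy, which is the quantity the weight update actually uses:

<center><math>e_j = y_j \left ( 1 - y_j \right ) \sum_{i=1}^N w_{ij} e_i + J_j</math></center>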
  
 
==Objections==

While mathematically sound, the Almeida-Pineda model is biologically implausible, like [[feedforward backpropagation]], because the model requires that neurons communicate error terms backwards through connections for weight updates.
 
== References ==

<references/>

[[Category: Neural computational models]]
</translate>
