Almeida-Pineda recurrent backpropagation

From Eyewire
 

Revision as of 02:17, 14 April 2012

Almeida-Pineda recurrent backpropagation is an error-driven learning technique developed in 1987 by Luis B. Almeida[1] and Fernando J. Pineda.[2][3] It is a supervised learning technique, meaning that the desired outputs are known beforehand, and the task of the network is to learn to generate the desired outputs from the inputs.

As opposed to a feedforward network, a recurrent network is allowed to have connections from any neuron to any neuron in any direction.

Model

Model of a neuron. j is the index of the neuron when there is more than one neuron. The activation function for backpropagation is sigmoidal.

A feedforward network. In the Almeida-Pineda model, connections may go from any neuron to any neuron, backwards or forwards.


Given a k-dimensional input represented as a column vector:

<math>\vec{x} = [x_1, x_2, \cdots, x_k]^T</math>

and a nonlinear neuron with synaptic weights from the inputs (initially random, uniformly distributed between -1 and 1):

<math>\vec{w} = [w_1, w_2, \cdots, w_k]^T</math>

then the output <math>y</math> of the neuron is defined as follows:

<math>y = \varphi \left ( n \right )</math>

where <math>\varphi \left ( \cdot \right )</math> is a sigmoidal function such as that used in ordinary feedforward backpropagation (we will use the logistic function from that page), and <math>n</math> is the net input of the neuron. Assume <math>N</math> neurons, of which <math>k</math> are simple inputs to the network, and let <math>w_{ij}</math> be the weight of the connection from neuron <math>i</math> to neuron <math>j</math>. The net <math>n_j</math> of a non-input neuron <math>j</math> is computed using a discrete-time approximation to the following equation, applied iteratively to all neurons until the nets settle to an equilibrium state. Initially, set <math>n_j = 0</math> for all non-input neurons.

<math>\frac{\mathrm{d} n_j}{\mathrm{d} t} = -n_j + \sum_{i=1}^N w_{ij} \begin{cases} \varphi \left ( n_i \right ) & \text{ if } i \text{ is not an input } \\ x_i & \text{ if } i \text{ is an input } \end{cases}</math>
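The discrete-time approximation can be sketched as a simple Euler relaxation. This is an illustrative NumPy sketch, not from the original article; the step size `dt`, the tolerance `tol`, and the layout `W[i, j]` holding the weight w_ij are all assumptions:

```python
import numpy as np

def logistic(n):
    """Logistic sigmoid used as the activation function phi."""
    return 1.0 / (1.0 + np.exp(-n))

def settle_nets(W, x, dt=0.1, tol=1e-6, max_steps=10000):
    """Relax the nets n_j to equilibrium via Euler integration.

    W[i, j] is the weight from neuron i to neuron j; the first
    len(x) neurons are input neurons clamped to the values in x.
    """
    k, N = len(x), W.shape[0]
    n = np.zeros(N)        # nets; entries for input neurons are unused
    y = np.zeros(N)        # outputs of all neurons
    y[:k] = x              # input neurons simply output their input
    for _ in range(max_steps):
        y[k:] = logistic(n[k:])        # non-input outputs are phi(n_j)
        dn = -n[k:] + W[:, k:].T @ y   # dn_j/dt for each non-input j
        n[k:] += dt * dn
        if np.max(np.abs(dn)) < tol:   # nets have settled
            break
    return n, y
```

For a purely feedforward weight matrix this relaxation converges to the ordinary feedforward activations; with recurrent connections it iterates until the dynamics reach a fixed point.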

Note that if the weights between pairs of neurons are symmetric, that is, <math>w_{ij} = w_{ji}</math>, then the network is guaranteed to settle to an equilibrium state.[4] If the weights are not symmetric, convergence is not guaranteed, but in practice the network often settles anyway.[5] Of course, if <math>i</math> is an input neuron, then <math>w_{ji}</math> does not exist, since inputs receive no incoming connections.
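For example, a randomly initialized weight matrix can be symmetrized to obtain the convergence guarantee (an illustrative snippet, not part of the original model description; averaging with the transpose keeps the entries in [-1, 1]):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.uniform(-1.0, 1.0, size=(5, 5))  # initial weights uniform in [-1, 1]
W = (W + W.T) / 2.0                      # enforce w_ij = w_ji
```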

Once the nets of the neurons have settled, an error phase is run to determine error terms for all neurons, solely for the purpose of weight modification. As above, these error terms are computed using a discrete-time approximation to the following equation, applied iteratively to all neurons until the error terms settle to an equilibrium state. Initially, set <math>e_j = 0</math> for all non-input neurons.

<math>\begin{align}
\frac{\mathrm{d} e_j}{\mathrm{d} t} &= -e_j + \frac{\mathrm{d} \varphi \left ( n_j \right ) }{\mathrm{d} n_j} \sum_{i=1}^N w_{ij} e_i + J_j\\
&= -e_j + \varphi \left ( n_j \right ) \left ( 1 - \varphi \left ( n_j \right ) \right ) \sum_{i=1}^N w_{ij} e_i + J_j
\end{align}</math>

where <math>J_j</math> is an error term for neurons which are outputs and have targets <math>t_j</math> (for all other neurons, <math>J_j = 0</math>):

<math>J_j = t_j - \varphi \left ( n_j \right )</math>
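The error phase can be sketched analogously. In this hypothetical NumPy sketch, `out_idx` and `targets` mark which neurons are outputs and what their targets are, and `k` is the number of input neurons; none of these names appear in the original notation:

```python
import numpy as np

def logistic(n):
    """Logistic sigmoid used as the activation function phi."""
    return 1.0 / (1.0 + np.exp(-n))

def settle_errors(W, n, targets, out_idx, k, dt=0.1, tol=1e-6, max_steps=10000):
    """Relax the error terms e_j to equilibrium.

    W[i, j] is the weight from neuron i to neuron j, n holds the
    settled nets, and J_j = t_j - phi(n_j) is injected at the output
    neurons listed in out_idx.
    """
    N = W.shape[0]
    phi = logistic(n)
    dphi = phi * (1.0 - phi)             # phi'(n) for the logistic
    J = np.zeros(N)
    J[out_idx] = targets - phi[out_idx]  # error injected at outputs only
    e = np.zeros(N)                      # e_j starts at 0
    for _ in range(max_steps):
        de = -e[k:] + dphi[k:] * (W[:, k:].T @ e) + J[k:]
        e[k:] += dt * de
        if np.max(np.abs(de)) < tol:     # error terms have settled
            break
    return e
```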

The weights are then updated according to the following equations (compare to the derivation given for feedforward backpropagation):

<math>\begin{align}
\frac{\partial E }{\partial w_{ij}} &= - e_j \frac{\mathrm{d} y_j }{\mathrm{d} n_j} \frac{\partial n_j}{\partial w_{ij}} \\
&= - e_j \frac{\mathrm{d} \varphi }{\mathrm{d} n_j} y_i \\
&= - e_j \varphi \left ( n_j \right ) \left ( 1 - \varphi \left ( n_j \right ) \right ) y_i \\
\Delta w_{ij} &= - \eta \frac{\partial E}{\partial w_{ij}} \\
&= \eta e_j \varphi \left ( n_j \right ) \left ( 1 - \varphi \left ( n_j \right ) \right ) y_i
\end{align}</math>

where <math>\eta</math> is some small learning rate.
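The update amounts to an outer-product step. Below is a minimal sketch assuming the settled error terms e_j drive the update, i.e. Delta w_ij = eta * e_j * phi(n_j)(1 - phi(n_j)) * y_i, with `n` and `e` taken from the settled net and error phases and `y` the vector of neuron outputs (y_i = x_i for inputs):

```python
import numpy as np

def logistic(n):
    """Logistic sigmoid used as the activation function phi."""
    return 1.0 / (1.0 + np.exp(-n))

def update_weights(W, y, n, e, eta=0.1):
    """Apply Delta w_ij = eta * e_j * phi(n_j) * (1 - phi(n_j)) * y_i."""
    phi = logistic(n)
    delta = eta * np.outer(y, e * phi * (1.0 - phi))  # delta[i, j] = Dw_ij
    return W + delta
```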

Objections

While mathematically sound, the Almeida-Pineda model is biologically implausible since it uses a net activation and a separate output activation which are not simply related. In addition, like feedforward backpropagation, the model requires that neurons communicate backwards through connections for weight updates.

References

  1. Almeida, L. B. (1987). "A learning rule for asynchronous perceptrons with feedback in a combinatorial environment". Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA. Paywalled.
  2. Pineda, F. J. (1988). "Generalization of backpropagation to recurrent and higher order networks". In Anderson, D. Z. (ed.), Neural Information Processing Systems. American Institute of Physics.
  3. Pineda, F. J. (1987). "Generalization of back-propagation to recurrent neural networks". Physical Review Letters 59 (19): 2229-2232.
  4. Hopfield, J. J. (May 1984). "Neurons with graded response have collective computational properties like those of two-state neurons". Proceedings of the National Academy of Sciences of the United States of America 81: 3088-3092.
  5. "Deterministic Boltzmann learning in networks with asymmetric connectivity". In Touretzky, D. S.; Elman, J. L.; Sejnowski, T. J.; Hinton, G. E. (eds.), Connectionist Models: Proceedings of the 1990 Summer School. Morgan Kaufmann Publishers, 1991, pp. 3-9. ISBN 978-1558601567.