Conditional principal components analysis

From Eyewire
Revision as of 20:58, 23 June 2014 by DannyS


Conditional principal components analysis (CPCA) restricts each neuron to perform principal components analysis only when that neuron is activated.[1]

Model

Figure: Model of a neuron. Here j indexes the neuron when there is more than one. For a linear neuron, the activation function is absent (or is simply the identity function).

We use a set of linear neurons with binary inputs. Given a set of k-dimensional binary inputs represented as a column vector <math>\mathbf{x} = [x_1, x_2, \ldots, x_k]^T</math>, and a set of m linear neurons with (initially random) synaptic weights from the inputs, represented as a matrix formed by m weight column vectors (i.e. a k-row by m-column matrix):

<math>\mathbf{W} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ w_{k1} & w_{k2} & \cdots & w_{km} \end{bmatrix}</math>

where <math>w_{ij}</math> is the weight between input i and neuron j, the output of the set of neurons is defined as follows (but see also the section on contrast enhancement below):

<math>y_j = \sum_{i=1}^{k} x_i w_{ij}</math>

The CPCA learning rule specifies the weight update applied after an input pattern is presented:

<math>\Delta w_{ij} = \epsilon y_j (x_i - w_{ij})</math>

where <math>\epsilon</math> is the learning rate.

With a set of such neurons, typically a k-Winner-Takes-All pass is run before the update: all neurons are evaluated, and the <math>k</math> neurons with the highest outputs have their outputs set to 1, while the rest have their outputs set to 0.
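As a concrete sketch (an illustration, not code from the source), one training step can combine the linear outputs, a k-Winner-Takes-All pass, and the CPCA update <math>\Delta w_{ij} = \epsilon y_j (x_i - w_{ij})</math>; the function and parameter names below are made up for the example:

```python
import numpy as np

def cpca_step(W, x, lr=0.1, k=1):
    """One CPCA training step with a k-Winner-Takes-All pass.

    W  -- (n_inputs, n_neurons) weight matrix, entries in [0, 1]
    x  -- (n_inputs,) binary input pattern
    lr -- learning rate (epsilon in the text)
    k  -- number of winning neurons
    """
    y = W.T @ x                          # linear outputs, one per neuron
    y_bin = np.zeros_like(y)
    y_bin[np.argsort(y)[-k:]] = 1.0      # k highest outputs -> 1, rest -> 0
    # CPCA rule: dw_ij = lr * y_j * (x_i - w_ij); neurons with y_j = 0
    # leave their weights untouched.
    W += lr * (x[:, np.newaxis] - W) * y_bin[np.newaxis, :]
    return y_bin
```

Presenting the same pattern repeatedly drives the winning neuron's weights toward that pattern, since each update moves them a fraction lr of the remaining distance.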

Derivation

We want the weight from input i to neuron j to eventually settle at the probability that input i is active given that neuron j is active. That is, when the weight is at equilibrium, we have:

<math>w_{ij} = P(x_i = 1 \mid y_j = 1)</math>

By the definition of conditional probability:

<math>P(x_i = 1 \mid y_j = 1) = \frac{P(x_i = 1,\, y_j = 1)}{P(y_j = 1)}</math>

Using the law of total probability, we can condition the numerator and denominator on the input patterns. Letting t range over the input patterns, we have:

<math>P(x_i = 1,\, y_j = 1) = \sum_t P(x_i = 1,\, y_j = 1 \mid t)\, P(t)</math>

<math>P(y_j = 1) = \sum_t P(y_j = 1 \mid t)\, P(t)</math>

Substituting back into the equation for <math>w_{ij}</math> and rearranging, we get:

<math>w_{ij} = \frac{\sum_t P(x_i = 1,\, y_j = 1 \mid t)\, P(t)}{\sum_t P(y_j = 1 \mid t)\, P(t)}</math>

A reasonable assumption is that all input patterns are equally likely to appear, so that P(t) is a constant and cancels from numerator and denominator:

<math>w_{ij} = \frac{\sum_t P(x_i = 1,\, y_j = 1 \mid t)}{\sum_t P(y_j = 1 \mid t)}</math>

Since inputs and outputs are either 0 or 1, the average of an input, an output, or a product of the two over presentations of a pattern conveniently equals the probability of that quantity being 1. Thus:

<math>w_{ij} = \frac{\sum_t \langle x_i y_j \rangle_t}{\sum_t \langle y_j \rangle_t}</math>

We can turn this into an update rule whose equilibrium is the above condition:

<math>\Delta w_{ij} = \epsilon \left( x_i y_j - y_j w_{ij} \right) = \epsilon y_j \left( x_i - w_{ij} \right)</math>

Setting the expected update to zero and solving for <math>w_{ij}</math> recovers the equilibrium condition above.
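A quick numerical check of this equilibrium (a sketch of my own; the value 0.7 and the sampling scheme are invented for illustration): simulate binary x and y with a known <math>P(x = 1 \mid y = 1)</math> and confirm that the update <math>\Delta w = \epsilon y (x - w)</math> settles the weight near it.

```python
import numpy as np

# Simulate the CPCA update dw = lr * y * (x - w) with binary x and y
# sampled so that P(x = 1 | y = 1) = 0.7 (an arbitrary illustrative value);
# the weight should settle near that conditional probability.
rng = np.random.default_rng(42)
w, lr = 0.5, 0.01
for _ in range(20000):
    y = rng.random() < 0.5                   # neuron active half the time
    x = rng.random() < (0.7 if y else 0.2)   # input correlates with activity
    w += lr * float(y) * (float(x) - w)
```

With a small learning rate the weight fluctuates in a narrow band around the conditional probability rather than converging exactly, which is the asymptotic behaviour described in the Interpretation section.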

Interpretation

Since the inputs and outputs are binary, the update rule can be interpreted as follows:

  • If the output is not active, do not alter any weight.
  • If the output is active and an input is not active, subtract the weight (times a learning rate).
  • If the output is active and an input is also active, add 1 minus the weight (times a learning rate).

The second rule drives the weight towards zero (asymptotically), and the third rule drives the weight towards one (asymptotically). Overall, the rules combine to equilibrate the weight towards the probability that the input is active given that the output is active.

k-Winner Takes All

Without some form of competition between neurons fed the same input, neurons may tend to represent the same component of the input. To prevent this, k neurons out of all the neurons have their outputs set to one, and the rest have their outputs set to zero. The k-Winner Takes All algorithm chooses k neurons which have the highest outputs.
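A minimal kWTA pass (an illustrative sketch, not code from the source) can be written as:

```python
import numpy as np

def kwta(outputs, k):
    """k-Winner-Takes-All: set the k largest outputs to 1, the rest to 0.

    Ties at the threshold are broken arbitrarily by argsort order.
    """
    winners = np.argsort(outputs)[-k:]        # indices of the k highest outputs
    binary = np.zeros_like(outputs, dtype=float)
    binary[winners] = 1.0
    return binary
```

Because only the winners have nonzero output, only their weights change under the CPCA rule, which pushes different neurons toward different components of the input.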

Contrast enhancement

Since weights are initially randomized, some neurons may end up representing weak correlations while others represent strong correlations. In addition, because of the asymptotic nature of the weights, it is difficult for weights to become highly selective (i.e. to get near zero or near one). To fix this problem, the weights are passed through a sigmoid function during computation of the output:

<math>\hat{w}_{ij} = \frac{1}{1 + \left( \theta\, \frac{1 - w_{ij}}{w_{ij}} \right)^{\gamma}}</math>

where θ is a parameter indicating the center of the sigmoid (θ=1 sets it at 0.5, lower values set it lower, higher values set it higher), and γ is a parameter indicating how sharp the transition is.
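As a sketch (assuming the standard O'Reilly–Munakata weight-contrast sigmoid, ŵ = 1/(1 + (θ(1−w)/w)^γ), which matches the description of θ and γ here; the function name is my own):

```python
import numpy as np

def contrast_enhance(w, theta=1.0, gamma=6.0):
    """Sigmoid contrast enhancement of CPCA weights.

    theta shifts the centre of the sigmoid (theta=1 centres it at w=0.5;
    lower/higher theta moves the centre lower/higher), and gamma controls
    how sharp the transition is.
    """
    w = np.clip(w, 1e-6, 1 - 1e-6)   # avoid division by zero at w=0 or w=1
    return 1.0 / (1.0 + (theta * (1.0 - w) / w) ** gamma)
```

Weights a little above the centre are pushed towards one and weights a little below it towards zero, so the effective weights become selective even though the raw weights only approach zero and one asymptotically.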

References

  1. O'Reilly, Randall C.; Munakata, Yuko (2000). Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain. MIT Press. ISBN 978-0262650540.