# Conditional principal components analysis

Conditional principal components analysis (CPCA) restricts each neuron to performing principal components analysis only on the input patterns for which that neuron is activated.

## Model

Figure: Model of a neuron. j is the index of the neuron when there is more than one neuron. For a linear neuron, the activation function is not present (or is simply the identity function).

We use a set of linear neurons with binary inputs. Given a k-dimensional binary input pattern represented as a column vector

$$\mathbf{x} = [x_1, x_2, \ldots, x_k]^T,$$

and a set of m linear neurons with (initially random) synaptic weights from the inputs, represented as a matrix formed by m weight column vectors (i.e. a k row x m column matrix):

$$\mathbf{W} = [\mathbf{w}_1 \; \mathbf{w}_2 \; \cdots \; \mathbf{w}_m] = \begin{bmatrix} w_{11} & \cdots & w_{1m} \\ \vdots & \ddots & \vdots \\ w_{k1} & \cdots & w_{km} \end{bmatrix},$$

where $w_{ij}$ is the weight between input i and neuron j, the output of the set of neurons is defined as follows (but see also the section on contrast enhancement below):

$$\mathbf{y} = \mathbf{W}^T \mathbf{x}, \qquad y_j = \sum_{i=1}^{k} w_{ij} x_i$$
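
The output computation can be sketched in NumPy as follows. This is a minimal illustration that assumes the output is the plain product of the transposed weight matrix with the input vector (no contrast enhancement); the function name `linear_outputs` and the example values are hypothetical.

```python
import numpy as np

def linear_outputs(W: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Return y = W^T x for a (k, m) weight matrix and a length-k input."""
    return W.T @ x

# Hypothetical example: k = 4 inputs, m = 2 neurons.
W = np.array([[0.2, 0.9],
              [0.5, 0.1],
              [0.7, 0.3],
              [0.1, 0.6]])
x = np.array([1, 0, 1, 0])   # binary input pattern
y = linear_outputs(W, x)     # y[j] sums the weights coming from the active inputs
```

Because the inputs are binary, each output is simply the sum of the weights attached to the inputs that are switched on.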

The CPCA learning rule updates the weights after each input pattern is presented:

$$\Delta w_{ij} = \epsilon \, y_j (x_i - w_{ij})$$

where $\epsilon$ is the learning rate.
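
A minimal NumPy sketch of one update step is shown here. It assumes the rule has the form Δw_ij = ε · y_j · (x_i − w_ij), which is the form implied by the three cases in the Interpretation section; the function name `cpca_update` and the learning rate value are illustrative choices, not part of the original description.

```python
import numpy as np

def cpca_update(W, x, y, lr=0.1):
    """One assumed CPCA step: W[i, j] += lr * y[j] * (x[i] - W[i, j])."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Broadcast x down the rows and y across the columns of the (k, m) matrix.
    return W + lr * y[np.newaxis, :] * (x[:, np.newaxis] - W)

# Hypothetical example: 3 inputs, 2 neurons, only neuron 0 is active.
W = np.full((3, 2), 0.5)
x = [1, 0, 1]
y = [1, 0]
W_new = cpca_update(W, x, y, lr=0.1)
```

In this example the column of the inactive neuron is left untouched, while the weights of the active neuron move a fraction `lr` of the way towards the input pattern.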

With a set of such neurons, a k-Winner-Takes-All pass is typically run before the update: all neurons are evaluated, and the k neurons with the highest outputs have their outputs set to 1, while the rest have their outputs set to 0. (Here k is the number of winners, not the input dimension k used above.)

## Derivation

We want the weight from input i to neuron j to eventually settle at the probability that input i is active given that neuron j is activated. That is, when the weight is at equilibrium, we have:

$$w_{ij} = P(x_i = 1 \mid y_j = 1)$$

By the definition of conditional probability:

$$w_{ij} = \frac{P(x_i = 1, y_j = 1)}{P(y_j = 1)}$$

Using the total probability theorem, we can condition the numerator and denominator on the input patterns. If an input pattern is t, then we have:

$$P(x_i = 1, y_j = 1) = \sum_t P(x_i = 1, y_j = 1 \mid t) \, P(t), \qquad P(y_j = 1) = \sum_t P(y_j = 1 \mid t) \, P(t)$$

Substituting back into the equation for $w_{ij}$ and doing some rearrangement, we get:

$$\sum_t \left[ P(x_i = 1, y_j = 1 \mid t) - w_{ij} \, P(y_j = 1 \mid t) \right] P(t) = 0$$

A reasonable simplifying assumption is that all input patterns in the set of input patterns are equally likely to appear, so that P(t) is a constant and can be eliminated:

$$\sum_t \left[ P(x_i = 1, y_j = 1 \mid t) - w_{ij} \, P(y_j = 1 \mid t) \right] = 0$$

Since inputs and outputs are either 0 or 1, conveniently the average over all patterns of an input or output (or a combination of input and output) will be equal to the probability of that input or output (or combination of input and output) being 1. Thus, writing $\langle \cdot \rangle_t$ for the average over input patterns:

$$\langle x_i y_j \rangle_t - w_{ij} \langle y_j \rangle_t = \langle y_j (x_i - w_{ij}) \rangle_t = 0$$

We can easily turn this into an update rule which will drive the weights to the above equilibrium condition, using a small positive learning rate $\epsilon$:

$$\Delta w_{ij} = \epsilon \, y_j (x_i - w_{ij})$$
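
The equilibrium can be checked numerically with a small sketch. It assumes the update Δw = ε · y · (x − w) for a single weight and uses a made-up set of six equally likely (input, output) pairs; the output values are taken as given (for example, after a winner-takes-all pass) rather than computed from the weight.

```python
import numpy as np

# Hypothetical (x_i, y_j) pairs for one input and one neuron, one pair per pattern.
x = np.array([1, 1, 1, 0, 1, 0], dtype=float)
y = np.array([1, 1, 1, 1, 0, 0], dtype=float)

# Probability that the input is 1 given that the output is 1,
# estimated directly from the pattern set: <x*y> / <y>.
p_x_given_y = (x * y).sum() / y.sum()

w, lr = 0.5, 0.01
for _ in range(2000):                 # sweep repeatedly over the pattern set
    for xi, yj in zip(x, y):
        w += lr * yj * (xi - w)       # assumed CPCA update for a single weight

print(p_x_given_y, w)
```

With these values the directly estimated probability is 0.75, and the learned weight ends close to it, with a small ripple on the order of the learning rate because the patterns are presented one at a time.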

## Interpretation

Since the inputs and outputs are binary, the update rule can be interpreted as follows:

• If the output is not active, do not alter any weight.
• If the output is active and an input is not active, subtract the weight (times a learning rate).
• If the output is active and an input is also active, add 1 minus the weight (times a learning rate).

The second rule has the effect of driving the weight towards zero (asymptotically), and the third rule has the effect of driving the weight towards one (asymptotically). Overall, the rules combine to equilibrate the weight towards the probability that the input is active given that the output is active.
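
As a worked example, assume a current weight $w_{ij} = 0.4$ and a learning rate $\epsilon = 0.1$ (both values chosen only for illustration), with the update written as $\Delta w_{ij} = \epsilon \, y_j (x_i - w_{ij})$. The three cases then give:

$$
\begin{aligned}
y_j = 0: &\quad \Delta w_{ij} = 0, & w_{ij} \to 0.4 \\
y_j = 1,\ x_i = 0: &\quad \Delta w_{ij} = 0.1 \times (0 - 0.4) = -0.04, & w_{ij} \to 0.36 \\
y_j = 1,\ x_i = 1: &\quad \Delta w_{ij} = 0.1 \times (1 - 0.4) = 0.06, & w_{ij} \to 0.46
\end{aligned}
$$

Each step moves the weight by a fixed fraction of its remaining distance to the target (0 or 1), which is why the approach is asymptotic.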

## k-Winner Takes All

Without some form of competition between neurons fed the same input, neurons may tend to represent the same component of the input. To prevent this, the k-Winner Takes All algorithm chooses the k neurons with the highest outputs and sets their outputs to one; the outputs of all other neurons are set to zero.
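
A minimal sketch of this selection step in NumPy follows. The function name `k_winner_take_all` is hypothetical, and the handling of ties (left to the sort order) is an assumption, since the text does not specify it.

```python
import numpy as np

def k_winner_take_all(y, k):
    """Return a binary vector with 1 at the k largest entries of y, 0 elsewhere."""
    y = np.asarray(y, dtype=float)
    if not 1 <= k <= y.size:
        raise ValueError("k must be between 1 and the number of neurons")
    winners = np.argsort(y)[-k:]      # indices of the k highest outputs
    out = np.zeros_like(y)
    out[winners] = 1.0
    return out

# Hypothetical raw outputs of four neurons; keep the two strongest.
binary_y = k_winner_take_all([0.9, 1.2, 0.3, 1.0], k=2)
```

Here neurons 1 and 3 have the two largest raw outputs, so only their entries are set to one.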

## Contrast enhancement

Since weights are initially randomized, some neurons may end up representing weak correlations while others represent strong correlations. In addition, because of the asymptotic nature of the weights, it is difficult for weights to become highly selective (i.e. to get near zero or near one). To fix this problem, the weights are passed through a sigmoid function during computation of the output:

$$\hat{w}_{ij} = \frac{1}{1 + \left( \dfrac{w_{ij}}{\theta \, (1 - w_{ij})} \right)^{-\gamma}}$$

where θ is a parameter indicating the center of the sigmoid (θ=1 sets it at 0.5, lower values set it lower, higher values set it higher), and γ is a parameter indicating how sharp the transition is. The enhanced weights $\hat{w}_{ij}$ are used in place of $w_{ij}$ when the output is computed.
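
A sketch of such a contrast enhancement function is given below. The exact functional form is an assumption: it uses the sigmoid 1 / (1 + (w / (θ(1 − w)))^(−γ)), which matches the described behaviour of θ and γ. The name `contrast_enhance`, the default γ = 6 and the clipping constant are illustrative choices.

```python
import numpy as np

def contrast_enhance(w, theta=1.0, gamma=6.0):
    """Assumed sigmoidal contrast enhancement for weights in the range 0..1."""
    # Keep weights strictly inside (0, 1) so the ratio below is always defined.
    w = np.clip(np.asarray(w, dtype=float), 1e-6, 1.0 - 1e-6)
    return 1.0 / (1.0 + (w / (theta * (1.0 - w))) ** (-gamma))

# Hypothetical weights: values below the center are pushed down, values above are pushed up.
w_hat = contrast_enhance(np.array([0.3, 0.5, 0.7]))
```

Under this form the sigmoid is centered at w = θ / (1 + θ), so θ = 1 leaves a weight of 0.5 unchanged, while larger γ makes the transition around the center steeper.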