This slideshow tutorial is Appendix 1 of Shultz and Bale (2001). If you have not yet seen the more general introduction, A Tutorial on Cascade-correlation, it is recommended that you watch that slideshow before proceeding with this one.
Encoder networks can implement a kind of recognition memory, as is appropriate for simulating habituation and familiarization studies. If an encoder network can learn to encode a stimulus onto a small number of hidden units and then decode this hidden-unit representation onto its output units with very little error, then it has recognized the stimulus as familiar. The essential change from ordinary cascade-correlation (CC) is the elimination of all direct input-to-output connection weights. If such direct connections were retained, learning an encoder problem would be trivial, requiring only a weight of 1.0 between each input unit and its corresponding output unit. With such a trivial solution, a network learns nothing useful that could enable phenomena such as pattern completion or prototype abstraction.
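The trivial solution can be made concrete with a minimal sketch. The four one-hot patterns and the linear output units are assumptions made here for illustration; they are not taken from Shultz and Bale (2001).

```python
import numpy as np

# Four hypothetical one-hot stimulus patterns (one pattern per row).
patterns = np.eye(4)

# Direct input-to-output weights: 1.0 from each input unit to its
# corresponding output unit, 0.0 everywhere else.
direct_weights = np.eye(4)

# Linear output units are assumed here for simplicity.
outputs = patterns @ direct_weights

# Encoder error is the discrepancy between input and output vectors.
error = float(np.sum((outputs - patterns) ** 2))
print(error)  # 0.0
```

The error is zero without any hidden-unit representation having been formed, which is why the encoder option removes these connections.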
As usual, CC training begins in an output phase with no hidden units. But with the encoder option, the only available weights are those from the bias unit. Trainable connection weights are drawn in this slideshow as dashed arrows. Initially, the weights have random values, so performance is random. Weights are adjusted to reduce the discrepancy (error) between the input and output vectors. Error reduction typically stagnates quickly in this first output phase with the encoder option because the network takes no account of variation in input patterns. The bias unit always has an input activation of 1.0, regardless of the input pattern being presented. With trainable connection weights to all downstream units, the bias unit implements a learnable resting activation level for each downstream unit.
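A minimal sketch of this bias-only output phase follows. It assumes four hypothetical one-hot patterns, linear output units, and plain gradient descent on the sum of squared errors; the weight-update rule actually used in CC is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

patterns = np.eye(4)      # hypothetical inputs; the targets equal the inputs
bias_activation = 1.0     # the bias unit is always 1.0, whatever the pattern
bias_weights = rng.normal(0.0, 0.5, size=4)  # one weight per output unit
learning_rate = 0.05      # assumed value, chosen for stable convergence

errors = []
for epoch in range(200):
    # Every pattern yields the same output because only the bias feeds in.
    outputs = np.tile(bias_activation * bias_weights, (len(patterns), 1))
    residual = outputs - patterns
    errors.append(float(np.sum(residual ** 2)))
    # Gradient of the summed squared error with respect to each bias weight.
    bias_weights -= learning_rate * 2.0 * residual.sum(axis=0)

print(errors[0], errors[50], errors[-1])
print(bias_weights)
```

Under these assumptions each bias weight settles at 0.25, the mean target activation of its output unit, and the error stops falling at 3.0. No further progress is possible until something in the network responds to the input pattern.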
Apart from the absence of direct input-output connections, training proceeds as in normal cascade-correlation. When error reduction stagnates, a hidden unit is recruited. When the first hidden unit is added, its input weights are frozen (drawn as solid arrows), and training of the output weights resumes. The network grows as it learns.
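The effect of installing one hidden unit with frozen input weights can be sketched as follows. Several things here are assumptions for illustration: the one-hot patterns, the sigmoid hidden unit, the linear outputs, the random values standing in for the result of candidate training (which is not shown), and the use of a least-squares fit in place of iterative output-weight training.

```python
import numpy as np

rng = np.random.default_rng(1)
patterns = np.eye(4)                  # hypothetical inputs; targets equal inputs
bias = np.ones((len(patterns), 1))    # bias activation of 1.0 for every pattern

def fit_output_weights(features, targets):
    """Least-squares output weights and the resulting summed squared error."""
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    error = float(np.sum((features @ weights - targets) ** 2))
    return weights, error

# Output phase with bias weights only.
_, error_bias_only = fit_output_weights(bias, patterns)

# Recruited hidden unit: weights from the bias (index 0) and the four inputs.
# Random values stand in for trained candidate weights; they are now frozen.
hidden_input_weights = rng.normal(0.0, 1.0, size=5)
frozen_copy = hidden_input_weights.copy()
net_input = np.hstack([bias, patterns]) @ hidden_input_weights
hidden_activation = 1.0 / (1.0 + np.exp(-net_input))

# Output training resumes: outputs now see the bias and the hidden unit,
# but still have no direct connection from the inputs.
features = np.hstack([bias, hidden_activation[:, None]])
_, error_with_hidden = fit_output_weights(features, patterns)

print(error_bias_only, error_with_hidden)
```

Only the output-side weights change after recruitment; the hidden unit's input-side weights stay exactly as they were when the unit was installed.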
A second hidden unit, if required, is installed downstream of the first hidden unit. After each hidden unit is recruited, training of output weights resumes. With a relatively small number of hidden units, an encoder network is forced to achieve a relatively compact, and thus abstract, representation of the inputs. Inputs are encoded onto this abstract hidden-unit representation using input-side weights. The hidden unit representation is then decoded onto the output units using output-side weights. Because the discrepancy between input and output activations constitutes network error, there is a sense in which encoder networks do not require any external feedback other than the training inputs.
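The encode-then-decode pass through a cascaded encoder can be sketched as below. The function name `encoder_forward`, the weight layout, the sigmoid hidden units, the linear outputs, and the random weights are all assumptions made for illustration, not details taken from the source.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encoder_forward(x, hidden_weights, output_weights):
    """Forward pass with cascaded hidden units and no input-output weights.

    hidden_weights[k] has length 1 + len(x) + k: bias, inputs, then the
    k hidden units installed earlier (the cascade).
    output_weights has shape (1 + number of hidden units, len(x)).
    """
    hidden = []
    for weights in hidden_weights:
        incoming = np.concatenate(([1.0], x, hidden))  # bias, inputs, earlier hiddens
        hidden.append(sigmoid(incoming @ weights))     # encode
    code = np.concatenate(([1.0], hidden))             # bias plus hidden representation
    return code @ output_weights                       # decode

rng = np.random.default_rng(2)
n_inputs = 4
hidden_weights = [rng.normal(size=1 + n_inputs),       # first hidden unit
                  rng.normal(size=1 + n_inputs + 1)]   # second, downstream of the first
output_weights = rng.normal(size=(1 + 2, n_inputs))

stimulus = np.array([1.0, 0.0, 0.0, 0.0])
reconstruction = encoder_forward(stimulus, hidden_weights, output_weights)

# The training signal is just the input itself: no external target is needed.
error = float(np.sum((reconstruction - stimulus) ** 2))
print(reconstruction.shape, error)
```

In a trained network, a low value of `error` for a stimulus would correspond to that stimulus being recognized as familiar; with the untrained random weights used here the value carries no such meaning.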
Shultz, T. R., & Bale, A. C. (2001). Neural network simulation of infant familiarization to artificial sentences: Rule-like behavior without explicit rules and variables. Infancy, 2, 501-536.