Neural Turing Machines

> We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is differentiable end-to-end, allowing it to be efficiently trained with gradient descent. Preliminary results demonstrate that Neural Turing Machines can infer simple algorithms such as copying, sorting, and associative recall from input and output examples.

The core idea is to add a large, addressable memory to a standard recurrent neural network. This acts as a "working memory" more than long-term storage, and the network learns how to use the memory through training rather than following a fixed set of procedures.
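
A minimal sketch of that coupling, with a toy tanh controller and sizes I picked for illustration (none of these names come from the paper): the controller receives the current input concatenated with whatever the read head returned on the previous step, so every memory interaction stays inside the differentiable computation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes of my own choosing (not the paper's): N rows of width M.
N, M, X, H = 16, 8, 4, 32

memory = 0.01 * rng.normal(size=(N, M))   # the external memory matrix
W = 0.1 * rng.normal(size=(H, X + M))     # toy controller weights

def controller_step(x, read_vec):
    """The controller sees the input concatenated with last step's read."""
    return np.tanh(W @ np.concatenate([x, read_vec]))

read_vec = np.zeros(M)                     # nothing has been read yet
for x in rng.normal(size=(5, X)):          # a dummy input sequence
    h = controller_step(x, read_vec)
    # In the full model, h would parameterise the read/write heads;
    # here we read with uniform attention just to show the coupling.
    read_vec = np.full(N, 1.0 / N) @ memory
```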

The read/write operations are "blurry": each one interacts, to varying degrees, with every location in memory. The degree of blurriness is set by an attentional "focus" mechanism, which lets a head attend sharply to a few locations while ignoring the rest.
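
Concretely, the weighting/erase/add structure below follows the paper's read and write equations; the surrounding toy setup is my own. A head emits a normalised weighting w over the N rows; a read returns the weighted sum of rows, and a write nudges every row in proportion to its weight.

```python
import numpy as np

def read(memory, w):
    """Blurry read: r = sum_i w[i] * memory[i], so every row
    contributes in proportion to its attention weight."""
    return w @ memory

def write(memory, w, erase, add):
    """Blurry write, per the paper: erase then add.
    erase and add are M-dim vectors emitted by the write head."""
    memory = memory * (1 - np.outer(w, erase))  # M~[i] = M[i] * (1 - w[i] * e)
    return memory + np.outer(w, add)            # M[i]  = M~[i] + w[i] * a

# A sharp (one-hot) weighting behaves like ordinary addressing;
# a spread-out weighting touches many rows a little.
N, M = 8, 4
memory = np.zeros((N, M))
w = np.zeros(N); w[3] = 1.0
memory = write(memory, w, erase=np.zeros(M), add=np.ones(M))
print(read(memory, w))   # -> [1. 1. 1. 1.]
```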

Memory addressing ("focusing") is done by content (the controller produces an approximation of a stored value, which is compared against memory to yield the exact stored value) and by location (needed for math-like problems: variables x and y can hold arbitrary values, so they must be stored at, and recalled from, definite locations rather than found by content).
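
Both modes can be sketched as follows. The cosine-similarity softmax with key k and sharpness β is the paper's content-addressing formula; for location addressing the paper uses a soft convolutional shift, for which a hard `np.roll` stands in here, and the toy memory is my own.

```python
import numpy as np

def content_address(memory, key, beta):
    """Focus by content: softmax over cosine similarity between
    the emitted key and each memory row, sharpened by beta."""
    sims = memory @ key / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    e = np.exp(beta * sims)
    return e / e.sum()

def shift_address(w_prev, shift):
    """Focus by location: rotate the previous weighting by a shift
    (the paper's version is a differentiable soft shift)."""
    return np.roll(w_prev, shift)

memory = np.eye(4, 3)                               # 4 rows of width 3
w = content_address(memory, key=np.array([1.0, 0, 0]), beta=10.0)
print(np.round(w, 2))                               # peaks at row 0
print(np.round(shift_address(w, 1), 2))             # same focus, one row down
```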