Jan 13, 2010

Learning object recognition by a self organising process

This is what I researched for my second lab rotation for the MSc, under the supervision of Dr. Simon Stringer. It was a computational project aimed at a proof of principle: that the independent motion of objects in natural scenes allows us to learn them as individual objects (that is, to form separate representations of them in our brain), even if they are always seen together.

I’ll briefly describe how the computational model works. We have two sets of arrays, an input and an output, in which each element represents a neuron and the element’s value represents its firing rate (between 0 and 1). Each input array is a ring of 75 neurons, and there are two of these, one for each object. A block of 20 contiguous neurons in each ring is switched on to represent the object. This packet of activity can be moved around the ring to simulate different views of the same object.
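As a rough sketch of the input described above (not the code used in the project), each ring can be built as a NumPy array with a wrapped block of active neurons. The ring size and block size come from the description; the function name `ring_input`, the use of a firing rate of exactly 1.0 for "switched on" neurons, and the concatenation of the two rings into one vector are assumptions made for illustration.

```python
import numpy as np

N_INPUT = 75      # neurons per input ring (from the description)
BLOCK_SIZE = 20   # contiguous active neurons representing the object


def ring_input(position, n=N_INPUT, block=BLOCK_SIZE):
    """Firing rates for one ring with an active block starting at `position`."""
    rates = np.zeros(n)
    # Modulo indexing makes the block wrap around the end of the ring.
    idx = (position + np.arange(block)) % n
    rates[idx] = 1.0  # assumption: "switched on" means a firing rate of 1
    return rates


# Two rings, one per object. Joining them into one vector is an assumption.
x = np.concatenate([ring_input(0), ring_input(70)])
```

Changing the `position` argument moves the packet of activity around the ring, which stands in for a different view of the same object.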

These input arrays are connected to the output array, a 50×50 sheet of neurons, such that each input neuron is connected to every output neuron. The strength of each connection is described by its synaptic weight (a synapse is a connection between two neurons). The synaptic weight between one input and one output neuron evolves according to the activity of those two neurons. If both neurons have a high firing rate, the weight increases a lot. If one or both have a low firing rate, the weight decreases. So the weight only increases if the input neuron actively drives the output neuron.
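The exact learning rule isn't given here, so the following is only one simple rule that behaves as described: the weight grows when the product of the two firing rates is high and shrinks otherwise. The function name `update_weights`, the learning rate `lr`, the threshold `theta`, and the clipping of weights to the range 0 to 1 are all assumptions for illustration.

```python
import numpy as np


def update_weights(w, pre, post, lr=0.1, theta=0.25):
    """One Hebbian-style update (illustrative rule, not the project's).

    w:    (n_out, n_in) synaptic weights
    pre:  (n_in,)  input firing rates in [0, 1]
    post: (n_out,) output firing rates in [0, 1]
    """
    # pre * post exceeds theta only when both neurons fire strongly,
    # so only those weights increase; all others decrease.
    dw = lr * (np.outer(post, pre) - theta)
    return np.clip(w + dw, 0.0, 1.0)  # assumption: weights stay in [0, 1]


w = np.full((2, 2), 0.5)
pre = np.array([1.0, 0.0])    # input 0 active, input 1 silent
post = np.array([1.0, 0.0])   # output 0 active, output 1 silent
w_new = update_weights(w, pre, post)
```

In this toy case only the weight from the active input to the active output goes up; the other three go down.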

Furthermore, there are local excitatory connections in the output layer such that neurons near each other help to drive each other. This means that if an input neuron actively drives an output neuron then it will likely also drive other nearby output neurons. Consequently, its synaptic weight to these other neurons will also likely increase.
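To make the local excitation concrete, here is a minimal sketch in which each output neuron receives extra drive from its neighbours, weighted by a Gaussian of their distance on the sheet. The Gaussian shape, the values of `sigma` and `radius`, the absence of self-excitation, and the wrap-around at the sheet edges are all assumptions; the description only says that nearby neurons help to drive each other.

```python
import numpy as np

SHEET = 50  # the output layer is a SHEET x SHEET grid (from the description)


def lateral_excitation(rates, sigma=2.0, radius=4):
    """Sum Gaussian-weighted activity from neighbours within `radius`."""
    drive = np.zeros_like(rates)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # assumption: no self-excitation in this sketch
            weight = np.exp(-(dx ** 2 + dy ** 2) / (2 * sigma ** 2))
            # np.roll wraps at the edges, so the sheet is treated as a torus
            # (an assumption made to keep the sketch short).
            drive += weight * np.roll(rates, shift=(dy, dx), axis=(0, 1))
    return drive


rates = np.zeros((SHEET, SHEET))
rates[25, 25] = 1.0              # a single active output neuron
drive = lateral_excitation(rates)
```

The single active neuron pushes its close neighbours hardest, and the push falls off with distance until it reaches zero beyond `radius`.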

How do we train the output layer to recognise the objects? We present the objects to the output layer, via the input neurons, many times and in many different positions on the ring, which gives the synaptic weights plenty of opportunity to evolve. Different parts of the output layer then eventually respond only to their “preferred” positions on the ring. This is because the output layer also has global inhibition - that is, the more one neuron fires, the harder it is for another to fire. This drives competition between neurons so that they don’t all respond to the same input, which would be redundant. This type of competition is present in real brains. If it weren’t, we would only be able to form a single memory, which isn’t very useful!
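One simple way to picture global inhibition is to subtract from every neuron an amount proportional to the average activity of the whole layer, so that strongly driven neurons suppress weakly driven ones. This subtractive form, the function name `compete`, and the `inhibition` strength are assumptions for illustration; the model's actual inhibition scheme isn't specified here.

```python
import numpy as np


def compete(activation, inhibition=1.0):
    """Apply global inhibition, then keep firing rates within [0, 1]."""
    # Every neuron is inhibited by the same amount, set by the layer's
    # mean activation: the more the layer fires, the harder it is to fire.
    inhibited = activation - inhibition * activation.mean()
    return np.clip(inhibited, 0.0, 1.0)


a = np.array([0.9, 0.5, 0.2, 0.1])  # drive to four output neurons
r = compete(a)
```

After inhibition only the most strongly driven neurons remain active, which is the competition that stops every neuron from responding to the same input.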

The local excitatory connections provide a basis for self organisation - that is, neurons near each other respond similarly (to similar inputs), while those further apart respond to different inputs (in this case, different “views” of the object).

Now, what is interesting is that if we move both objects around their respective rings in exactly the same way, allowing the synaptic weights to evolve, the output layer develops exactly the same response for one object as for the other. What does this mean? Let’s say we test the output layer by presenting it with just one of the two objects (remember, both are presented when we train the output layer). How can we tell which object is being presented if the output neurons respond in the same way to this object as to the other? Very simply, we can’t. The output layer thinks that both objects are the same object.

Now, if we train the output layer on objects that do not move around their rings in the same way, then we reduce the symmetry between the objects. They become “statistically decoupled” from each other from the point of view of the output layer. As such, the output layer is able to form one set of responses for one object and another set of responses for the other object. Now, when we come to test the output layer with just one object, we can tell which object is being presented based on the pattern of activity in the output layer. And so the output layer has learnt to recognise each individual object.
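The difference between the two training regimes can be illustrated by counting how many distinct combinations of views the output layer gets to see. In this sketch positions are drawn at random at each step, which is an assumption (the objects could equally sweep around the ring); the helper name `distinct_pairings` and the number of steps are also made up for illustration.

```python
import numpy as np

N_INPUT = 75   # positions on each ring (from the description)
STEPS = 1000   # number of training presentations (assumption)

rng = np.random.default_rng(0)

pos_a = rng.integers(0, N_INPUT, size=STEPS)
# Coupled: object B always moves exactly as object A does.
pos_b_coupled = pos_a.copy()
# Decoupled: object B moves independently of object A.
pos_b_indep = rng.integers(0, N_INPUT, size=STEPS)


def distinct_pairings(a, b):
    """Number of distinct (view of A, view of B) combinations presented."""
    return len(set(zip(a.tolist(), b.tolist())))


n_coupled = distinct_pairings(pos_a, pos_b_coupled)
n_indep = distinct_pairings(pos_a, pos_b_indep)
```

With coupled motion each view of one object is only ever paired with a single view of the other, so there can be at most 75 combinations and nothing in the input distinguishes the two objects. With independent motion the same view of one object appears alongside many views of the other, which is what lets the output layer separate them.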

This phenomenon can help us to understand many other behavioural attributes that rely on vision - in fact, it applies to all sensory modalities (touch, sound, taste, smell), as the important aspect of the inputs is that they’re statistically decoupled.

My supervisor and I are hoping to publish these results as a proof of principle for other, more applied work going on in the lab.

Written by admin in: Uncategorized |

