Task-Driven Learning of Contour Integration Responses in a V1 Model
December 16, 2020
December 12, 2020
Salman Khan, University of Waterloo
SVRHM Workshop 2020
PRESENTER: Salman Khan from the University of Waterloo. A group of theoretical neuroscience, I believe, as well. And not just his paper but the other four papers and the next presenter-- not only high scoring papers, but all papers at least four to 9 from the three reviews. So once again, Salman Khan, Alexander Wong, Bryan Tripp.
SALMAN KHAN: Hi, everyone. And thank you for giving me this opportunity. So my name is Salman Khan, and today I will be presenting our work on task-driven learning of contour integration responses in a V1 model. OK. Recently, deep neural networks have surpassed human-level performance on several high-level visual tasks. However, the brain's visual system is much more robust.
One key difference is that in difficult viewing conditions, the brain uses a variety of contextual modulation techniques to augment its core feed-forward signal. One such technique is contour integration, which the brain uses to pop out smooth contours from the background. Now contour integration has been known about for a long time. Many properties have been discovered, and many models have been proposed in the literature. However, apart from some very recent work, for example the hGRU and the gamma-net models, many of these investigations have been done in isolation and using targeted synthetic stimuli. Similar to this line of work, we focus on the role of contour integration in high-level visual tasks. However, different from their work, we focus more on replicating neurophysiological properties of contour integration and on how brain-like contour integration can be learned from our natural viewing environment.
And now, before contour integration can be recommended as a general robustness improving technique, it needs to be completely understood. And as a step in this direction, we propose to embed a contour integration model inside an artificial neural network and investigate how it impacts high level visual tasks.
So what exactly is contour integration? Contour integration was first observed psychophysically as the popping out of small line segments that followed smooth trajectories in the presence of distractors. For example, in the top figure, centrally located line segments are arranged in a straight line. These line segments are visually more salient than the other line segments, which are otherwise identical. This pop-out effect is also observed for curved contours, as shown in the bottom figure. Here, the egg shape sticks out from the background.
Next, it was observed neurophysiologically that neurons in the V1 cortex whose receptive fields overlapped a line fragment showed enhanced responses. Now, the individual receptive fields of these neurons were only big enough to include a single fragment, and the actual contour extended well outside this area.
An important set of properties of contour integration was discovered by Li et al. in a previous study. Here, they simultaneously measured behavioral responses and responses of V1 neurons to contours embedded in a sea of distractors. Two parameters were varied: contour length and the spacing between fragments. They found a high degree of correlation between behavioral and neurophysiological responses. So as length increased, both responses increased, while as the spacing between fragments increased, both responses decreased. The goal of our work is to see if our model learns these same properties when trained on high-level visual tasks.
OK. So the model we use. Here's a high-level block diagram of our model. The most important part is the contour integration layer. It sits on top of a pretrained edge extraction layer and models lateral interactions between component nodes. So in the brain, in addition to feed-forward connections, there are many lateral and feedback connections. It is thought that these secondary connections are responsible for contour integration. And with this layer, we tried to model these lateral interactions.
To build this model, we took an existing model of contour integration, modified it to work in a neural network framework, and embedded it into a larger task-driven network. We included all the connections from the original model but allowed their parameters to be learned, especially the lateral connection structures.
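As a rough illustration of what a lateral interaction between nodes can do, here is a toy sketch. It is not the model from the talk: the 1-D layout, the fixed excitatory `kernel`, the leaky update rule, and the names `lateral_step` and `run_lateral` are all assumptions made for this example. It only shows how support from active neighbours can raise a node's response above its response to an isolated fragment.

```python
import numpy as np

def lateral_step(x, ff_input, kernel, rate=0.5):
    """One update of a toy 1-D lateral-interaction layer (hypothetical)."""
    # Lateral drive: each node sums its neighbours' activity, weighted by the kernel.
    lateral = np.convolve(x, kernel, mode="same")
    # Leaky update toward the rectified sum of feed-forward and lateral drive.
    target = np.maximum(ff_input + lateral, 0.0)
    return (1.0 - rate) * x + rate * target

def run_lateral(ff_input, kernel, n_steps=5, rate=0.5):
    """Iterate the layer for a few steps, starting from zero activity."""
    x = np.zeros_like(ff_input, dtype=float)
    for _ in range(n_steps):
        x = lateral_step(x, ff_input, kernel, rate)
    return x

# Assumed kernel: zero centre, so a node is supported by neighbours, not by itself.
# In a learned model these weights would be trained rather than fixed.
kernel = np.array([0.2, 0.2, 0.0, 0.2, 0.2])

isolated = np.zeros(11)
isolated[5] = 1.0          # a single fragment
contour = np.zeros(11)
contour[3:8] = 1.0         # five adjacent fragments

resp_isolated = run_lateral(isolated, kernel)[5]
resp_contour = run_lateral(contour, kernel)[5]
print(resp_isolated, resp_contour)  # the contour response is the larger one
```

The kernel weights sum to less than one, which keeps this toy recurrence bounded.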
The output of the contour integration layer is passed to classification blocks, which map it to the desired label dimensions. We compared the model with a parameter-matched feed-forward control, which consisted of identical convolutional layers, but which were sequentially arranged in a feed-forward manner.
Finally, for each considered task, neurophysiological results were measured at the output of the contour integration or control layer. And behavioral performance was measured at the output of the whole model.
So as the first task, we used stimuli typically used to investigate contour integration. These consisted of contour fragments embedded in a sea of identical distractors. The orientations of contour fragments were aligned to form a smooth contour, while the remaining background fragments had random orientations. Embedded contours differed in their locations, orientations, lengths, inter-fragment rotations, and the Gabor parameters used to construct fragments.
The model was tasked with identifying which fragments were contour fragments and which were background fragments. In total, the data set contained 64,000 training images and 6,400 validation images. And some example stimuli are shown in the figure.
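To make the stimulus and label structure concrete, here is a minimal sketch under simplifying assumptions: fragments sit on a regular grid, only a horizontal straight contour through the centre is generated, and each fragment is represented by its orientation only, without rendering Gabors. The function name `make_contour_stimulus` and its parameters are hypothetical and not taken from the talk.

```python
import numpy as np

def make_contour_stimulus(grid_size=9, contour_len=5, seed=0):
    """Return per-fragment orientations (degrees) and contour/background labels."""
    rng = np.random.default_rng(seed)
    # Background fragments: random orientations in [0, 180) degrees.
    orientations = rng.uniform(0.0, 180.0, size=(grid_size, grid_size))
    labels = np.zeros((grid_size, grid_size), dtype=int)

    # Contour fragments: a centred horizontal run along the middle row.
    centre = grid_size // 2
    start = centre - contour_len // 2
    cols = np.arange(start, start + contour_len)
    orientations[centre, cols] = 0.0   # 0 degrees, aligned with the row
    labels[centre, cols] = 1           # 1 = contour fragment, 0 = background
    return orientations, labels

orientations, labels = make_contour_stimulus(grid_size=9, contour_len=5)
print(labels[4])  # middle row: the labelled contour fragments
```

The `labels` array plays the role of the per-fragment target described above: one binary decision per fragment location.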
After training, the model was tested for consistency with behavioral and neurophysiological data, with straight contours of different lengths and different inter-fragment spacings. So here are the results of our first experiment. Mean intersection over union scores of both the model and control are shown in the table. At the behavioral level, the model performed about 7% better than the control. When only straight contours were considered, the model and control were much closer, with the model performing on average 4% better. This can also be seen in figure a, where average intersection over union scores for contours of different lengths are plotted.
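For reference, intersection over union between a predicted and a true contour mask can be computed as below. This is a generic sketch of the metric, not the authors' evaluation code; the name `iou` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np

def iou(pred, target):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return np.logical_and(pred, target).sum() / union

# One row of fragments: three true contour fragments, two of them found,
# plus one background fragment wrongly marked as contour.
target = [[0, 1, 1, 1, 0]]
pred = [[0, 1, 1, 0, 1]]
score = iou(pred, target)  # intersection 2, union 4
```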
A much sharper contrast between the model and control was seen when neurophysiological results were analyzed. Neurophysiological results were quantified using enhancement gains of monitored neurons. This was defined as the ratio of the average response in the contour condition to the average response when only the optimal stimulus was present in the classical receptive field. Gains, as a function of contour length, are shown in figure b.
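The gain definition above can be written down directly. The sketch below assumes responses are given as arrays of non-negative activations for one monitored neuron; the function name `enhancement_gain` and the small `eps` guard against division by zero are additions for this example.

```python
import numpy as np

def enhancement_gain(contour_responses, single_fragment_responses, eps=1e-8):
    """Ratio of mean response with a contour to mean response to one fragment."""
    contour_mean = np.mean(contour_responses)
    baseline_mean = np.mean(single_fragment_responses)
    return contour_mean / (baseline_mean + eps)

# Hypothetical activations for one neuron across two trials per condition.
gain = enhancement_gain([1.8, 2.2], [1.0, 1.0])  # close to 2.0
```

A gain above 1 means the neuron responds more strongly when its fragment is part of a contour than when the fragment is shown alone.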
Similar to measured gains, model gains increased with length. Control gains were more variable. For some lengths, gains increased, while for others they decreased. Enhancement gains as the spacing between fragments increased are shown in figure c. Here again, the model followed the same trend as measured gains and decreased with spacing. In contrast, control gains increased as spacing between fragments increased.
In summary, the behavioral predictions of the model and control are comparable. However, the model and control appear to be employing different strategies to solve the task. And only the model aligned with the neurophysiology.
So next, to investigate how contour integration can be learned from our natural viewing environment, we created a new task on natural images. We randomly selected two smooth contours using the edge labels from an edge detection data set. We then added two easily identifiable markers to the contours in the image. In some images, markers were placed on the same contour, while in others they were on different contours.
Next, to fragment contours, we added random occlusion bubbles. The model was tasked with determining if the two markers were connected by a smooth edge. The data set contained 50,000 training contours and 5,000 validation contours. After training, we tested for consistency with behavioral and neurophysiological data. Different from the training images, in the test images, bubbles were placed at fixed locations along contours to simulate fixed inter-fragment spacing. Bubbles of different sizes and separation distances were used to simulate various inter-fragment distances.
So here are the results of our experiment. Classification accuracies are shown in the table. The model performed approximately 11% better than the control. Accuracies, as the spacing between fragments was changed, are shown in figure a. For both the model and the control, accuracies dropped as spacing increased, consistent with observed behavioral trends. What was really noticeable was the drop in accuracy when switching from training to test data. For the model, the accuracy dropped by 7%, but for the control the drop was about 14%. This shows that the model generalized better.
We next looked at the neurophysiological results. Figure c shows the histogram of gradients of linear fits to output activation versus fragment spacing curves of monitored neurons in the contour integration layer. In figure d, we plotted similar curves for the control model. Compared with the control, model outputs dropped more sharply as the fragment spacing increased, which is more consistent with neurophysiology.
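The slope analysis described here can be sketched as follows, assuming each monitored neuron has one output activation per tested fragment spacing. The function name `spacing_slopes`, the array layout, and the example numbers are hypothetical; only the idea of fitting a line per neuron and histogramming the gradients comes from the talk.

```python
import numpy as np

def spacing_slopes(spacings, activations):
    """Slope of a linear fit of activation versus spacing, one per neuron.

    spacings: shape (n_spacings,)
    activations: shape (n_neurons, n_spacings)
    """
    spacings = np.asarray(spacings, dtype=float)
    activations = np.asarray(activations, dtype=float)
    # np.polyfit fits every column of a 2-D y independently,
    # so neurons go along the columns after the transpose.
    coeffs = np.polyfit(spacings, activations.T, deg=1)
    return coeffs[0]  # first row holds the slopes

spacings = [1.0, 2.0, 3.0, 4.0]
activations = [
    [4.0, 3.0, 2.0, 1.0],  # response falls with spacing
    [2.0, 2.0, 2.0, 2.0],  # response unaffected by spacing
]
slopes = spacing_slopes(spacings, activations)  # approximately [-1.0, 0.0]
counts, bin_edges = np.histogram(slopes, bins=5)
```

A histogram concentrated at negative slopes corresponds to responses that fall off as fragments move apart.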
So in conclusion, brain-like contour integration can be learned in artificial neural networks. However, these models need to be architecturally constrained. Our contour integration model matched empirically measured results at both the neurophysiological and behavioral level. This was not the case for a parameter-matched feed-forward control. Our results also show that the model had slightly higher performance than the control and that it was better at generalizing to data outside what it was trained on.
Thank you for listening. And for more details, please refer to our paper.