People can recognize objects despite changes in their visual appearance that stem from changes in viewpoint. Looking at a television set, we can follow the action displayed on it even if we don't look straight at it, sit closer than usual, or lie sideways on a couch. Object identity is thus invariant to simple transformations of its visual appearance in the 2-D plane, such as translation, scaling and rotation. There is experimental evidence for such invariant representations in the brain, and many competing theories of varying biological plausibility try to explain how those representations arise. A recent paper detailing a biologically plausible algorithmic model of this phenomenon is the result of a collaboration between Brandeis Neuroscience graduate student Pavel Sountsov, postdoctoral fellow David Santucci and Professor of Biology John Lisman.
Many theories of invariant recognition rely on computing the spatial frequency content of visual stimuli using the Fourier transform. This is problematic from a biological realism standpoint, however, because the Fourier transform requires global analysis of the entire visual field. The novelty of the model proposed in the paper is the use of a local filter to compute spatial frequency. This filter is a detector of pairs of parallel edges. It can be implemented in the brain by multiplicatively combining the activities of pairs of edge detectors that respond to edges of similar orientation but at different locations in the visual field. By varying the separation between the receptive fields of the two detectors (and thus the spacing of the edge pair they respond to), different spatial frequencies can be detected. The model shows how this type of detector can be used to build up invariant representations of visual stimuli. It also makes predictions about how the activity of neurons in higher visual areas should depend on the spatial frequency content of visual stimuli.
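To make the mechanism concrete, here is a minimal sketch in Python (NumPy) of such a pair detector. It is not the authors' code: the Gabor-like filter shape, the parameter values, and the bar-stimulus demo are illustrative assumptions. Two odd-symmetric edge detectors with the same orientation, but with receptive fields offset perpendicular to the edge, are applied to an image patch; their rectified responses are multiplied, so the pair responds strongly only when both edges are present at roughly the detector's preferred separation, i.e. at a particular local spatial period.

```python
# Illustrative sketch only: filter shapes, sizes, and parameter values
# are assumptions chosen for the demo, not values from the paper.
import numpy as np


def edge_filter(size, theta, center, sigma=3.0):
    """Odd-symmetric, Gabor-like edge detector.

    theta  : direction (radians) along which luminance changes;
             the preferred edge runs perpendicular to theta.
    center : (x, y) location of the receptive field in the patch.
    """
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    u = (xs - center[0]) * np.cos(theta) + (ys - center[1]) * np.sin(theta)
    v = -(xs - center[0]) * np.sin(theta) + (ys - center[1]) * np.cos(theta)
    envelope = np.exp(-(u ** 2 + v ** 2) / (2.0 * sigma ** 2))
    # Odd symmetry: responds to edges, gives ~zero output on uniform regions.
    return envelope * np.sin(np.pi * u / sigma)


def pair_response(patch, theta, separation):
    """Multiplicatively combine two parallel edge detectors.

    The two receptive fields share an orientation but are displaced by
    `separation` pixels perpendicular to the edge, so the product is large
    only when the patch contains a pair of parallel edges at roughly that
    spacing -- a purely local spatial-period measurement.
    """
    size = patch.shape[0]
    c = size / 2.0
    dx = 0.5 * separation * np.cos(theta)
    dy = 0.5 * separation * np.sin(theta)
    f1 = edge_filter(size, theta, (c - dx, c - dy))
    f2 = edge_filter(size, theta, (c + dx, c + dy))
    r1 = abs(np.sum(f1 * patch))  # rectified response of each edge cell
    r2 = abs(np.sum(f2 * patch))
    return r1 * r2                # multiplicative combination of the pair


if __name__ == "__main__":
    # Probe stimulus: a vertical bar ~16 px wide, i.e. two parallel
    # vertical edges ~16 px apart. The pair detector whose separation
    # matches the edge spacing responds far more strongly than the rest.
    size, bar_width = 64, 16
    xs = np.arange(size, dtype=float)
    patch = ((np.abs(xs - size / 2.0) < bar_width / 2.0).astype(float)
             * np.ones((size, 1)))
    for sep in (4, 8, 12, 16, 20, 24):
        r = pair_response(patch, theta=0.0, separation=sep)
        print(f"separation {sep:2d} px -> pair response {r:10.1f}")
```

In this sketch, a bank of such pair detectors spanning different separations and orientations would read out local spatial frequency without any global Fourier analysis, which is the key point of the local filter described above.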
Sountsov P, Santucci DM, Lisman JE. A biologically plausible transform for visual recognition that is invariant to translation, scale, and rotation. Frontiers in Computational Neuroscience. 2011;5:53.