How Visual Cortex inspired the Convolutional Neural Networks

Introduction

Since the 1960s, after Frank Rosenblatt's successful development of the first perceptron (successful at least by the standards of the technology of that time), researchers around the world have been trying to make computers more brain-like. In other words, they have been trying to develop programs and hardware that work much like our brains. Creating a machine that thinks and learns like a human brain is still impossible given the limitations of present technology, but we have made respectable progress in that direction. Rosenblatt's perceptron can be considered one of the first implementations of a neural network. It was a large hardware machine, rather than a Python program like today's perceptrons.

Rosenblatt's perceptron was a simple binary input-output system with various limitations. It could not be trained on arbitrary patterns; it handled only simple image recognition tasks. Its major drawback was the inability to perform backpropagation (a crucial part of today's neural networks). This was largely because computers of that era could not perform floating-point multiplication efficiently, and hence could not carry out the differentiation and gradient computations that backpropagation requires. From the 1980s onward, with far more capable computers, backpropagation emerged as a practical training method.

Researchers again became optimistic about creating brain-like computers. Neural networks could now not only be trained, but their errors could be measured and their weights updated automatically. This implementation of statistical and mathematical ideas as programmable modules to create brain-like programs forms the foundation of Convolutional Neural Networks (CNNs). CNNs are widely used for image classification, object detection, facial recognition, and natural language processing. Such applications have made possible today's self-driving cars, language translation, biometric locks, and numerous other technologies.

What inspired CNNs?

The part of the brain responsible for processing visual information, called the visual cortex, was the inspiration behind CNNs. The visual cortex is the region of the brain that builds our visual perception from the signals provided by the retina. The working of the human brain is indeed a work of art created by nature, and it shows how the process of evolution led to such a complex biological system.

Eyes are the doorways through which we see the world. Light rays enter the eye and reach the retina, and the retinal nerve fibers then pass the signal on to the brain. The human eye has a marvelous architecture: each retina contains on the order of 100 million photoreceptors. It is like a camera with 100 million pixels! The problem is that we cannot have 100 million fibers coming out of each eye; otherwise, the optic nerve would be as thick as our neck. Consider this a flaw of evolution or a limitation in the functionality of our neural system.

To overcome this limitation (or flaw), the neurons in front of the retina perform a kind of compression (not literally image compression, of course), which squeezes the signal down to roughly one million fibers. Hence, about one million fibers come out of each eye. The signal then travels to a small structure in the brain called the lateral geniculate nucleus, which performs a kind of normalization and sends the processed signal on to the area of the visual cortex called V1. From there, a ventral hierarchy of areas called V2, V3, and V4 (running from the back of the brain toward the side) processes the signal in turn. Finally, the brain performs object categorization, which enables us to see, interpret, and identify different objects.

If you look closely, this is quite similar to the working of a CNN. A CNN also takes in a large grid of input pixels, compresses it through convolution and pooling layers, normalizes the result, and finally generates an output through a series of hidden layers. The working of the visual cortex clearly inspired the development of the CNN. So how exactly are these ideas from the visual cortex implemented in a CNN?
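
To make the analogy concrete, here is a minimal PyTorch sketch of such a pipeline. The channel counts, layer sizes, and the ten output classes are arbitrary choices made for illustration, and the mapping of layers to brain areas in the comments is only a loose analogy, not an anatomical claim.

import torch
import torch.nn as nn

# A minimal CNN sketch: convolutions extract local features, pooling compresses
# them, normalization rescales the activations, and a final fully connected
# layer performs the categorization step.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # local feature extraction (loosely, V1)
    nn.ReLU(),
    nn.MaxPool2d(2),                              # spatial compression
    nn.BatchNorm2d(16),                           # normalization of the signal
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # higher-level features (loosely, V2-V4)
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 56 * 56, 10),                  # categorization into 10 illustrative classes
)

x = torch.randn(1, 3, 224, 224)   # a dummy 224x224 RGB image
print(model(x).shape)             # torch.Size([1, 10])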

Implementing the Visual Cortex Model in CNNs

1. Hierarchical processing

In the visual cortex, information is processed in a hierarchical manner, where lower-level neurons respond to simple features such as edges, corners, and gradients, while higher-level neurons respond to more complex features and object representations. This hierarchical organization allows for the construction of feature hierarchies in deep neural networks (including CNNs), where each layer learns to represent increasingly abstract features of the input data.
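
One way to see this hierarchy numerically is to track how far into the original image each unit can "see" as layers are stacked. The sketch below applies the standard receptive-field recurrence to an assumed stack of 3x3 convolutions and 2x2 pooling layers; the layer list is illustrative, not a specific architecture.

# Receptive-field growth through a stack of convolution/pooling layers.
# Each entry is (kernel_size, stride); the values are illustrative.
layers = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]

receptive_field, jump = 1, 1   # start: one pixel, unit spacing between outputs
for i, (k, s) in enumerate(layers, start=1):
    receptive_field += (k - 1) * jump   # each layer widens the window
    jump *= s                           # striding spreads outputs further apart
    print(f"after layer {i}: receptive field = {receptive_field} pixels")

Deeper units therefore integrate over larger patches of the image, which is why they can respond to whole object parts rather than single edges.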

2. Weight sharing

The visual cortex exhibits weight sharing, where the same features learned by neurons at a certain location are applied across the entire visual field. This idea is embodied in CNNs as well, where the same set of learned filters (weights) is applied to different regions of the input image, enabling the network to learn and generalize better.
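
A small experiment makes this concrete: because one fixed kernel is applied at every position, shifting the input simply shifts the response. The 3x3 edge kernel and toy image below are made up for illustration; the behaviour depends only on the shared weights.

import torch
import torch.nn.functional as F

# One 3x3 vertical-edge kernel, reused at every position of the image.
kernel = torch.tensor([[[[-1., 0., 1.],
                         [-1., 0., 1.],
                         [-1., 0., 1.]]]])

image = torch.zeros(1, 1, 8, 8)
image[0, 0, :, 3] = 1.0                      # a vertical line at column 3

response = F.conv2d(image, kernel, padding=1)
shifted = F.conv2d(torch.roll(image, 2, dims=3), kernel, padding=1)

# Shifting the input shifts the response by the same amount: the shared
# filter detects the feature wherever it appears in the visual field.
print(torch.allclose(torch.roll(response, 2, dims=3), shifted))   # True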

3. Local receptive fields

Neurons in the visual cortex have localized receptive fields, meaning they are sensitive to specific regions of the visual field. This concept has been incorporated into CNNs, where convolutional layers utilize small local filters to scan the input image, allowing the network to capture local patterns and spatial relationships effectively.
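
The sketch below spells this out in NumPy: every output value is a weighted sum over just a 3x3 patch of the input, which is exactly a local receptive field. The image and filter are arbitrary examples.

import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over the image; each output depends only on one patch.
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]    # the local receptive field
            out[i, j] = np.sum(patch * kernel)   # weighted sum over that patch only
    return out

image = np.random.rand(6, 6)
kernel = np.ones((3, 3)) / 9.0                   # a simple 3x3 averaging filter
print(conv2d_valid(image, kernel).shape)         # (4, 4)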

4. Sparse connectivity

In the visual cortex, not all neurons are connected to every other neuron. Similarly, in deep neural networks, particularly CNNs, convolutional layers have sparse connectivity due to the use of local filters, which helps reduce the computational complexity and makes learning more efficient.
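
A rough back-of-the-envelope count shows the effect. For an illustrative 224x224 RGB input, a fully connected unit is wired to every input value, while a convolutional unit with a 3x3 kernel is wired only to its local patch:

# Connections feeding a single output unit (illustrative numbers).
height, width, channels, kernel_size = 224, 224, 3, 3

dense_connections = height * width * channels            # 150,528 inputs per unit
conv_connections = kernel_size * kernel_size * channels  # 27 inputs per unit

print(dense_connections, conv_connections)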

5. Feedforward processing

Visual information in the brain follows a mostly feedforward pathway, where signals flow from the retina to higher visual areas for processing. CNNs are also designed as feedforward architectures, which simplifies the training process and enables efficient computation on parallel hardware.

6. Pooling mechanisms

Inspired by the pooling of visual information in the brain, pooling layers in CNNs aggregate and summarize information, reducing spatial dimensions while retaining relevant features.
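
As a small illustration, 2x2 max pooling keeps only the strongest response in each block, halving both spatial dimensions. The feature map below is a made-up example.

import numpy as np

feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 1, 1],
                        [0, 2, 9, 5],
                        [1, 1, 3, 7]], dtype=float)

# Group the 4x4 map into 2x2 blocks and take the maximum of each block.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6. 2.]
#  [2. 9.]]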

7. Unsupervised learning

The visual cortex learns efficiently in an unsupervised or self-supervised manner by analyzing the correlations in natural visual inputs. This unsupervised learning paradigm has influenced the development of unsupervised pretraining techniques for CNNs.
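
One common form of unsupervised pretraining is an autoencoder, which learns filters simply by reconstructing unlabeled images. The sketch below is a deliberately tiny, illustrative example in PyTorch; the layer sizes, learning rate, and random data are placeholders rather than a recipe from any specific paper.

import torch
import torch.nn as nn

# A tiny convolutional autoencoder: no class labels are involved; the network
# learns filters by trying to reproduce its own input.
encoder = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
    nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(8, 1, kernel_size=3, stride=2, padding=1, output_padding=1),  # 14x14 -> 28x28
    nn.Sigmoid(),
)

images = torch.rand(16, 1, 28, 28)   # a batch of unlabeled (random) images
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3
)

for _ in range(5):                   # a few illustrative training steps
    reconstruction = decoder(encoder(images))
    loss = nn.functional.mse_loss(reconstruction, images)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The pretrained encoder filters can then initialize a supervised CNN.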
