We humans rely on our vision every minute of every day; it is arguably our most important sense. So researchers began to wonder: could we give computers the ability to think and see? If so, the possibilities would be endless. To answer this question, we first need to understand how humans see. How are the signals from our eyes processed in our brains, and can that processing be modeled so that computers can genuinely see? These questions, and their answers, are at the heart of our platform. We explore them in this blog, derived from our podcast: The Attention Podcast.
If such a model of human vision were faultless, 100% accurate visual prediction would become possible, and the test subjects who currently sit through eye-tracking sessions to evaluate marketing material would become redundant. Eye-tracking glasses monitor a person's eyes through small cameras and calculate where that person is looking within their field of view. The marketing campaign below was analyzed using traditional eye tracking with a participant.
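As a rough illustration of that calculation, here is a minimal sketch of one common calibration approach: fitting a polynomial mapping from pupil positions detected by the eye camera to gaze points in the scene camera's image. The function names and the choice of polynomial terms below are illustrative assumptions, not any particular vendor's method.

```python
import numpy as np

def _poly_features(pupil_xy):
    """Second-order polynomial terms of the pupil position: 1, x, y, xy, x^2, y^2."""
    x, y = pupil_xy[:, 0], pupil_xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_gaze_mapping(pupil_xy, scene_xy):
    """Fit a mapping from pupil positions (eye camera) to gaze points
    (scene camera) using calibration samples where both are known.
    pupil_xy and scene_xy are (N, 2) arrays."""
    A = _poly_features(pupil_xy)
    # Least-squares fit, solved independently for the scene x and y coordinates
    coeffs, *_ = np.linalg.lstsq(A, scene_xy, rcond=None)
    return coeffs

def estimate_gaze(pupil_xy, coeffs):
    """Predict gaze points in the scene image for new pupil positions."""
    return _poly_features(pupil_xy) @ coeffs
```

In practice the glasses collect the calibration samples by asking the wearer to fixate on a few known targets; after that, every detected pupil position can be mapped to a point in the scene video.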
However, knowing what a person's eyes are focused on is not enough, because looking at something does not automatically lead to attention, and thus to perception, of that thing. The brain copes with its limited processing capacity by filtering: you can technically look at something while not paying attention to it. Think of all the times you watched a video and had to rewind it because your thoughts were elsewhere and you were not paying attention.
This entire process, as complicated as it is, has become relatively fast to compute. Now that computers have enough computing power, a revolution has started. The processing can be done on a GPU (graphics processing unit), also known as a graphics card, found in many computers. The GPU's original use case was video gaming, but the mathematics needed for convolutional neural networks is the same as that behind video games: both are built on linear algebra.
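To make that linear-algebra connection concrete, here is a small NumPy sketch showing how a convolution, the core operation of a convolutional neural network, can be rewritten as a single matrix multiplication (the so-called im2col trick). Dense matrix arithmetic of exactly this kind is what GPUs were built to accelerate; the code below is an illustrative sketch, not production code.

```python
import numpy as np

def conv2d_as_matmul(image, kernel):
    """Valid 2D cross-correlation (what deep learning frameworks call
    "convolution") expressed as one matrix multiplication via im2col."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    # im2col: each row of `patches` holds one flattened kh*kw image patch
    patches = np.array([
        image[i:i + kh, j:j + kw].ravel()
        for i in range(oh) for j in range(ow)
    ])
    # One matrix-vector product applies the kernel to every patch at once
    return (patches @ kernel.ravel()).reshape(oh, ow)

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])
print(conv2d_as_matmul(image, kernel))
```

The loop that builds the patch matrix is the only sequential part; the actual arithmetic collapses into one large matrix product, which is why frameworks hand it off to the GPU.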