Computer Model Competes with Brain to Recognize Sounds

February 1, 2022

By comparing differences in the sounds that reach the right and left ears, the brain estimates where a sound is coming from. MIT neuroscientists have now developed a computer model that can also perform that task. Consisting of several convolutional neural networks, the model localizes sounds as well as humans do and struggles in the same ways that humans do.

When we hear a sound, the sound waves reach our right and left ears at slightly different times and intensities, depending on the direction the sound is coming from. Parts of the midbrain are specialized to compare these slight differences and help estimate where the sound originated. In real-world conditions, however, the environment produces echoes and many sounds are heard at once, which makes the task considerably harder.
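To make the two cues concrete, here is a minimal Python sketch that estimates the arrival-time difference and the intensity difference from a pair of ear signals. It is only an illustration, not the model from the study: the function name `estimate_itd_ild`, the use of cross-correlation, and the synthetic test signal are all assumptions, and the example deliberately ignores the echoes and competing sounds mentioned above.

```python
import numpy as np

def estimate_itd_ild(left, right, sample_rate):
    """Return (time difference in seconds, level difference in dB).

    A positive time difference means the sound reached the left ear first;
    a positive level difference means the left ear received the louder signal.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # Cross-correlate the two ears; the lag of the peak is the delay of the
    # right-ear signal relative to the left-ear signal, in samples.
    xcorr = np.correlate(right, left, mode="full")
    lag = int(np.argmax(xcorr)) - (len(left) - 1)
    time_difference = lag / sample_rate

    # Compare root-mean-square levels of the two ears, expressed in decibels.
    rms_left = np.sqrt(np.mean(left ** 2))
    rms_right = np.sqrt(np.mean(right ** 2))
    level_difference = 20.0 * np.log10(rms_left / rms_right)
    return time_difference, level_difference

# Synthetic example (hypothetical values): a noise burst that reaches the
# right ear 24 samples later and at half the amplitude.
sample_rate = 48000
rng = np.random.default_rng(0)
source = rng.standard_normal(4800)
delay = 24
left = source
right = 0.5 * np.concatenate([np.zeros(delay), source[:-delay]])

itd, ild = estimate_itd_ild(left, right, sample_rate)
print(f"time difference: {itd * 1000:.2f} ms, level difference: {ild:.1f} dB")
```

With a single clean source like this, the peak of the cross-correlation recovers the delay directly. Echoes and overlapping sounds add competing peaks, which is one way to see why the real-world problem is harder.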

To build their model, the MIT team turned to convolutional neural networks. This kind of computer modeling has been used extensively to model the human visual system, and more recently, MIT neuroscientist Josh McDermott and other scientists have begun applying it to audition as well. The researchers used a supercomputer to train and test about 1,500 different models. That search identified 10 that seemed best suited for localization, which the researchers further trained and used for subsequent studies.
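The search-then-select procedure can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the researchers' pipeline: the search space, the random sampling, and `placeholder_score` (a stand-in for training a network and measuring its localization accuracy) are all hypothetical. Only the counts, about 1,500 candidates and 10 finalists, come from the article.

```python
import random

# Hypothetical search space; the article does not list the actual options.
SEARCH_SPACE = {
    "num_conv_layers": [3, 4, 5, 6, 7],
    "kernel_size": [3, 5, 7, 9],
    "channels": [32, 64, 128],
}

def sample_architecture(rng):
    """Pick one option per setting to define a candidate network."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}

def placeholder_score(arch, rng):
    """Stand-in for 'train the network and measure localization accuracy'.

    It returns a random number so the sketch runs without any training.
    """
    return rng.random()

def search(num_candidates=1500, keep=10, seed=0):
    """Score many candidate architectures and keep the best few."""
    rng = random.Random(seed)
    scored = []
    for _ in range(num_candidates):
        arch = sample_architecture(rng)
        scored.append((placeholder_score(arch, rng), arch))
    # Highest score first, then keep only the top candidates.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:keep]

finalists = search()
print(len(finalists), "architectures kept for further training")
```

In a real search, the scoring step is where nearly all the computing time goes, which is why a job of this size calls for a supercomputer.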

In addition to differences in arrival time, the human brain bases its location judgments on differences in the intensity of the sound that reaches each ear. Previous studies have shown that the success of each of these two strategies varies depending on the frequency of the incoming sound. In the new study, the MIT team found that the models showed this same pattern of sensitivity to frequency.