Story 1 - 27/9/2012
Thinking at the speed of light may soon acquire new meaning — inspired by how the brain processes information, researchers present an optical system capable of recognizing spoken words.
The light and the brain. The combination of the speed of light and the interconnectivity of the brain promises to deliver novel and more powerful computational devices that might soon be employed to complement the capabilities of today’s computers.
What do a laser and the human brain have in common? “Not a lot,” we may be tempted to say when first faced with the question, but how wrong we would be! In fact, both the human brain and a laser can be used to manipulate information in a highly sophisticated way. A team of researchers led by Laurent Larger at the University of Franche-Comté in France has been able to use a laser to recognize spoken digits from “0” to “9.” This feat was achieved by using a clever optical system to pre-process the audio signal, and then a computer to perform the final classification. This is one of the first experimental demonstrations of a new computational paradigm that promises a powerful pattern recognition tool.
In the old days, computer programs were stored on punch cards, with instructions represented by holes that the machine read. In subsequent decades, the medium containing the program evolved from large magnetic tapes to cassettes, diskettes, compact discs and, eventually, today’s flash drives. However, the underlying programming paradigm has essentially remained the same: a program has typically been a deterministic, rigid set of instructions.
In recent years, a new paradigm has increased in popularity, inspired by the human brain — that of programs that can learn how to best solve a given task. The human brain is home to a fascinatingly complex array of interconnected neurons forever exchanging signals, implementing instances of feedback and optimizing connections. Chaotic? Maybe so. Nevertheless, it is nature’s elegant solution to everyday problems like face recognition, gesture interpretation, or understanding language. And it is also what has influenced this new programming approach.
The experiment performed by Larger and coworkers implements an idea that originated from collaborations between two research communities: one studying the brain and the other studying neural networks. The aim is to mimic more closely the way information is processed in the brain, and this is done by exploiting the complex response of an excitable system to different inputs.
Consider for a moment what is involved in making a computer distinguish between a spoken “0” and a spoken “9.” The sound waves produced for “0” are clearly different from those produced for “9.” In the traditional approach, computer scientists have tried to get computers to distinguish between the two by comparing the sound perceived with sounds stored in the machine’s database. However, a spoken “0” will probably never be exactly the same as the “0” in the database: pitch, pace, accent and background noise all influence the recording. Therefore, it is important for a computer to learn what the different numbers sound like by being exposed to many examples. Rigid programming as in the punch-card days will not do.
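The difference between looking up an exact match and learning from examples can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the method used by Larger’s team: the two “digits” are made-up sine waves, the noise model is arbitrary, and the names `make_recording`, `learn_template` and `closest_label` are hypothetical.

```python
import math
import random

N_SAMPLES = 16  # length of each toy "recording" (arbitrary choice)

# Hypothetical stand-ins for the clean sounds of two digits: two sine waves.
CLEAN = {
    "0": [math.sin(2 * math.pi * 2 * k / N_SAMPLES) for k in range(N_SAMPLES)],
    "9": [math.sin(2 * math.pi * 3 * k / N_SAMPLES) for k in range(N_SAMPLES)],
}

def make_recording(label, rng, noise=0.2):
    """Return a noisy copy of the clean waveform, mimicking a new utterance."""
    return [s + rng.uniform(-noise, noise) for s in CLEAN[label]]

def learn_template(examples):
    """Average several example recordings into one learned template."""
    return [sum(column) / len(examples) for column in zip(*examples)]

def distance(a, b):
    """Euclidean distance between two recordings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_label(recording, templates):
    """Pick the label whose learned template is nearest to the recording."""
    return min(templates, key=lambda label: distance(recording, templates[label]))

rng = random.Random(0)
# "Learning": build one template per digit from five noisy examples.
templates = {
    label: learn_template([make_recording(label, rng) for _ in range(5)])
    for label in CLEAN
}

new_zero = make_recording("0", rng)
# Rigid lookup: succeeds only if every sample is identical to a stored sound.
exact_match_found = new_zero in CLEAN.values()
learned_guess = closest_label(new_zero, templates)
print(exact_match_found, learned_guess)
```

In this sketch the rigid lookup fails because the new recording never coincides sample for sample with a stored one, while the templates learned from examples still point to the right digit.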
Today’s photonic brain. The picture shows an improved version of the current device used by Larger’s team to recognize spoken digits from 0 to 9. Due to the bulky electronics, it takes the current version of the device around 20 milliseconds to recognize each digit. However, by using state-of-the-art integrated optoelectronic technologies, already standard in the telecommunications industry, it will be possible to recognize a digit in a fraction of a microsecond. Picture credit: Laurent Larger.
In essence, to best distinguish between the numbers, the researchers studied not the sound waves themselves, but the response of a complex optical system to these sound waves. In this way, their experimental setup enabled the computer to decipher sounds by focusing on the relevant aspects of the sound wave produced. The optical system maps the input signal onto a far more complex signal, which makes it easier for the learning algorithm to distinguish the numbers.
More concretely, their approach works as follows: a very complex system is excited with the sound wave associated with “0.” This galvanizes the entire system and many different connections are activated. Some portions of the system light up and other portions dim down, while signals are repeatedly fed back into the system. One way to visualize this is as a network of wires with pulsating lights. The excitation might appear to be almost random, but in fact its characteristics are determined by the features of the input signal. Therefore, each kind of input produces a characteristic excitation, and by reading this excitation it is possible to identify the input.
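The idea that each input leaves a characteristic excitation can be tried out with a small simulated network. The sketch below is a toy model under stated assumptions: the network is a handful of randomly connected nodes rather than an optical system, the “digits” are made-up tones, and a nearest-match lookup stands in for the learning algorithm. All names (`make_network`, `excite`, `tone`, `identify`) are hypothetical.

```python
import math
import random

def make_network(n_nodes=20, seed=0):
    """Create fixed random weights for a toy excitable network (hypothetical)."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_nodes)]
    # The feedback weights of each node add up to at most 0.5 in magnitude,
    # so old inputs fade away instead of echoing forever.
    w_fb = [[rng.uniform(-0.5, 0.5) / n_nodes for _ in range(n_nodes)]
            for _ in range(n_nodes)]
    return w_in, w_fb

def excite(signal, network):
    """Drive the network with a signal and return its final excitation pattern."""
    w_in, w_fb = network
    state = [0.0] * len(w_in)
    for u in signal:
        # Each node combines the new input with feedback from all nodes.
        state = [
            math.tanh(w_in[i] * u + sum(w * s for w, s in zip(w_fb[i], state)))
            for i in range(len(w_in))
        ]
    return state

def tone(cycles, n_samples=50):
    """A made-up input waveform standing in for a spoken digit."""
    return [math.sin(2 * math.pi * cycles * k / n_samples) for k in range(n_samples)]

def identify(pattern, known_patterns):
    """Return the label of the stored pattern closest to the observed one."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(pattern, known_patterns[label]))
    return min(known_patterns, key=squared_distance)

network = make_network()
known_patterns = {"0": excite(tone(3), network), "9": excite(tone(5), network)}

# A slightly perturbed "0" produces nearly the same excitation pattern.
rng = random.Random(1)
noisy_zero = [u + rng.uniform(-0.001, 0.001) for u in tone(3)]
print(identify(excite(noisy_zero, network), known_patterns))
```

The weights are never trained here: the network only transforms the input, and all the decision-making happens when its excitation pattern is read out.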
Larger and coworkers have employed an optical system with feedback as their excitable system. Their computational device employs a laser whose color can change depending on a control parameter. The light from the laser passes through a birefringent prism and is split into two beams, which produce an interference pattern. The resulting interference pattern depends on the color of the laser, and it is used to determine the value of a very complex function with a series of delayed feedbacks that, in turn, determines the control parameter of the laser. This type of system responded in completely different ways when excited with the waves corresponding to the digits from “0” to “9.” By recording the response of the system, therefore, it was possible to classify the digits correctly.
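A loop of this kind can be imitated with a discrete-time toy model. The sketch below rests on several assumptions that are not taken from the article: the two-beam interference is modeled as a sin² function of the control value, a single delay replaces the series of delayed feedbacks, and the parameters `gain`, `offset`, `delay` and `drive` are invented for illustration rather than taken from the experiment.

```python
import math

def delayed_feedback_response(signal, gain=0.8, offset=0.4, delay=5, drive=1.0):
    """Simulate a toy wavelength-feedback loop driven by an input signal.

    x[n] plays the role of the laser's control parameter (its "color").
    The control value from `delay` steps earlier, shifted by the input, is
    converted by the interferometer into an intensity (sin^2), which becomes
    the new control value. All parameter values are illustrative assumptions.
    """
    history = [0.0] * delay  # control values still travelling around the loop
    response = []
    for u in signal:
        delayed_x = history[0]                 # x[n - delay]
        phase = delayed_x + drive * u + offset
        x = gain * math.sin(phase) ** 2        # two-beam interference intensity
        history = history[1:] + [x]
        response.append(x)
    return response

def tone(cycles, n_samples=60):
    """A made-up input waveform standing in for a spoken digit."""
    return [0.5 * math.sin(2 * math.pi * cycles * k / n_samples)
            for k in range(n_samples)]

response_a = delayed_feedback_response(tone(3))
response_b = delayed_feedback_response(tone(5))
largest_gap = max(abs(a - b) for a, b in zip(response_a, response_b))
print(f"largest difference between the two responses: {largest_gap:.3f}")
```

Even in this simplified form, two different input waveforms leave visibly different traces in the loop, which is the property a classifier can then exploit.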
Today, the system proposed by Larger and coworkers is already capable of achieving results that are comparable to the best computational devices available for voice recognition, even though it is still rather slow: it takes about 20 milliseconds to recognize a digit. This shortcoming is due to the hardware the team is currently using. With better, state-of-the-art optoelectronic components, Larger is confident that they will be able to speed the system up to the point where it can recognize a digit in under a microsecond.
As this research shows, it is sometimes worthwhile to shift pattern recognition from studying the directly recorded signal to studying the response of a very complex system to that signal.
Devices that learn how to solve a problem and recognize a pattern may sound very futuristic, but they are being explored for a wide range of applications like self-steering cars, voice and face recognition, and even online advertising. Thus, we may well soon see the rise of a new generation of intelligent machines capable of integrating the best of both computers and the human brain.
2012 © Optics & Photonics Focus
GV is Assistant Professor at Bilkent University in Ankara (Turkey); his research focuses on optics, statistical physics and soft matter (http://softmatter.bilkent.edu.tr/).
Romain Martinenghi, Sergei Rybalko, Maxime Jacquot, Yanne K. Chembo & Laurent Larger, Photonic Nonlinear Transient Computing with Multiple-Delay Wavelength Dynamics, Physical Review Letters (2012) 108, 244101.