Human feelings show not so much in words as in facial expressions, which reveal far more than most of us realize. Even when we try to hide our thoughts, body language and facial expressions give us away. The oft-quoted rule of thumb that around 90 percent of communication is non-verbal may surprise laypeople, but it has long been common currency among communication professionals. Many of these signals we cannot control at all: they appear involuntarily, regardless of our background or culture.
This is especially true of microexpressions: facial movements that flit by in a fraction of a second and lie beyond conscious control. They are also very difficult to fake, which makes them a fairly reliable emotional signaling system. An untrained eye usually misses them, but a camera captures them without difficulty. This is where so-called affective computing comes in: algorithms analyze facial expressions and classify them, usually into six or seven categories.
According to the Facial Action Coding System (FACS), developed in the 1970s by Paul Ekman and Wallace Friesen, these are anger, fear, contempt, disgust, sadness, surprise and happiness. More advanced systems use more than 20 measurement values. Facial expressions of emotion do not depend on cultural factors, as Ekman's research among the population of Papua New Guinea, far removed from the media and cultural influences of other countries, showed: emotions are expressed the same way all over the world; they are universal and innate.
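The core idea of FACS-style classification can be sketched in a few lines: facial movements are coded as numbered "action units" (AUs), and combinations of AUs map to basic emotions. The AU combinations below are simplified illustrations, not the official FACS tables.

```python
# Toy FACS-style classifier: map sets of facial "action units" (AUs)
# to basic emotions. The AU combinations are simplified stand-ins
# for the real FACS coding tables.

EMOTION_BY_AUS = {
    frozenset({6, 12}): "happiness",       # cheek raiser + lip corner puller
    frozenset({1, 4, 15}): "sadness",      # inner brow raiser + brow lowerer + lip corner depressor
    frozenset({1, 2, 5, 26}): "surprise",  # raised brows + upper lid raiser + jaw drop
    frozenset({4, 5, 7, 23}): "anger",
    frozenset({9, 15}): "disgust",
    frozenset({1, 2, 4, 5, 20, 26}): "fear",
}

def classify(observed_aus):
    """Return the emotion whose AU template best overlaps the observed AUs."""
    best, best_score = "neutral", 0.0
    for aus, emotion in EMOTION_BY_AUS.items():
        # Jaccard similarity between observed AUs and the template
        score = len(aus & observed_aus) / len(aus | observed_aus)
        if score > best_score:
            best, best_score = emotion, score
    return best

print(classify({6, 12}))     # happiness
print(classify({1, 4, 15}))  # sadness
```

Real systems, of course, first have to detect the action units from video frames; this sketch only shows the final mapping step from movements to emotion categories.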
Can AI identify intruders?
The functionality of such programs has now expanded to the point where they can analyze images in real time, which opens up a huge range of applications. Since the beginning of the year, the US Transportation Security Administration (TSA) has been testing biometric facial recognition as part of a pilot program that matches passengers' identities against their documents.
It is easy to imagine emotion recognition being layered on top, for example to spot potential terrorists among passengers. Companies, meanwhile, are already using emotion recognition to improve their business performance.
Disney knows in advance when the audience will laugh
Disney uses facial recognition technology to gauge audiences' emotional responses. To track the facial expressions of moviegoers, the company developed an algorithm called factorized variational autoencoders (FVAE). After analyzing a viewer's face for just ten minutes, it can predict that face's expressions for the rest of the screening.
FVAE encodes images of viewers' faces as a series of numbers based on certain features: one number for a face's smile, another for how wide the eyes are open, and so on. The Disney team applied FVAE to more than 3,000 viewers across multiple films, tracking 68 measurement points per face and collecting 16 million individual face captures. Given enough data, the system can accurately predict a person's reactions after just a few minutes of observation.
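A minimal sketch of the underlying idea, with invented data (the real FVAE is a neural model): each viewer's face is reduced to a vector of numbers per time step, and after observing the first few steps of a new viewer, we predict the rest by borrowing the trajectory of the most similar previously seen viewer.

```python
import math

# Toy sketch of FVAE-style prediction. Each viewer is a time series of
# feature vectors (smile intensity, eye openness) - numbers a face
# encoder would produce. The trajectories below are invented examples.
known_viewers = {
    "laughs_early": [(0.1, 0.5), (0.8, 0.6), (0.9, 0.7), (0.9, 0.7)],
    "stone_faced":  [(0.0, 0.5), (0.1, 0.5), (0.1, 0.5), (0.1, 0.5)],
}

def predict_rest(observed, k):
    """Match the first k steps against known viewers; return the nearest
    viewer's name and their remaining trajectory as the prediction."""
    def dist(a, b):
        return sum(math.dist(x, y) for x, y in zip(a, b))
    nearest = min(known_viewers, key=lambda v: dist(known_viewers[v][:k], observed))
    return nearest, known_viewers[nearest][k:]

name, forecast = predict_rest([(0.1, 0.5), (0.7, 0.6)], k=2)
print(name)      # laughs_early
print(forecast)  # the matched viewer's remaining steps
```

The nearest-neighbor lookup stands in for what the autoencoder learns statistically across thousands of viewers; the point is that a short observation window is enough to anchor a prediction about the rest of the film.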
By the way, the technology is not limited to faces alone. FVAE can, for example, analyze how trees react to wind depending on their type and size.
The voice also conveys emotions
In addition to facial expressions and body position, our voice also betrays our emotional state. Reason enough for researchers around the world to work on the possibilities of automated emotion recognition.
Back in 2016, Matthew Fernandez and Akash Krishnan, students at MIT and Stanford University, developed an algorithm that can recognize dozens of emotions from human speech. The so-called Simple Emotion algorithm tracks the acoustic characteristics of speech sounds, such as voice frequency, volume, and pitch changes, and compares them to a library of sounds and tones. It identifies the emotion by finding the closest match in the catalog.
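The matching step can be sketched as follows. This is a hedged illustration of the approach described above, not the actual Simple Emotion code: a speech sample is summarized as a few acoustic features (mean pitch, loudness, pitch variation), and the label comes from the closest entry in a reference catalog. All catalog values are invented.

```python
import math

# Toy nearest-match emotion recognizer over acoustic features:
# (mean pitch in Hz, loudness 0..1, pitch variation in Hz).
# Catalog values are invented for illustration; a real system would
# normalize features and use a much larger reference library.
CATALOG = {
    "anger":   (220.0, 0.9, 60.0),
    "sadness": (140.0, 0.3, 10.0),
    "joy":     (250.0, 0.7, 80.0),
    "neutral": (170.0, 0.5, 20.0),
}

def recognize(features):
    """Return the catalog emotion nearest to the sample's feature vector."""
    return min(CATALOG, key=lambda e: math.dist(CATALOG[e], features))

print(recognize((230.0, 0.85, 65.0)))  # anger
print(recognize((145.0, 0.3, 12.0)))   # sadness
```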
Speech analysis tools may be of interest to companies looking to improve their customer service. Few things annoy hotline callers more than finally getting through, only to talk to an indifferent call-center employee or a robot. Here an algorithm can help by giving the agent real-time feedback on the caller's emotional state. The caller feels heard and understood; for call-center staff, it means less stress. The same tool can also be used for quality assurance or training.
American psychologist Paul Ekman distinguished six basic emotions. They cannot be learned; they are innate: fear, anger, sadness, joy, disgust and surprise.
But voice and facial expressions are not the only things that betray emotions. The wrist-worn Moxo device instead measures the electrical resistance of the skin. As with a lie detector, changes in skin resistance provide information about the wearer's prevailing emotion. This emotion-measuring instrument is intended primarily for market research.
How AI reads “between the lines”
Texts are somewhat trickier. How do you derive feelings from written words and sentences that even engaged human readers cannot always interpret correctly (remember your school literature lessons)? In 2017, Bjarke Felbo, a Danish researcher at the Massachusetts Institute of Technology, devised a particularly ingenious way to teach AI to read between the lines. His main tool: emoji.
Felbo originally wanted to build a system that would better detect racist posts on Twitter, but he soon realized that many posts cannot be interpreted correctly without an understanding of irony and sarcasm. Because Twitter users cannot rely on facial expressions, body language, or tone of voice, they reach for other means of making their messages land right, namely emoji, explains Iyad Rahwan, Felbo's research advisor at MIT. "The neural network has learned the connection between a certain way of expression and emoji."
Emoji: Attention, sarcasm!
Using an algorithm dubbed DeepMoji, the researchers analyzed 1.2 billion tweets containing a total of 64 different emoji. First, they trained the system to predict which emoji would accompany a particular message, depending on whether it expressed happiness, sadness, laughter, or something else. The system then learned to recognize sarcasm from a labeled data set of examples in the relevant categories.
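The training signal can be illustrated with a deliberately simple sketch: learn which emoji co-occurs with which words, then use the predicted emoji as a proxy label for the message's tone. The word-emoji associations below are hand-written stand-ins for what the real neural network learns from over a billion tweets.

```python
# Toy DeepMoji-style predictor: score each candidate emoji by how many
# of its associated words appear in the text. Associations are invented
# examples, not learned weights.
EMOJI_HINTS = {
    "😂": {"lol", "haha", "hilarious"},
    "😢": {"sad", "crying", "miss"},
    "🙄": {"sure", "great", "whatever", "totally"},  # common in sarcastic posts
}

def predict_emoji(text):
    """Return the emoji whose hint words best match the text, or None."""
    words = {w.strip(",.!?") for w in text.lower().split()}
    scores = {e: len(words & hints) for e, hints in EMOJI_HINTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(predict_emoji("haha that was hilarious lol"))  # 😂
print(predict_emoji("oh sure, whatever you say"))    # 🙄
```

Once a model can attach the "right" emoji to a message, the predicted emoji itself becomes a tone label, which is how the sarcasm-detection stage gets its training data.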
The researchers even gave the AI its own website to showcase the part of the system that assigns emoji. The program automatically attaches one or more matching emoji to an English text and seems to work quite well. It struggles only with Donald Trump's tweets, which clearly confuse DeepMoji, just as they do readers of flesh and blood.
The Meaning and Purpose of Pattern Recognition
Once the excitement around the new technical possibilities subsides, the question of the deeper meaning of emotion recognition remains. After all, machines equipped with such AI do not develop any feelings of their own; they do not even understand them. They merely analyze endless series of numbers, stubbornly and unwaveringly. For the algorithms, the many forms of human expression decompose into images and graphs that are checked for patterns and features. This can give people the illusion of dealing with an empathetic interlocutor.
Such programs may well soon be able to pass a Turing test. But that success would owe not least to the fact that human understanding is also based on pattern recognition and always looks for the familiar in the unfamiliar; the Rorschach test rests on exactly this. So the fear remains that this will lay the groundwork for even more surveillance, or even more sophisticated manipulation. Or the hope that reasonable applications will yet be found.