Emotional Recognition: Can AI Have Your Attention Please?

Andrew W. Pearson · Published in Product AI · Nov 20, 2021 · 6 min read


“Our faces are organs of emotional communication; by some estimates, we transmit more data with our expressions than with what we say,” writes Raffi Khatchadourian in his New Yorker article “We Know How You Feel.” These expressions are now taking center stage in what has become known as emotion-recognition technology. Companies like Emotient, Realeyes, Sension, and Affectiva are competing to create emotionally responsive machines dedicated to decoding the human face, a notoriously difficult subject to interpret. The BBC’s claim that there are no fewer than 19 types of smile a person can make, only six of which signal happiness, should give pause to anyone who assumes reading people’s faces is straightforward.

In his article, Khatchadourian profiles Affectiva, a startup that applies machine learning, deep learning, and data science to emotional intelligence. Affectiva’s AI identifies deep patterns in vocal pitch, rhythm, and intensity; measures sentiment by assessing word arrangement; and reads a subject’s facial expressions and body gestures, says Khatchadourian.

In his book Architects of Intelligence, Martin Ford lays out the thesis of Affectiva’s CEO, Rana el Kaliouby: in the future, an interface between humans and machines will be ubiquitous. Whether in our car, on our phone, at home or at the office on our smart devices, or even while patronizing a business, human-machine interfaces will become second nature to us. “Ten years down the line, we won’t remember what it was like when we couldn’t just frown at our device, and our device would say, ‘Oh, you didn’t like that, did you?’” claims el Kaliouby.

Affectiva’s flagship product, Affdex, tracks four emotional “classifiers” — happy, confused, surprised, and disgusted. The software identifies the face’s main regions — mouth, nose, eyes, eyebrows — and ascribes points to each area, explains Khatchadourian. Affdex also tracks the distribution of wrinkles around an eye or the furrow of a brow and “combines that information with the deformable points to build detailed models of the face as it reacts,” contends Khatchadourian.
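To make that concrete, here is a toy sketch of the pipeline Khatchadourian describes: landmark points grouped by facial region are reduced to simple geometric features, which a model then maps to scores for each classifier. Every name, feature, and weight below is invented for illustration; this is not Affdex’s actual code or API.

```python
# Toy sketch of an Affdex-style pipeline: regions -> points -> features
# -> classifier scores. All names and logic here are illustrative
# assumptions, not Affectiva's implementation.
from dataclasses import dataclass


@dataclass
class Face:
    # (x, y) landmark points grouped by facial region,
    # e.g. "mouth", "nose", "eyes", "eyebrows".
    regions: dict[str, list[tuple[float, float]]]


def mouth_corner_lift(face: Face) -> float:
    """A crude 'smile' feature: how far the mouth corners sit above the
    mouth's average height (image y grows downward, so smaller y = higher)."""
    points = face.regions["mouth"]
    mean_y = sum(y for _, y in points) / len(points)
    corner_y = (points[0][1] + points[-1][1]) / 2.0  # first/last = corners
    return mean_y - corner_y  # positive when the corners are lifted


def classify(face: Face) -> dict[str, float]:
    """Map features to the four classifiers the article names. A real
    system would learn these mappings from labeled video, frame by frame."""
    lift = mouth_corner_lift(face)
    return {
        "happy": max(0.0, lift),       # corners up
        "confused": 0.0,               # would use brow-furrow features
        "surprised": 0.0,              # would use eye/brow separation
        "disgusted": max(0.0, -lift),  # corners down, nose wrinkling, etc.
    }


# A smiling mouth: corner points sit higher (smaller y) than the center.
smile = Face(regions={"mouth": [(30, 52), (40, 60), (50, 61), (60, 60), (70, 52)]})
print(classify(smile))  # "happy" dominates for this toy face
```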

The human face is a constantly shifting landscape of incredibly nuanced movements, says Khatchadourian. In some cases, it is almost designed to hide and obfuscate emotions, as any good poker player will tell you. In his poem The Love Song of J. Alfred Prufrock, T.S. Eliot wrote that one needs time “to prepare a face to meet the faces that you meet.” Transparency is far from our face’s best look. A programmer trying to teach a computer to understand the meaning of a look must contend with an almost infinite number of contingencies, says Khatchadourian. “The process requires machine learning, in which computers find patterns in large tranches of data, and then uses those patterns to interpret new data,” he adds.

Your Attention, Please?

So why the single-minded focus on emotion? In a word, attention. Thales Teixeira, one of el Kaliouby’s collaborators, states that individuals possess three major fungible resources: money, time, and attention. The last of these, he claims, is the least explored. Using Super Bowl ads as a rough indicator of the high-priced end of the market, Teixeira “determined that in 2010 the price of an American’s attention was six cents per minute. By 2014, the rate had increased by twenty percent — more than double inflation.” This is because capturing people’s attention is getting harder and harder, which isn’t surprising given that the average person is now inundated with between 4,000 and 10,000 advertisements per day. This, in turn, means marketers must focus on capturing the intensity of that attention, explains Teixeira. “People who are emotional are much more engaged,” he says, adding that emotions are “memory markers” that people remember more. The goal, Teixeira contends, is now shifting from capturing a person’s attention to finding the people who are feeling these emotions.
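Teixeira’s arithmetic is easy to check. A back-of-the-envelope sketch, where the cumulative US inflation figure is an outside approximation I am assuming, not a number from the article:

```python
# Checking Teixeira's numbers: six cents per attention-minute in 2010,
# rising twenty percent by 2014. The cumulative CPI figure below is an
# assumed approximation, not a figure from the article.
price_2010 = 0.06                       # dollars per minute of attention
growth = 0.20                           # Teixeira's reported increase
price_2014 = price_2010 * (1 + growth)  # -> $0.072 per minute

cumulative_cpi = 0.087                  # approx. US inflation, 2010-2014
print(f"2014 attention price: ${price_2014:.3f}/min")
print(f"Growth vs. inflation: {growth / cumulative_cpi:.1f}x")  # ~2.3x
```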

Recognizing the marketing potential in this space, Affectiva filed a patent for “a system that could dynamically price advertising depending on how people responded to it.” However, the company soon discovered it was not alone: Amazon, Google, Sony, Microsoft, AOL, Hitachi, eBay, IBM, Yahoo!, and Motorola were developing similar technology. Sony was even designing games that created emotional maps of players, combining data from sensors and social media to create a “dangerous kind of interactivity,” as Khatchadourian describes it.

Quite disturbingly, Verizon had devised a plan for “a media console packed with sensors, including a thermographic camera (to measure body temperature), an infrared laser (to gauge depth), and a multi-array microphone.” The system would glean all kinds of personal information, including what language was being spoken and whether the speaker had an accent, says Khatchadourian. The console could even track moments of laughter or the raised voices of an argument, which could feed the console’s choice of TV ads, contends Khatchadourian. Capture a fight between husband and wife, and the system might serve up a commercial for marital counseling on the TV or radio.

Although seemingly quite Big Brotheresque, Verizon’s system was typical. Microsoft’s Xbox One uses a technology called Time of Flight that tracks the movements of individual photons, “picking up minute alterations in a viewer’s skin color to measure blood flow, then calculate [sic] changes in heart rate.” As Khatchadourian sees it, this could make digital games much more immersive. Microsoft also envisioned emotion-targeted TV ads whose prices depended on how many viewers were watching the TV in the room. Of course, this opens up a whole host of ethical questions, to say nothing of creating an industry of workaround technology designed to fool Microsoft’s sensors into charging the cheapest ad rates possible.
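The skin-color trick behind that quote is the general idea of remote photoplethysmography (rPPG): tiny frame-to-frame color changes in facial skin track blood flow, and the dominant frequency of that signal is the pulse. A minimal sketch of the idea, assuming a per-frame mean green-channel signal as input; this illustrates the technique, not Microsoft’s actual system.

```python
# Minimal rPPG sketch: estimate heart rate from the mean green-channel
# intensity of a facial region, one sample per video frame. An
# illustration of the general technique, not Microsoft's implementation.
import numpy as np


def estimate_heart_rate(green_means: np.ndarray, fps: float) -> float:
    """Return an estimated heart rate in beats per minute (BPM)."""
    signal = green_means - green_means.mean()          # remove DC offset
    spectrum = np.abs(np.fft.rfft(signal))             # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)  # frequency per bin, Hz
    # Restrict to plausible human pulse rates: 0.7-4.0 Hz (42-240 BPM).
    band = (freqs >= 0.7) & (freqs <= 4.0)
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak * 60.0                                 # Hz -> BPM


# Synthetic demo: a 1.2 Hz (72 BPM) pulse buried in noise, 30 fps, 10 s.
t = np.arange(300) / 30.0
demo = 0.5 * np.sin(2 * np.pi * 1.2 * t) + np.random.normal(0, 1.0, t.size)
print(f"Estimated heart rate: {estimate_heart_rate(demo, fps=30.0):.0f} BPM")
```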

Some of this data collection seems over the top, but consumers aren’t pushing back as brands not just nudge privacy boundaries but step right over them. Wearables like Nike’s FuelBand and Fitbit, as well as at-home exercise products like Peloton and Lululemon’s Mirror, collect a tremendous amount of health data on their users. According to Forbes, Lululemon’s Mirror collects and transmits “detailed whole body video capture, voice and audio capture, settings configurations, personal profiles, and usage data.” Meanwhile, Apple’s Health app tracks weight, respiratory rate, sleep, even blood-oxygen levels. All of this information can be collected and used to build a user’s emotional profile, says Khatchadourian, which can be fed into marketing algorithms that pitch not only the right product at the right price but at the right emotional moment with the most compelling emotional or behavioral message. This is personalization marketing on steroids.

What’s in a Smile?

For Affectiva, interest in its Affdex solution has grown rapidly, and the company recently agreed to be acquired by Smart Eye. The former competitors believe the acquisition will help them break into the emerging “interior sensing” market, which, according to TechCrunch, “can be used to monitor the entire cabin of a vehicle and deliver services in response to the occupant’s emotional state.”

Previously, Affectiva’s use cases had been quite eclectic. It had experimented with video ads for Facebook, licensed its software to Samsung for various use cases, and given digital nurses the ability to read faces, says Khatchadourian. A Belfast entrepreneur wanted to use the technology in his chain of nightclubs, while Dubai saw it as a way to measure social contentment, adds Khatchadourian.

Affectiva’s biggest use case will almost certainly be in marketing. Its ability to track moment-by-moment emotion can help advertisers understand customer engagement. As Khatchadourian explains, it can identify and chart the most emotionally engaging moments of an ad, so marketers can retain the most impactful parts when cutting long TV spots down to shorter online ones. For TV show character analysis, Affdex can help producers map an audience’s engagement with a particular character. This can be particularly important when new actors are introduced into a long-running series or a character’s arc needs to be expanded, contends Khatchadourian.
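A hypothetical sketch of that cutting-room use case: given per-second engagement scores for a spot (say, a classifier output averaged across a test panel), a simple sliding window finds the most engaging contiguous segment to keep. The scores and the function below are illustrative assumptions, not Affectiva’s API.

```python
# Hypothetical helper: find the most emotionally engaging contiguous
# window of an ad from per-second engagement scores, so a long TV spot
# can be trimmed to a shorter online cut. Illustrative only.
from typing import Sequence, Tuple


def best_segment(scores: Sequence[float], window: int) -> Tuple[int, int]:
    """Return (start, end) seconds of the window-second span with the
    highest total engagement, using a sliding-window sum."""
    if window > len(scores):
        raise ValueError("window is longer than the ad itself")
    current = sum(scores[:window])
    best, best_start = current, 0
    for start in range(1, len(scores) - window + 1):
        current += scores[start + window - 1] - scores[start - 1]
        if current > best:
            best, best_start = current, start
    return best_start, best_start + window


# Demo: a 30-second ad whose emotional peak sits around seconds 12-19.
scores = [0.1] * 12 + [0.8, 0.9, 0.95, 0.9, 0.85, 0.8, 0.7, 0.6] + [0.2] * 10
print(best_segment(scores, window=8))  # -> (12, 20): keep that 8-second span
```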

All in all, Affectiva could be revealing the future of customer engagement. Unquestionably, a lot of the data it collects is Big Brotheresque, but that doesn’t mean it’s going to be stopped. Sure, some privacy advocates are pushing back, and Congress will probably get involved with some type of legislation if the practice gets too far out of hand. But all this data collection is so seamless that it will probably be woven into all kinds of technology and advertising for years to come.

According to Zaria Gorvett, “Our grins are not as simple as they seem. There are a myriad different ways to smile — and some of them can conceal some less than happy feelings.” Companies like Affectiva, Emotient, Realeyes, Smart Eye, and Sension are looking into this, and they all have their work cut out for them: it’s not just the Mona Lisa who has an enigmatic smile; we all do, at one time or another. However, progress is being made. In time, we might learn that a smile is not just a smile, but rather the first act of what we hope becomes a beautiful customer relationship.


Andrew Pearson is the MD of Intelligencia, an AI company based in Asia. Speaker, author, columnist, Pearson writes about IT issues like AI, CI, and analytics.