Giving human emotions to computers

Alejandro De León Languré
Nov 11, 2022


Affective computing.

Artificial intelligence (AI) and machine learning (ML) are everywhere these days. Every software company worth its salt has a “revolutionary”, “game-changing” product based on AI, while ML capabilities are marketed as “unlimited”.

Humans have an analytical, intelligent side, but, as we can all relate, emotions also play a massive part in our everyday decisions. If, as stated by Picard (2000), the endgame is to produce machines that feel "more human", shouldn't we be investing some of our resources into "artificial emotions"?

In general, when referring to affective computing as the equivalent of artificial emotions, the main objective is to give computers the ability to recognize and display affective signals. It is not centered on building computers capable of feeling sadness or anger, but instead on creating artificial agents able to adapt to the emotional state of the humans they interact with (Tao & Tan, 2005).

For instance, in automated customer support, affective computing could help a bot understand a customer's sense of urgency by detecting keywords in their language or cues in their tone of voice, and adapt accordingly. No one wants to interact with an automated bot that, instead of showing concern and empathy for the customer's situation, displays a permanent smile, which in human terms reads as detachment or even mockery (Caruelle et al., 2022).
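
As a toy illustration of the keyword side, urgency detection can start as simple scoring against a word list. This is a deliberately naive sketch; the keywords and threshold are invented for illustration, and a real system would use a trained model:

```python
# Toy urgency detector: scores a customer message against a small,
# hand-picked keyword list. Keywords and threshold are illustrative only.
URGENT_KEYWORDS = {"urgent", "immediately", "asap", "now", "emergency"}

def urgency_score(message: str) -> float:
    """Fraction of words in the message that signal urgency."""
    words = message.lower().split()
    hits = sum(1 for w in words if w.strip(".,!?") in URGENT_KEYWORDS)
    return hits / max(len(words), 1)

if urgency_score("I need help NOW, this is urgent!") > 0.2:
    print("Escalating to a human agent with an empathetic tone.")
```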

There is a large body of research, but in a nutshell, it all starts with the selection (or discovery) of an emotion model (EM): a framework used to contextualize a message. Some authors build a spatial representation in which each emotion is assigned coordinates, such as the following figure from Yan et al. (2020).
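
To make the idea concrete, here is a minimal sketch of a two-dimensional valence-arousal space, one common way to lay emotions out spatially. The coordinates below are invented placeholders for illustration, not values taken from Yan et al. (2020):

```python
import math

# Hypothetical (valence, arousal) coordinates for a few basic emotions.
# Real models fit these positions from data; these numbers are made up.
EMOTION_SPACE = {
    "happy": (0.8, 0.5),
    "sad":   (-0.7, -0.4),
    "angry": (-0.6, 0.8),
    "calm":  (0.4, -0.6),
}

def nearest_emotion(valence: float, arousal: float) -> str:
    """Map a point in the valence-arousal plane to the closest labeled emotion."""
    return min(EMOTION_SPACE,
               key=lambda e: math.dist(EMOTION_SPACE[e], (valence, arousal)))

print(nearest_emotion(0.7, 0.4))  # -> "happy"
```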

Given a specific input, such as text, speech, or a human face, numerical values can be calculated through a process called "vectorization". In the following example, a computer with a camera analyzes a human face, looking for key points, each of which contributes its coordinates to a numerical vector.

Transforming a human facial expression into a set of vectors.
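
Here is one possible sketch of that vectorization step. It assumes MediaPipe's FaceMesh detector as the source of key points; any facial landmark detector would do, and the flattening of the points into a single vector is the essential idea:

```python
import numpy as np
import mediapipe as mp  # assumed dependency for landmark detection

def face_to_vector(rgb_image: np.ndarray) -> np.ndarray:
    """Turn a face image into a flat vector of key-point coordinates."""
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as mesh:
        result = mesh.process(rgb_image)
    if not result.multi_face_landmarks:
        raise ValueError("No face detected in the image.")
    landmarks = result.multi_face_landmarks[0].landmark
    # Each landmark is an (x, y, z) point; flatten them all into one vector.
    return np.array([[p.x, p.y, p.z] for p in landmarks]).flatten()
```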

A machine learning algorithm is a computer process that transforms a set of vectors into a mathematical formula. First, the computer needs many pictures of human faces together with their labels. For instance, we provide 1,000 pictures of a sad face, 1,000 with a happy face, 1,000 with an angry face, and so on. As we've already seen, each picture is transformed into a vector. Those thousands of vectors are fed to a machine learning algorithm, and a generalization is produced: a mathematical formula that can receive a new, unknown vector (for example, a new human face) and tell us whether it expresses sadness, happiness, or anger.
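
As a rough sketch of that training step, a standard classifier such as scikit-learn's logistic regression can play the role of the "mathematical formula". The data here are random placeholders standing in for the real labeled face vectors described above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: in practice these would be the face vectors and labels
# from the thousands of labeled pictures described above.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 1404))               # 3,000 face vectors
y = np.repeat(["sad", "happy", "angry"], 1000)  # 1,000 labels of each class

model = LogisticRegression(max_iter=1000).fit(X, y)  # the "generalization"

new_face = rng.normal(size=(1, 1404))  # a new, unknown face vector
print(model.predict(new_face))         # e.g. ["happy"]
```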

At this point, we have a computer that can look at any video feed and "understand" the emotional state of the people in it from their facial expressions.
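
Putting the pieces together, a minimal video loop might look like the following, assuming OpenCV for capture plus the face_to_vector function and trained model sketched above:

```python
import cv2  # assumed dependency for video capture

# Assumes face_to_vector() and the trained `model` from the sketches above.
capture = cv2.VideoCapture(0)  # default webcam
while capture.isOpened():
    ok, frame = capture.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # detector expects RGB
    try:
        vector = face_to_vector(rgb)
        print("Detected emotion:", model.predict(vector.reshape(1, -1))[0])
    except ValueError:
        pass  # no face in this frame
capture.release()
```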

This is obviously a gross oversimplification, but the main idea behind affective computing is to create enough generalizations (that is, mathematical formulas) that computer programs can detect a wide variety of human emotions independently of the medium (text, video, or voice). We must also consider factors such as the situational, historical, cultural, and demographic context. There are many different emotion models and machine learning algorithms, each with its own approach and success rate.

What do you think? Can human emotions, regardless of the medium they are captured from, ultimately be expressed as mathematical formulas that predict them?

— — — — — — — — — — — — — — — — — — —

References

Picard, R. W. (2000). Affective computing. MIT Press.

Tao, J., & Tan, T. (2005, October). Affective computing: A review. In International Conference on Affective Computing and Intelligent Interaction (pp. 981–995). Springer, Berlin, Heidelberg.

Caruelle, D., Shams, P., Gustafsson, A., & Lervik-Olsen, L. (2022). Affective computing in marketing: practical implications and research opportunities afforded by emotionally intelligent machines. Marketing Letters, 33(1), 163–169.

Yan, Y., Zhang, Z., Chen, S., & Wang, H. (2020). Low-resolution facial expression recognition: A filter learning perspective. Signal Processing, 169, 107370.
