In this blog post, we explore two sets of emotion combinations using word2vec: one posited by Robert Plutchik in 1980, and the other a popular media chart featured on vox.com that uses characters from Inside Out. We limit the scope to dyads, i.e. combinations of two basic emotions that make up a more complex emotion.
Just as blue and red give purple, joy and surprise give delight.
Above: To give some context, the movie depicts how a child develops mixed emotions as she enters adolescence. Here, a mix of joy and sadness is depicted.
To explore the additive nature of emotions, word2vec is a great candidate model. As illustrated in previous posts, distributed representation models such as word2vec can solve analogies of varying complexity, such as the following, by suggesting the word that completes each one.
Man : King :: Woman : Queen
Lady Gaga: America :: Ayumi Hamasaki : Japan
Equivalently, the above word-pair analogies can be expressed as equations:
King - Man + Woman = Queen
Lady Gaga - America + Japan = Ayumi Hamasaki
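To make the arithmetic concrete, here is a minimal sketch of how such an analogy can be solved with vectors. The three-dimensional vectors below are invented purely for illustration (real word2vec vectors have hundreds of dimensions and are learned from text), and `solve_analogy` is a hypothetical helper name, not part of any library.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# They are NOT real word2vec values.
toy_vectors = {
    "man":   [1.0, 0.0, 0.1],
    "woman": [0.0, 1.0, 0.1],
    "king":  [1.0, 0.0, 0.9],
    "queen": [0.0, 1.0, 0.9],
    "apple": [0.5, 0.5, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(a, b, c, vectors):
    """Solve a : b :: c : ? by finding the word closest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the query words themselves, as is customary for analogy tasks.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Man : King :: Woman : ?
answer = solve_analogy("man", "king", "woman", toy_vectors)
print(answer)
```

The query words are excluded from the candidates because the nearest vector to the target is often one of the inputs themselves.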
In the following sections, we use the additive properties of word vectors to find the word on the right-hand side of equations such as:
Joy + Surprise = Delight
For each pair of emotions, we extract from word2vec the word most similar to the sum of the two emotion vectors, and compute the cosine similarity between that suggested word and the human suggestion. This similarity measure ranges from -1 (complete opposite) to 1 (identical meaning). Lastly, we check whether the emotion suggested by a human is within word2vec's top 10 suggested words.
The rest of the post is organized into two studies, each testing the agreement between word2vec and a specific set of suggestions. In each study, we first present the results, followed by the discussion.
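A minimal sketch of this three-step procedure is given below. It runs on a handful of made-up two-dimensional vectors rather than the real pre-trained model, and `evaluate_dyad` is a hypothetical helper name. With the actual model loaded through gensim, the roughly corresponding calls would be `KeyedVectors.most_similar(positive=[...], topn=10)` and `KeyedVectors.similarity(w1, w2)`; note that `most_similar` averages unit-normalized vectors rather than taking a raw sum, so results may differ slightly from this sketch.

```python
import math

# Toy 2-dimensional vectors, invented for illustration only.
# They are NOT real word2vec values.
toy_vectors = {
    "joy":      [1.0, 0.2],
    "surprise": [0.2, 1.0],
    "delight":  [0.9, 0.8],
    "sorrow":   [-1.0, 0.1],
    "table":    [0.1, -1.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def evaluate_dyad(first, second, human_word, vectors, topn=10):
    """Compare the model's suggestion for first + second with a human one.

    Returns (model suggestion, cosine similarity between the suggestion
    and the human word, whether the human word is in the top-n list).
    """
    summed = [a + b for a, b in zip(vectors[first], vectors[second])]
    # Rank every other word by similarity to the summed vector.
    ranked = sorted(
        (w for w in vectors if w not in (first, second)),
        key=lambda w: cosine(vectors[w], summed),
        reverse=True,
    )
    suggestion = ranked[0]
    similarity = cosine(vectors[suggestion], vectors[human_word])
    in_top_n = human_word in ranked[:topn]
    return suggestion, similarity, in_top_n

suggestion, similarity, in_top_n = evaluate_dyad(
    "joy", "surprise", "delight", toy_vectors
)
print(suggestion, round(similarity, 3), in_top_n)
```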
Study 1 — Plutchik
At this point, it is important to note that the pre-trained word2vec model was trained on a Google News dataset of about 100 billion words. Because the corpus is news-centered, the resulting model is not expected to make subtle distinctions between emotions.
Encouragingly, word2vec exhibits general agreement with Plutchik’s suggested emotions, with positive similarity scores across all emotion pairs.
We observe that word2vec suggests the same word for several of the emotion pairs. For example, sorrow is suggested in pairs 9, 11, 16, 17 and 18. Whenever sadness is added to another emotion, sorrow is suggested. This highlights the limited distinction between emotions in a model trained on a news dataset.
Pair 13, the sum of surprise and sadness, is an interesting one. Plutchik suggested disapproval, whilst word2vec suggested disappointment, which I personally prefer. However, this is only the opinion of a layperson in psychology.
Pair 7 is another interesting one. word2vec seems to treat fear as negating trust and suggests distrust, whilst Plutchik suggests that submission is the accumulation of fear and trust. Both seem to make sense, which shows that words can be meaningfully combined in more than one way. The combination could be modified by the lexical structure within the sentence.
Pair 21 receives an n/a because the word2vec vocabulary does not contain morbidness.
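One way such an n/a could arise in code is sketched below, again with invented toy vectors and a hypothetical helper name (`similarity_or_none`). The idea is simply to check vocabulary membership before computing a similarity. With a gensim `KeyedVectors` model, the analogous membership test would be `word in model`.

```python
import math

# Toy vectors, invented for illustration only; "morbidness" is
# deliberately absent to mimic an out-of-vocabulary word.
toy_vectors = {
    "anger": [1.0, 0.3],
    "rage":  [0.9, 0.4],
}

def similarity_or_none(word_a, word_b, vectors):
    """Return cosine similarity, or None if either word is out of vocabulary."""
    if word_a not in vectors or word_b not in vectors:
        return None
    u, v = vectors[word_a], vectors[word_b]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

score = similarity_or_none("anger", "morbidness", toy_vectors)
# Report missing words as "n/a" instead of raising a KeyError.
print("n/a" if score is None else round(score, 3))
```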
Study 2 — Inside Out
The findings are very similar to those of Study 1: we observe general agreement between the human suggestions and word2vec, and repetition of suggestions across emotion dyads (e.g. sorrow and frustration).
This concludes a short post illustrating another use of the versatile distributed representations of words. It also highlights the importance of training word vectors on a relevant corpus if they are to be used in a specialized domain.
Originally published at joshuakyh.wordpress.com on January 30, 2018.
Gurupriyan is a software engineer and technology enthusiast who has been working in the field for the last 6 years, currently focusing on mobile app development and IoT.