Unsupervised methods, on the other hand, often rely on building a dictionary of scores for different words. One such dictionary, developed by my collaborators, was built by asking people to rate the happiness of different words on a scale of 1 to 9 and then averaging the results: “rainbows”, for example, scored 8.06, while “useless” scored 2.52.

The overall sentiment of a post can then be scored by averaging the scores of the words it contains. For example, the average score for the post “My momma always said ‘life is like a box of chocolates’” is an above-average 6.02 according to this dictionary, suggesting it expresses a positive feeling.
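The averaging step can be sketched in a few lines of Python. This is a minimal illustration rather than the actual tool: the dictionary below holds only the three word scores quoted in this article, and the function name `score_text`, the tokenising rule and the choice to skip words that have no score are all assumptions made for the example.

```python
import re

# Word scores on the 1-9 happiness scale. Only these three values are quoted
# in the article; a real dictionary would hold thousands of entries.
HAPPINESS_SCORES = {
    "rainbows": 8.06,
    "useless": 2.52,
    "force": 4.0,
}


def score_text(text, scores=HAPPINESS_SCORES):
    """Return the mean score of the scored words in text, or None if there are none."""
    # Lowercase the text and split it into runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Assumption: words missing from the dictionary are skipped, not counted as neutral.
    known = [scores[word] for word in words if word in scores]
    if not known:
        return None
    return sum(known) / len(known)


# Only "rainbows" and "useless" have scores here, so the result is their mean.
print(round(score_text("Rainbows are not useless"), 2))  # 5.29
```

Note that the “not” in the example sentence has no effect on the result: a plain word average ignores negation and word order.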

What is sentiment analysis used for?

Sentiment analysis is increasingly used by marketers to study trends and make product recommendations.

Imagine a new mobile phone is released; a sentiment analysis of social media posts about the phone may give a company valuable, real-time insight into how it’s performing.

There are broader applications of sentiment analysis. Researchers have recently tracked Donald Trump’s Twitter sentiment over the first 100 days of his presidency and built bots to place market trades when he tweets positively or negatively about specific companies.

Scientists can track emotional trends in other texts as well. For example, we used sentiment analysis to study the emotional arcs of more than 1,000 films through their screenplays.

Many films show similar patterns: regular peaks and troughs of tension and release, followed by a particularly big trough 80% of the way through the film (all hope is lost!), before the final resolution and happy ending. Applying a similar analysis to novels, we showed that most stories follow one of six basic story arcs.

We’re still not that good at sentiment analysis

Because sentiment analysis often relies on mining social media posts, it raises major ethical concerns, and that debate is only beginning. Beyond the ethics, the complex nature of language and meaning makes the technique itself prone to error.

Take the phrase “May the force be with you”, which scores 5.35 using our dictionary. For any Star Wars fan, it is of course a hugely positive phrase, but it scores only modestly in our test because the word “force” is rated a below-average 4.0.

That rating is understandable for the word in isolation, but it makes less sense in the context of the phrase.
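As a rough check on how much “force” pulls the phrase down, the two quoted numbers can be worked backwards. The sketch below assumes that all six words of the phrase appear in the dictionary and are weighted equally, which is an assumption made for illustration and not something stated here; the helper name `mean_without` is likewise hypothetical.

```python
PHRASE_MEAN = 5.35   # quoted score for "May the force be with you"
FORCE_SCORE = 4.0    # quoted score for the word "force"
WORD_COUNT = 6       # assumption: all six words are scored and weighted equally


def mean_without(mean, count, removed_score):
    """Return the mean of the remaining values after removing one value from a mean."""
    total = mean * count
    return (total - removed_score) / (count - 1)


others = mean_without(PHRASE_MEAN, WORD_COUNT, FORCE_SCORE)
print(round(others, 2))  # 5.62
```

Under that assumption the other five words average about 5.62, so “force” lowers the phrase score by roughly a quarter of a point.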

Some scepticism about the validity of Facebook’s sentiment analysis capabilities is therefore warranted. It’s entirely conceivable that describing something as “fully sick” on Facebook, a phrase of colloquial endorsement, could lead to an individual’s emotional state being misclassified.