Queer = Bad in Automatic Sentiment Analysis
Is it good or bad? Sentiment analysis is the most basic summary of a text: a way to break its overwhelming ambiguity into two distinct categories. Sentiment analysis is widely studied in NLP and used in a variety of settings, e.g. companies monitoring how people rate their products (bad reviews vs. good reviews), sociologists figuring out how a certain topic is covered in the media (negative vs. positive descriptions), or medical professionals observing how patient reports change over time (do they get better, do they get worse?). But what if queer identity terms make sentiment analysis systems go haywire? Research on this topic suggests that sentiment analysis systems have a hard time telling neutral terms from emotionally laden ones, and queer identity terms are a prime example.
Before machine learning experienced its comet-like ascent in NLP, sentiment analysis systems were lexicon-based. This means that someone manually compiled a list of words that are highly positive or negative and assigned them a numeric sentiment value. Text that contained many highly rated positive words (like good, awesome, fantastic) would get a high sentiment score, and text with lots of negative words (like bad, awful, terrible) would get a low score. Queer identity terms fared badly in these systems, because many of them are reclaimed slurs, like the word queer itself. Only context can disambiguate when “queer” is used as an identity term and when it is used to express displeasure with something. And context is something that lexicon-based approaches, with their focus on single words and static rules, can rarely capture.
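To make that concrete, here is a minimal sketch of a lexicon-based scorer. The tiny lexicon and its ratings are invented for illustration and not taken from any real tool, but the mechanism is the same: add up word ratings, ignore context.

```python
# A minimal sketch of a lexicon-based sentiment scorer, not any specific tool.
# The tiny lexicon below is invented; real lexicons contain thousands of
# hand-rated words.

SENTIMENT_LEXICON = {
    "good": 2.0, "awesome": 3.0, "fantastic": 3.0,
    "bad": -2.0, "awful": -3.0, "terrible": -3.0,
    # "queer" often ends up with a negative rating because it is also a slur,
    # even though it is frequently used as a neutral identity term.
    "queer": -2.0,
}

def lexicon_sentiment(text: str) -> float:
    """Sum the ratings of all lexicon words in the text; context is ignored."""
    words = text.lower().split()
    return sum(SENTIMENT_LEXICON.get(word, 0.0) for word in words)

print(lexicon_sentiment("what a fantastic evening"))  #  3.0
print(lexicon_sentiment("proud to be queer"))         # -2.0, despite being a positive statement
```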
Machine learning models should, at least in theory, be able to fix this problem. After all, they can take context into account. Instead of focusing on single words, a model can be designed to take whole sentences or paragraphs into consideration: rather than rating a single word as positive or negative, the rating can apply to a whole statement. But machine learning methods learn from large data sets that are often compiled without much human oversight. These data sets are simply too big for a single person to read. Because we live in a generally queerphobic world, we can expect much of the text that contains queer identity terms to be negative. When I was in high school, about 15 years ago, “this looks gay” was an insult I heard slung around on a daily basis. Data sets have not moved much past that.
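To see how lopsided data turns into a lopsided model, here is a toy illustration. The labelled examples are invented and the skew is exaggerated, but the mechanism is exactly what a classifier trained on label statistics picks up.

```python
# Toy illustration (invented data): if an identity term mostly appears in
# negatively labelled text, a model that learns from label statistics will
# treat the term itself as a negative signal.

from collections import Counter

corpus = [
    ("this looks so gay", "negative"),
    ("that movie was gay and a waste of time", "negative"),
    ("ugh this homework is gay", "negative"),
    ("I loved this gay romantic comedy", "positive"),
    ("great food and friendly staff", "positive"),
]

label_counts = Counter(label for text, label in corpus if "gay" in text.split())
print(label_counts)  # Counter({'negative': 3, 'positive': 1})
# A naive learner concludes P(negative | "gay") = 0.75 and treats the word as
# a negative cue, regardless of how it is actually used.
```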
Because queer identity terms and negative judgements lie close together in data sets, machine learning algorithms draw a false connection between negative sentiment and queer identity terms. Even when these terms are used neutrally to describe something (“I loved this gay romantic comedy”), a sentiment analysis system will rate the statement lower than if the word “gay” weren’t there. This first of all means that sentiment analysis systems don’t work as well as they should, but it can also have negative consequences for the queer community as a whole. For one, it might make the overall reviews of gay romantic comedies look worse than they are, perhaps resulting in less funding when someone tries to find investors for the next gay romantic comedy of the year.
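One way to check whether a given model has picked up this association is a minimal-pair probe: score the same sentence with and without the identity term. The sketch below uses the Hugging Face transformers sentiment pipeline purely as a stand-in for “some pretrained sentiment model”; the default model and the exact scores will differ from the commercial tools discussed here.

```python
# Minimal-pair probe (a sketch, not any study's exact setup): compare the score
# a pretrained classifier assigns to a sentence with and without an identity term.

from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default English model

pair = [
    "I loved this romantic comedy.",
    "I loved this gay romantic comedy.",
]

for sentence in pair:
    result = classifier(sentence)[0]
    print(f"{sentence!r}: {result['label']} ({result['score']:.3f})")

# If the second sentence gets a noticeably lower positive score (or even flips
# to negative), the model has learned the spurious association described above.
```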
So how can we approach this problem? The first step is recognising and documenting that it exists. To do so, Ungless et al. conducted template-based studies that compare the sentiment scores of several commercial tools (some of them claiming to be “debiased”, meaning that they explicitly addressed the problem described here). The setup is simple: does swapping in words that should be sentiment-neutral change the sentiment score of the whole statement? Ungless et al. confirm that suspicion. In addition, they find that words indicating several marginalised identities at once, like queer terms from African American Vernacular English, get even lower sentiment scores than queer identity terms from standard American English. We can imagine the negative consequences of that: the gay romantic comedy might seem to have even worse reviews when the protagonists are Black.
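My own simplified sketch of that template idea looks like this; the templates and identity terms are illustrative and not the ones used in the paper, and the label names assume the pipeline’s default English model.

```python
# Template probe (a simplification of the study design, not the authors' code):
# fill sentiment-neutral templates with different identity terms and check
# whether the identity term alone shifts the score.

from transformers import pipeline

classifier = pipeline("sentiment-analysis")

templates = [
    "I am a {} person.",
    "My neighbour is {}.",
]
identity_terms = ["straight", "gay", "queer", "bisexual", "transgender"]

for template in templates:
    for term in identity_terms:
        sentence = template.format(term)
        result = classifier(sentence)[0]
        # Report a signed score so shifts are easy to compare across terms.
        signed = result["score"] if result["label"] == "POSITIVE" else -result["score"]
        print(f"{sentence:<35} {signed:+.3f}")

# A debiased system should give roughly the same score no matter which term
# fills the slot, since none of the templates expresses an opinion.
```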
One way to push against this trend is better data curation. Methods have been proposed that search for stereotypic associations and replace them in the corpus, so that false correlations (like queer equals bad) are not learned by the models. But this points to a more fundamental problem in machine learning: when we learn patterns from data alone, how can we untangle the patterns we want learned from the ones we don’t? Can we do so after systems are already trained, or can we place interventions in the training process? And how can we integrate human oversight into the model-building process without making it overwhelmingly time- and cost-intensive?
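One curation strategy in this spirit is often called counterfactual data augmentation: for every training sentence containing an identity term, add a copy with the term swapped, so the label no longer correlates with the term. The sketch below is a simplified illustration with a hypothetical, far-from-complete swap list, not the specific method of any particular paper.

```python
# Counterfactual data augmentation, sketched: duplicate examples with identity
# terms swapped so that labels stop correlating with the terms themselves.

SWAP_PAIRS = {"gay": "straight", "straight": "gay",
              "queer": "heterosexual", "heterosexual": "queer"}

def augment(dataset):
    """dataset: list of (text, label). Returns originals plus swapped copies."""
    augmented = list(dataset)
    for text, label in dataset:
        tokens = text.split()
        if any(tok in SWAP_PAIRS for tok in tokens):
            swapped = " ".join(SWAP_PAIRS.get(tok, tok) for tok in tokens)
            augmented.append((swapped, label))  # same label, swapped identity term
    return augmented

train = [("this looks so gay", "negative"),
         ("I loved this gay romantic comedy", "positive")]
print(augment(train))
# The negative label now also attaches to "this looks so straight", diluting
# the spurious identity-term/negativity correlation in the training data.
```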
In rule-based systems, minority cases could easily be overlooked or dismissed, owing to the difficulty of implementing rules for every single edge case. Machine learning models relieve us of that problem at the price of a different one: which correlations in the data actually map to rules in the real world? By default, queer data is always in the minority. Anti-queer bias is thus a litmus test, not only for sentiment analysis models, but for machine learning models that deal with human data in general. Is your model good or bad? It is only good if it can handle queer.
You can find the full paper by Ungless et al. here.