Algorithms are used to determine credit scores, automatic translations and which photos show up first in internet searches — and they often favor white people over Black people. Is this also a problem on Twitter?
Are Twitter algorithms inherently racist? To find out, US programmer Tony Arcieri recently launched a unique experiment to see how the platform would crop pictures in preview mode.
Arcieri uploaded large photo collages of former US President Barack Obama and Republican Senate Leader Mitch McConnell, with their faces placed in different spots in the various versions. Twitter’s image preview function automatically crops photos which are too big for the screen, selecting which part to make visible to users. The idea was to force the algorithm to choose one of the men’s faces to feature in the tweet’s image preview.
But for every iteration, Twitter’s algorithm cropped out Obama’s face, instead focusing on McConnell, a white politician. Arcieri tried changing other parts of the image, including the color of the ties the men were wearing, but nothing worked in Obama’s favor. It was only when Arcieri inverted the picture’s colors that Obama was finally featured.
Tweet by Tony “Abolish (Pol)ICE” Arcieri 🦀 (@bascule), September 19, 2020: Trying a horrible experiment… Which will the Twitter algorithm pick: Mitch McConnell or Barack Obama? pic.twitter.com/bR1GRyCkia
Appalled by the results, Arcieri tweeted: “Twitter is just one example of racism manifesting in machine learning algorithms.” Other users also managed to replicate Arcieri’s experiment and confirm his results.
It’s unclear why Twitter’s algorithm prioritized McConnell over Obama. DW conducted a similar experiment with images of German football player Jerome Boateng, who is Black, and Bastian Schweinsteiger, who is white. The experiment was also done with pictures of actors Will Smith and Tom Cruise. In both instances, DW found that Twitter had cropped out the white individual. Other Twitter users who carried out similar experiments with other images got mixed results. It appears the question isn’t so easily answered.
Tweet by Nicolas Kayser-Bril (@nicolaskb), September 20, 2020: I tried to replicate Twitter's racist cropping on (almost) random pictures. Based on 8 pictures, Twitter chose the lighter-skinned one 4 times (see it at https://t.co/sEwSXYPcW4) More data needed I guess. pic.twitter.com/sgwJ2dV4MH
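To see why eight pictures settle little, one can ask how likely such a tally would be if the crop were indifferent to skin tone. What follows is only a minimal sketch in Python, assuming each picture is an independent coin flip with probability 0.5. The function name is illustrative and is not part of any Twitter or AlgorithmWatch tooling.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial test.

    Adds up the probability of every outcome that is no more likely
    than the observed one, assuming each trial succeeds with chance p.
    """
    def pmf(k: int) -> float:
        return comb(trials, k) * p**k * (1 - p) ** (trials - k)

    observed = pmf(successes)
    # Small relative tolerance so ties survive floating-point rounding.
    total = sum(pmf(k) for k in range(trials + 1)
                if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Lighter-skinned face chosen 4 times out of 8: exactly what chance predicts.
print(binomial_two_sided_p(4, 8))  # 1.0
# Hypothetical comparison: a clean sweep of 8 out of 8.
print(binomial_two_sided_p(8, 8))  # 0.0078125
```

A 4-out-of-8 split is fully compatible with an unbiased crop, which fits the "more data needed" caveat. The 8-out-of-8 line is a hypothetical comparison, not a reported result.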
Twitter spokeswoman Liz Kelley has acknowledged the issue and said the social media platform would investigate. She also said Twitter would publish the code behind the platform’s image-cropping feature, which uses artificial intelligence and has been in place since 2017. It’s not entirely clear how the code works; Twitter said it has detected no discrimination in its preliminary tests.
Data selection is key for algorithms
Nicolas Kayser-Bril of AlgorithmWatch, a Berlin-based non-governmental organization, said making Twitter’s preview code public will achieve little. Kayser-Bril, who works as a data journalist, told DW that the algorithm’s results are more complicated than that.
“The algorithm’s behavior is determined by the code, but also by the training data,” he said. “However, how the data is collected and what part of that data is ultimately made available to the algorithm is subject to many social factors.”
The Twitter image-cropping algorithm takes that data and tries to create rules to determine how to best crop an image. But humans determine which data is processed and then used to train the algorithm. Whether data is derived from a police registry, or a social platform like Instagram, will have major consequences. Often, algorithms are trained using a collection of images featuring predominantly white faces.
An algorithm’s intrinsic racial bias may be amplified when used on platforms featuring more pictures of white users. This could increase the odds of non-white people being cropped out of preview pictures. If these were then used to teach another algorithm, this racial bias would be further entrenched.
This amplification effect can be compared with how Amazon handles recommendations: people who buy lots of political thrillers will see many recommendations for similar books, for example. Kayser-Bril argues such algorithms can be dangerous, saying that “machine-learning algorithms reinforce every bias.”
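The amplification loop can be illustrated with a toy model. This is a sketch under strong assumptions, not a description of Twitter's or Amazon's systems: it assumes a cropper whose preference for the majority group follows a power law with a made-up `sharpening` exponent, and that each round's kept faces become the next round's training data. All names and numbers are hypothetical.

```python
from typing import List


def next_share(share: float, sharpening: float) -> float:
    """One retraining round in the toy model.

    `share` is the majority group's fraction of the training faces.
    A sharpening of 1 leaves the share unchanged; values above 1 tilt
    the kept faces further toward the majority.
    """
    majority = share ** sharpening
    minority = (1.0 - share) ** sharpening
    return majority / (majority + minority)


def simulate(initial_share: float, sharpening: float, rounds: int) -> List[float]:
    """Return the majority share before training and after each round."""
    shares = [initial_share]
    for _ in range(rounds):
        shares.append(next_share(shares[-1], sharpening))
    return shares


# Hypothetical starting point: 60% of training faces from the majority group.
for round_no, share in enumerate(simulate(0.6, 1.5, 5)):
    print(f"round {round_no}: majority share = {share:.3f}")
```

In this model any exponent above 1 pushes the majority share upward every round, which is the sense in which a modest initial imbalance can become entrenched.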
There are countless instances in which algorithms have been found to discriminate against non-white people. Kayser-Bril recently studied Google Cloud Vision, a service that identifies and categorizes elements portrayed in images. This product could help to accelerate the speed at which images containing violent or erotic scenes are spotted and removed from social media platforms.
Kayser-Bril used two images to test how the service would classify them: one featuring a white person with a modern thermometer that measures forehead temperature, another with a Black person holding the same device. Google Cloud Vision interpreted the white person as holding a small telescope, yet interpreted the exact same thermometer as a gun in the hand of the Black person.
Tweet by Nicolas Kayser-Bril (@nicolaskb), March 31, 2020: Black person with hand-held thermometer = firearm. Asian person with hand-held thermometer = electronic device. Computer vision is so utterly broken it should probably be started over from scratch. pic.twitter.com/hvqceJVZPv
Kayser-Bril has studied other algorithms to see whether they further racial biases. Google Perspective, designed to help moderators filter out insulting and hate-filled online comments, proved similarly problematic. Kayser-Bril found that the software classified the sentence “I am saying this as a Black woman” as toxic in 75% of cases, compared with only 3% of cases for the sentence “I am saying this as a French person.”
The data journalist also found that one Google program misinterpreted the image of an African-American woman as a gorilla. Further examples abound. Search the term “professional haircuts” using Google, and you’ll likely be presented almost exclusively with pictures of braided hairstyles featuring blonde-haired white people. Kayser-Bril found that Google Translate also showed bias when translating into certain languages, as the tool “systematically changes the gender of translations when they do not fit with stereotypes.”
Tweet by Nicolas Kayser-Bril (@nicolaskb), September 17, 2020: Misogyny and digital colonialism rolled into one – yet another example of the side effects of Google's free but quasi-mandatory services. https://t.co/MJnmPKhqRu
How algorithms influence everyday life
In the recent past, algorithms have caused serious problems in everyday life. Software used by a New Zealand passport agency, for instance, did not accept the passport photos of Asian-looking individuals, wrongly deeming their eyes to be shut.
In the United Kingdom, where students were unable to sit their final exams earlier this year due to the COVID-19 pandemic, an algorithm was used to determine pupils’ final high school grades. Students from schools which had performed poorly in the past were disadvantaged by the algorithm, receiving below-average grades. This sparked a massive outcry, and the decision was reversed.
Algorithms have also proven highly controversial in the context of predictive policing. Such software is used to estimate which areas and streets will see the highest prevalence of crime and thus need extra policing. There is also similar software designed to help judges determine the appropriate jail sentence for criminals.
No solution in sight to fix racial bias
Racially biased algorithms have been discussed within the machine learning and AI community for a while. Some experts argue this bias could be offset by ensuring that the datasets used to train algorithms are more balanced when it comes to minorities.
But this has proven challenging to implement, in part because some ethnicities are underrepresented in online image databases. In addition, there is no uniform definition of fairness according to which this imbalance could be corrected.
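The lack of a single definition of fairness can be made concrete with two widely used criteria that the article does not name: demographic parity (both groups are selected at the same rate) and equal opportunity (both groups have the same true-positive rate). The sketch below uses invented toy data and hypothetical function names to show a classifier that satisfies the first criterion while failing the second.

```python
from typing import List, Tuple

# Each record is (group, true_label, predicted_label); all data is made up.
Record = Tuple[str, int, int]


def selection_rate(records: List[Record], group: str) -> float:
    """Share of the group that the classifier labels positive."""
    preds = [pred for g, _, pred in records if g == group]
    return sum(preds) / len(preds)


def true_positive_rate(records: List[Record], group: str) -> float:
    """Share of the group's truly positive members labeled positive."""
    preds = [pred for g, truth, pred in records if g == group and truth == 1]
    return sum(preds) / len(preds)


records: List[Record] = (
    [("A", 1, 1)] * 4 + [("A", 1, 0)] * 2 + [("A", 0, 0)] * 4  # group A
    + [("B", 1, 1)] * 4 + [("B", 0, 0)] * 6                    # group B
)

for group in ("A", "B"):
    print(group,
          f"selection rate = {selection_rate(records, group):.2f}",
          f"true-positive rate = {true_positive_rate(records, group):.2f}")
```

Both groups are selected at a rate of 0.40, so demographic parity holds. Yet group A's qualified members are recognized only two times in three, while group B's are always recognized. Which gap should be closed is a value judgment, not a purely technical one.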
Another option would be to understand why exactly algorithms make certain decisions. Why, for instance, was the Mitch McConnell picture prioritized over the Obama picture by the Twitter algorithm? Was it because of his skin color or, in fact, for another reason, such as his wide smile? Finding an answer to questions like these, however, is highly complex, as algorithms base their decisions on billions of different factors.
The problem at hand is clear, yet a solution will take some time to develop — and time is of the essence, according to the experts. “We must deal with this now, because these systems are the basis for the technologies of the future,” said Margaret Mitchell of Google Research. Twitter’s chief technology officer Parag Agrawal agrees, tweeting that he welcomes the “public, open, and rigorous test” of his company’s algorithms.