AI systems trained to identify hate speech may themselves end up amplifying racial bias.
As Vox reported, "In one study, researchers found that leading AI models for processing hate speech were one-and-a-half times more likely to flag tweets as offensive or hateful when they were written by African Americans, and 2.2 times more likely to flag tweets written in African American [Vernacular] English (which is commonly spoken by Black people in the US)." Further investigation showed that "when moderators knew more about the person tweeting, they were significantly less likely to label that tweet as potentially offensive. At the aggregate level, racial bias against tweets associated with Black speech decreased by 11 percent." (…) "Another study found similar widespread evidence of racial bias against Black speech in five widely used academic data sets for studying hate speech that totaled around 155,800 Twitter posts."