Data Justice Research Series: Using Sociolinguistics to Address AI Fairness by Zion Mengesha

Register
Date
May 27, 2026
12:00 pm – 2:00 pm PDT
Location
3312 Murphy Hall, DataX Impact Forum

Join us for a Data Justice Research Series talk presented by Dr. Mengesha. Lunch will be provided!

Description: Over the past 60 years, sociolinguists have documented variation in African Americans’ speech. This work has resulted in a large body of literature detailing the complex relations among language, gender, sexuality, race, power, and class. The development of language technologies, such as automated speech recognition (ASR) and large language models, has raised new questions about dialect fairness and accessibility, which sociolinguistics is well positioned to address.

In this talk, Zion Mengesha presents three case studies on the application of sociolinguistics to artificial intelligence. Using the dialect density measure, the first study shows that all five major speech recognizers misunderstood African American speakers up to twice as often as white speakers, revealing how speech technologies reproduce standard language ideologies. The second study examines the psychological and behavioral consequences of dialect discrimination using video data collected over a two-week diary study of African Americans’ interactions with voice technology. The final study shows the linguistic consequences of ASR misrecognition, examining how African Americans modify their prosody and morphosyntax in order to be better understood. She concludes with a discussion of how to apply sociolinguistic insights about African American English (AAE) to artificial intelligence to advance technological justice for speakers of AAE and other minoritized language varieties.