In a significant move toward advancing inclusivity in technology, Howard University and Google Research have unveiled a new dataset designed to enhance how automatic speech recognition (ASR) systems serve Black users. The collaboration, part of Project Elevate Black Voices, involved researchers traveling nationwide to document the unique dialects, accents, and speech patterns commonly found in Black communities, features often misinterpreted or ignored by current AI systems.
The project spotlights African American English (AAE)—also known as African American Vernacular English, Black English, Ebonics, or simply “Black talk”—a culturally rich and historically rooted linguistic form. Due to systemic bias in the development of AI tools, Black users have frequently encountered errors or been misunderstood by voice technologies, sometimes feeling pressured to alter their natural speech just to be recognized by these systems, a classic form of code-switching.
Researchers at Howard University and Google are on a mission to change this.
“African American English has been at the forefront of United States culture since almost the beginning of the country,” shared Gloria Washington, Ph.D., a Howard University researcher and the co-principal investigator of Project Elevate Black Voices, in a press release. “Voice assistant technology should understand different dialects of all African American English to truly serve not just African Americans, but other persons who speak these unique dialects. It’s about time that we provide the best experience for all users of these technologies.”
To build this groundbreaking dataset, researchers gathered 600 hours of speech from participants representing various AAE dialects across 32 states. The goal was to confront hidden barriers that limit the effectiveness of ASR systems for Black users. One of the key findings was that AAE is…