i got so angry after reading this paper on LLMs and African American English that i literally had to stand up and go walk around the block to cool off https://www.nature.com/articles/s41586-024-07856-5 it's a very compelling paper, with a super clever methodology, and (i'm paraphrasing/extrapolating) shows that "alignment" strategies like RLHF only work to ensure that it never seems like a white person is saying something overtly racist, rather than addressing the actual prejudice baked into the model
-
what's especially infuriating is that this outcome is *totally obvious* to anyone who knows the first thing about language, i.e., that even the tiniest atom of language encodes social context, so of course any machine learning model based on language becomes a social category detector (see Rachael Tatman's "What I Won't Build" https://slideslive.com/38929585/what-i-wont-build) & any model put to use in the world becomes a social category *enforcer* (see literally any paper in the history of the study of algorithmic bias)
-
and what's ADDITIONALLY infuriating is some engineer or product team at openai (or whatever) is going to read this paper and think they can "fix" the problem by applying human feedback alignment blalala to this particular situation (or even this particular corpus!), instead of recognizing that there are an infinite number of ways (both overt and subtle) that language can enact prejudice, and the system they've made necessarily amplifies that prejudice
-
every day i wake up in utter disbelief at the fact that people continue to take these products seriously as tools, especially in the realm of education. end rant. FOR NOW
-
I gasped at this: "exhibiting raciolinguistic stereotypes about speakers of African American English (AAE) that are more negative than any human stereotypes about African Americans ever experimentally recorded."
Wow thank you for highlighting this paper.
-
@aparrish wow. Thank you for sharing.
-
@aparrish The bias is in the language itself. You would have to speak a different language in order to avoid the anti-black bias, or sanitise English in a way that doesn't feel like English anymore. Same for other European languages.