Anyone familiar with HR practices probably knows of the decades of studies showing that resumes with Black- and/or female-presenting names at the top get fewer callbacks and interviews than those with white- and/or male-presenting names—even if the rest of the resume is identical. A new study shows those same kinds of biases also show up when large language models are used to evaluate resumes instead of humans.
In a new paper published during last month’s AAAI/ACM Conference on AI, Ethics and Society, two University of Washington researchers ran hundreds of publicly available resumes and job descriptions through three different Massive Text Embedding (MTE) models. These models, based on the Mistral-7B LLM, had each been fine-tuned with slightly different sets of data to improve on the base LLM’s abilities in “representational tasks including document retrieval, classification, and clustering,” according to the researchers, and had achieved “state-of-the art performance” in the MTEB benchmark.
Rather than asking for precise term matches from the job description or evaluating via a prompt (e.g., “does this resume fit the job description?”), the researchers used the MTEs to generate embedded relevance scores for each resume and job description pairing. To measure potential bias, the resumes were first run through the MTEs without any names (to check for reliability) and were then run again with various names that achieved high racial and gender “distinctiveness scores” based on their actual use across groups in the general population. The top 10 percent of resumes that the MTEs judged as most similar for each job description were then analyzed to see if the names for any race or gender groups were chosen at higher or lower rates than expected.
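For readers who want a concrete picture of that screening step, here is a minimal sketch in Python. It is not the researchers' code: it assumes the relevance score is a cosine similarity between a resume embedding and a job-description embedding, and it uses tiny made-up two-dimensional vectors in place of real MTE output. The function names, group labels, and toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def selection_rates(resumes, job_vec, fraction=0.10):
    """Score every resume against one job description, keep the top
    `fraction` by score, and return, per group, a pair of
    (share of the selected set, share of the full pool)."""
    scored = [(r["group"], cosine_similarity(r["vec"], job_vec)) for r in resumes]
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    k = max(1, round(len(ranked) * fraction))  # size of the selected set
    picked = Counter(group for group, _ in ranked[:k])
    totals = Counter(r["group"] for r in resumes)
    return {g: (picked[g] / k, totals[g] / len(resumes)) for g in totals}


# Toy data (hypothetical): 20 resumes split evenly between two name groups.
# Group "A" vectors point exactly along the job vector; group "B" vectors do not.
job_vec = [1.0, 0.0]
resumes = (
    [{"group": "A", "vec": [1.0, 0.0]} for _ in range(10)]
    + [{"group": "B", "vec": [1.0, 1.0]} for _ in range(10)]
)

rates = selection_rates(resumes, job_vec)
for group, (observed, expected) in sorted(rates.items()):
    print(f"{group}: selected share {observed:.2f}, pool share {expected:.2f}")
```

If names had no effect on the scores, each group's share of the selected set would sit close to its share of the pool. A large gap between the two numbers is the kind of signal the study's comparisons were designed to detect, though the paper's own statistical tests are more involved than this sketch.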
A consistent pattern
Across more than three million resume and job description comparisons, some pretty clear biases began to appear. In all three MTE models, white names were preferred in a full 85.1 percent of the conducted tests, compared to Black names being preferred in just 8.6 percent (the remainder showed score differences close enough to zero to be judged insignificant). When it came to gendered names, the male name was preferred in 51.9 percent of tests, compared to 11.1 percent where the female name was preferred. The results were even clearer in “intersectional” comparisons involving both race and gender; Black male names were preferred to white male names in “0% of bias tests,” the researchers wrote.