OpenAI’s Transcription Tool Hallucinates. Hospitals Are Using It Anyway

On Saturday, an Associated Press investigation revealed that OpenAI’s Whisper transcription tool creates fabricated text in medical and business settings despite warnings against such use. The AP interviewed more than 12 software engineers, developers, and researchers who found the model regularly invents text that speakers never said, a phenomenon often called a “confabulation” or “hallucination” in the AI field.

Upon its release in 2022, OpenAI claimed that Whisper approached “human level robustness” in audio transcription accuracy. However, a University of Michigan researcher told the AP that Whisper created false text in 80 percent of public meeting transcripts examined. Another developer, unnamed in the AP report, claimed to have found invented content in almost all of his 26,000 test transcriptions.

The fabrications pose particular risks in health care settings. Despite OpenAI’s warnings against using Whisper for “high-risk domains,” over 30,000 medical workers now use Whisper-based tools to transcribe patient visits, according to the AP report. The Mankato Clinic in Minnesota and Children’s Hospital Los Angeles are among 40 health systems using a Whisper-powered AI copilot service from medical tech company Nabla that is fine-tuned on medical terminology.

Nabla acknowledges that Whisper can confabulate, but it also reportedly erases original audio recordings “for data safety reasons.” This could cause additional issues, since doctors cannot verify accuracy against the source material. Deaf patients may be especially affected by mistaken transcripts, since they would have no way to know whether a transcript matches what was actually said.

The potential problems with Whisper extend beyond health care. Researchers from Cornell University and the University of Virginia studied thousands of audio samples and found Whisper adding nonexistent violent content and racial commentary to neutral speech. They found that 1 percent of samples included “entire hallucinated phrases or sentences which did not exist in any form in the underlying audio” and that 38 percent of those included “explicit harms such as perpetuating violence, making up inaccurate associations, or implying false authority.”

In one case from the study cited by AP, when a speaker described “two other girls and one lady,” Whisper added fictional text specifying that they “were Black.” In another, the audio said, “He, the boy, was going to, I’m not sure exactly, take the umbrella.” Whisper transcribed it to, “He took a big piece of a cross, a teeny, small piece … I’m sure he didn’t have a terror knife so he killed a number of people.”

An OpenAI spokesperson told the AP that the company appreciates the researchers’ findings and that it actively studies how to reduce fabrications and incorporates feedback in updates to the model.

Why Whisper Confabulates

Whisper’s unsuitability for high-risk domains stems from its propensity to sometimes confabulate, that is, to produce plausible-sounding but inaccurate output. The AP report says, “Researchers aren’t certain why Whisper and similar tools hallucinate,” but that isn’t true. We know exactly why Transformer-based AI models like Whisper behave this way.

Whisper is based on technology that is designed to predict the next most likely token (chunk of data) that should appear after a sequence of tokens provided by a user. In the case of ChatGPT, the input tokens come in the form of a text prompt. In the case of Whisper, the input is tokenized audio data.
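
To make the idea of next-token prediction concrete, here is a deliberately tiny sketch. It is not Whisper's code: the vocabulary, the probability table, and the names `PRIOR` and `decode` are all hypothetical, and a real model computes its scores with a neural network conditioned on audio features. The sketch only illustrates the general point that a decoder built to pick the most likely next token will still produce fluent text when the audio gives it little or nothing to go on.

```python
# Toy illustration only. All names and numbers here are hypothetical
# and are not taken from Whisper or any real library.

# Hypothetical language prior: how likely each token is to follow the last one.
PRIOR = {
    "<start>": {"thank": 0.6, "the": 0.4},
    "thank": {"you": 0.9, "the": 0.1},
    "you": {"<end>": 0.8, "the": 0.2},
    "the": {"<end>": 0.7, "you": 0.3},
}


def decode(audio_evidence, max_tokens=5):
    """Greedy decoding: at each step, emit the highest-scoring next token."""
    tokens = ["<start>"]
    for step in range(max_tokens):
        prior = PRIOR[tokens[-1]]
        # Evidence for this step, or nothing at all if the audio has run out.
        evidence = audio_evidence[step] if step < len(audio_evidence) else {}
        # Weight the prior by the evidence; with no evidence the prior decides alone.
        scores = {tok: p * evidence.get(tok, 1.0) for tok, p in prior.items()}
        best = max(scores, key=scores.get)
        if best == "<end>":
            break
        tokens.append(best)
    return tokens[1:]


clear_audio = [{"the": 5.0, "thank": 0.1}, {"<end>": 5.0}]  # strong evidence
silence = []                                                 # no evidence

print(decode(clear_audio))  # ['the']
print(decode(silence))      # ['thank', 'you']
```

With strong evidence, the toy decoder follows the audio. With none, it falls back on whatever its prior considers most likely and emits words that were never spoken, which is the behavior the article calls confabulation.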
