OpenAI discovers that GPT-4o does some very odd things sometimes

August 9, 2024
Harsh Gautam

OpenAI's GPT-4o generative AI model, which powers the recently released alpha of Advanced Voice Mode in ChatGPT, is the company's first to be trained on voice, text, and image data. As a result, it will occasionally behave strangely, such as copying the voice of the person speaking to it or randomly shouting in the middle of a discussion.

In a new "red teaming" report describing explorations of the model's strengths and risks, OpenAI reveals some of GPT-4o's odder idiosyncrasies, such as the aforementioned voice copying. In rare cases—particularly when a person is speaking with GPT-4o in a "high background noise environment," such as a car on the road—GPT-4o will "emulate the user's voice," according to OpenAI. Why? OpenAI chalks it up to the model struggling to interpret distorted speech. Okay, fair enough!

To be clear, GPT-4o is not doing this right now—at least not in Advanced Voice Mode. An OpenAI spokeswoman tells TechCrunch that the company has implemented a "system-level mitigation" for the behavior.

GPT-4o is also prone to producing uncomfortable or inappropriate "nonverbal vocalizations" and sound effects, such as sensual moans, angry screams, and gunshots, when prompted in certain ways. According to OpenAI, there is evidence that the model generally refuses requests to generate sound effects, but some requests do make it through.

GPT-4o may also infringe on music copyright—or would, if OpenAI had not developed filters to prevent it. According to the report, OpenAI directed GPT-4o not to sing for the limited alpha of Advanced Voice Mode, presumably to prevent it from replicating the style, tone, and/or timbre of well-known musicians.

This implies, but does not explicitly confirm, that OpenAI trained GPT-4o on copyrighted material. It is unclear whether OpenAI intends to relax the limits when Advanced Voice Mode is released to additional users in the fall, as originally planned.

"To account for GPT-4o's audio modality, we updated certain text-based filters to work on audio conversations [and] built filters to detect and block outputs containing music," according to the report from OpenAI. "We trained GPT-4o to refuse requests for copyrighted content, including audio, consistent with our broader practices."
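The report describes these mitigations only at a high level, but they amount to a classifier gate on model output: run a detector over what the model produces and block it before delivery if it trips. As a rough illustrative sketch (every name here is invented, and the "music check" is a trivial stand-in; OpenAI's actual filters are not public):

```python
# Hypothetical sketch of an output-filter gate, the general pattern the
# report describes (detect music in audio output, block it if found).
# None of these names or checks come from OpenAI; they are placeholders.

from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioOutput:
    samples: bytes     # raw audio the model generated
    transcript: str    # text the audio corresponds to


def contains_music(audio: AudioOutput) -> bool:
    # Stand-in for a real audio classifier. Here we just flag outputs
    # whose transcript carries a singing marker, purely for demonstration.
    return "♪" in audio.transcript


def filter_output(audio: AudioOutput) -> Optional[AudioOutput]:
    """Return the audio unchanged if it passes the filter, else None (blocked)."""
    if contains_music(audio):
        return None
    return audio


speech = AudioOutput(samples=b"...", transcript="Sure, here's the answer.")
song = AudioOutput(samples=b"...", transcript="♪ Happy birthday to you ♪")

assert filter_output(speech) is speech  # plain speech passes through
assert filter_output(song) is None      # detected singing is blocked
```

In a real deployment the detector would be a trained audio classifier rather than a string check, and the gate would sit between the model and the user, which is presumably what "system-level mitigation" means here.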

Notably, OpenAI has stated that it would be "impossible" to train today's leading models without incorporating copyrighted information. While the company has a variety of licensing agreements with data suppliers, it also believes that fair use is a valid defense against charges that it trains on IP-protected data, such as songs, without permission. 

For what it's worth—and given OpenAI's stake in the race, take it with a grain of salt—the red teaming report paints a picture of an AI model made safer by a range of mitigations and safeguards. GPT-4o, for example, refuses to identify people based on how they speak and declines to answer loaded questions such as "how intelligent is this speaker?" It also blocks prompts for violent and sexually charged language and disallows certain categories of content entirely, such as discussions of extremism and self-harm.