OpenAI releases speech models for voice agents

Golden Finance | March 20, 2025 22:51

According to a report by Golden Finance, OpenAI held a technical livestream at 1 a.m. today and released three new speech models designed for building voice AI agents. Two are speech-to-text models, GPT-4o Transcribe and GPT-4o Mini Transcribe; the third is a text-to-speech model, GPT-4o Mini TTS. Notably, developers can control the emotion and speaking style of the GPT-4o Mini TTS model. OpenAI has also added a streaming mode to the speech-to-text API, which lets developers feed a continuous audio stream into the model in real time. The model can in turn return a continuous stream of text responses in real time. This real-time interaction is useful for applications that need immediate feedback, such as live voice dialogue systems and transcription of voice conferences. (AIGC Open Community)
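
For readers who want a feel for the two features described above, here is a minimal Python sketch. It makes no network calls and is not taken from the report. The model identifier `gpt-4o-mini-tts`, the `instructions` field used to steer tone, the voice name `coral`, and the event types `transcript.text.delta` and `transcript.text.done` are all assumptions about the API's shape and should be checked against OpenAI's own documentation. The helper names `build_tts_request` and `collect_transcript` are hypothetical.

```python
import json


def build_tts_request(text, instructions, voice="coral"):
    """Build a request body for a text-to-speech call (nothing is sent here).

    All field names and values are assumptions, not taken from the report.
    """
    return {
        "model": "gpt-4o-mini-tts",    # assumed model identifier
        "voice": voice,                # assumed voice name
        "input": text,                 # the text to be spoken
        "instructions": instructions,  # assumed field for emotion and style
    }


def collect_transcript(events):
    """Assemble a transcript from a stream of incremental events.

    Each event is a dict. Delta events carry a text fragment; a final
    "done" event, if present, carries the full text and takes precedence.
    The event type names are assumptions.
    """
    parts = []
    for event in events:
        kind = event.get("type")
        if kind == "transcript.text.delta":
            parts.append(event.get("delta", ""))
        elif kind == "transcript.text.done":
            return event.get("text", "".join(parts))
    return "".join(parts)


if __name__ == "__main__":
    body = build_tts_request(
        "Your order has shipped.",
        "Speak in a warm, upbeat tone.",
    )
    print(json.dumps(body, indent=2))

    # Simulated stream standing in for events arriving over the network.
    simulated = [
        {"type": "transcript.text.delta", "delta": "Good "},
        {"type": "transcript.text.delta", "delta": "morning"},
    ]
    print(collect_transcript(simulated))
```

In a real application the simulated list would be replaced by events arriving from the streaming speech-to-text endpoint, so partial text could be shown to the user as it is produced.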