Officially leading the way in meme creation.
Just 24 hours after DeepSeek released the 0324 update to its V3 model, OpenAI showed some competitive spirit of its own, teasing a new product launch in the early hours of March 26, Beijing time.
Rumors had speculated that GPT-5 might arrive this time, but judging by OpenAI's past release patterns, this was never going to be a major update. Even so, the new version of Sora integrated into ChatGPT, announced during the live stream, still delivered an unexpectedly entertaining show.
For now, the Sora integrated into ChatGPT is limited to image generation, unlike the standalone application, but according to OpenAI's introduction during the live stream, the model is a qualitative leap over its predecessors.
According to the presentation, the development team built this version of Sora on the capabilities of GPT-4o, a "multimodal" model (capable of generating any type of data, such as text, images, audio, and video). As a result, users can state what they want directly, or even upload or take a photo to use as a prompt.
For example, during the live demonstration, Sam Altman and two others took a selfie on a mobile phone and asked Sora to generate an "anime-style version."
That was not all: the three of them also demonstrated live how to have Sora add the text "Feel The AGI" to the image, creating the new Sora's first meme on the spot.
This live-generated meme not only rendered the text clearly and accurately but also picked up on conventions of today's popular memes, such as bold lettering, making it ready to share in any group chat.
With OpenAI itself leading the meme-making, many users in the comments section were inspired to feed the same prompt and photo to Grok to generate similarly styled content. The results were clearly not as good as those from the new version of Sora and came across as unintentionally funny instead.
In addition to leading meme creation, OpenAI also demonstrated improved text rendering in the new version of Sora, which significantly raises the success rate of generating coherent, correctly spelled text in images.
In another demonstration scenario, the OpenAI team had Sora generate a comic card for understanding relativity.
Previous image generation models often produced garbled text, or even invented "AI-created characters" that do not exist in any script. The new version of Sora's native image generation shows no obvious confusion in its text, and it even produced natural, fluent Japanese in the comic, which unexpectedly caused quite a stir among Japanese users.
For image generation models, correctly rendering text has been a significant challenge in the past. If there are spelling errors or mistakes in the subtitles or text elements, the entire image may become unusable.
The same demonstration also showed the model drawing correctly on existing world knowledge, such as relativity.
"If I draw an image, I will be limited by my own skills… and by all the world knowledge I have accumulated," said Jackie Shannon, head of ChatGPT's multimodal product, explaining the need for this feature in a media interview.
"The model incorporates world knowledge, so when you request an image of Newton's prism experiment, you don't need to explain what 'Newton's prism experiment' is to get an accurate image."
Beyond the capability improvements shown during the live stream, OpenAI also stated that the new version of Sora is significantly better at "binding," that is, at keeping attributes attached to the correct objects. For example, a model with poor binding might produce a red star and no triangle when asked for a blue star and a red triangle.
According to OpenAI, most existing image models are prone to mistakes here, often confusing colors and shapes once they are asked to render multiple items (usually around 5 to 8). The new version of Sora's image generation can correctly bind the attributes of 15 to 20 objects while still following complex requirements, which greatly increases the success rate.
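To make the idea of attribute binding concrete, here is a minimal Python sketch of how a binding mistake could be described. It is purely illustrative and is not how OpenAI evaluates its model: the function name `binding_errors` and the `detected_objects` list are hypothetical, and the step of actually detecting objects in an image is assumed to have happened elsewhere.

```python
from collections import Counter

def binding_errors(requested, rendered):
    """Compare requested (color, shape) pairs with those found in an image.

    Both arguments are lists of (color, shape) tuples. Returns the pairs
    that were requested but not rendered, and the pairs that were rendered
    but never requested.
    """
    want = Counter(requested)
    got = Counter(rendered)
    return {
        "missing": sorted((want - got).elements()),
        "unexpected": sorted((got - want).elements()),
    }

# The prompt from the article: a blue star and a red triangle.
prompt_objects = [("blue", "star"), ("red", "triangle")]

# Hypothetical result from a model with poor binding: the color ends up
# on the wrong shape and the triangle is dropped entirely.
detected_objects = [("red", "star")]

report = binding_errors(prompt_objects, detected_objects)
print(report)
```

In this sketch, a perfectly bound image yields two empty lists; the more objects a prompt contains, the more ways a color or shape can end up attached to the wrong item.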
One further detail: OpenAI has confirmed that the new version of Sora takes longer to generate images than before, a trade-off the company considers worthwhile.
"While we definitely have room for improvement in terms of latency… we feel that the quality, functionality, and world knowledge of these generated images indeed make up for the extra few seconds users have to wait," Shannon said.
As for safety, image generation has seen repeated problems over the past year, including forged celebrity images, fake images of breaking news events, and Google Gemini being used to strip watermarks from original photos. The OpenAI team emphasized that the new version of Sora can remove photo watermarks, but that it blocks the generation of deepfake images and refuses related requests. Additionally, all generated images will include standard C2PA metadata marking them as created by OpenAI.
The new image generation feature integrated into ChatGPT is currently open to Pro and Plus subscribers, and OpenAI has promised to make it available to free users and API users in the near future.
What I want most right now is to have it make me a meme of my own.