Even before the launch of the highly cost-effective Dou Bao model, many domestic large models, including Tongyi Qianwen, Zhipu AI, and DeepSeek, had already begun competing aggressively on price.
Author: Mu Mu
Large models have also started a price war.
On May 15th, ByteDance's Volcano Engine released the Dou Bao large model. The Dou Bao app is free for C-end consumers, and the model's B-end usage price has been lowered to the lowest level in the industry.
According to Tan Dai, president of Volcano Engine, the main Dou Bao model (≤32K) is priced for the enterprise market at only 0.0008 yuan per thousand tokens: 0.0008 yuan processes more than 1,500 Chinese characters, which is 99.3% cheaper than the industry average.
Even before the launch of the highly cost-effective Dou Bao, many domestic large models such as Tongyi Qianwen, Zhipu AI, and DeepSeek had already begun cutting prices aggressively, and the battle among large models has entered a new stage of collective price reductions. As Tan Dai put it, reducing costs is a key factor in driving large models to quickly enter the "value creation stage."
"Dou Bao" Lowers B-end Usage Price to Industry's New Low
The predecessor of the Dou Bao large model was the Yunque large model, ByteDance's first large model based on the Transformer architecture, released in August 2023. Less than a year later, the Dou Bao large model not only launched as a full suite of models, but also lowered prices for B-end enterprise users.
The main Dou Bao model is priced for the enterprise market at only 0.0008 yuan per thousand tokens, so 0.0008 yuan processes more than 1,500 Chinese characters, 99.3% cheaper than the industry average. At this rate, 1 yuan buys 1.25 million tokens of the main Dou Bao model, equivalent to about 2 million Chinese characters, or roughly three copies of "Romance of the Three Kingdoms." The 128K Dou Bao general model costs only 0.005 yuan per thousand tokens, 95.8% below the industry price.
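The per-token arithmetic quoted above can be checked with a quick calculation. This is a minimal sketch: the price and the 1,500-characters-per-1,000-tokens ratio are the article's figures, and the book-length comparison is an approximation.

```python
# Verify the Doubao pricing arithmetic quoted in the article.
PRICE_PER_1K_TOKENS = 0.0008  # yuan per 1,000 tokens (main Doubao model, <=32K)
CHARS_PER_1K_TOKENS = 1500    # Chinese characters per 1,000 tokens (article's figure)

# Tokens that 1 yuan buys, rounded to avoid floating-point noise.
tokens_per_yuan = round(1000 / PRICE_PER_1K_TOKENS)
# Approximate Chinese characters that 1 yuan covers.
chars_per_yuan = tokens_per_yuan * CHARS_PER_1K_TOKENS // 1000

print(f"1 yuan buys {tokens_per_yuan:,} tokens")        # 1,250,000 tokens
print(f"covering about {chars_per_yuan:,} characters")  # 1,875,000, i.e. roughly 2 million
```

The 1,875,000-character result is consistent with the article's "about 2 million Chinese characters" and "three copies of 'Romance of the Three Kingdoms'" (a novel of roughly 600,000 to 700,000 characters).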
It is worth noting that GPT-4 Turbo charges 0.01 USD (about 0.07 yuan) per 1,000 input tokens and 0.03 USD (about 0.21 yuan) per 1,000 output tokens. By comparison, ByteDance has cut prices so sharply that it has been dubbed the "Pinduoduo" of the AI industry.
Not only "Dou Bao," many domestic large models have also reduced prices.
Not long ago, Baidu released a lightweight version of the Wenxin large model, with the price of the ERNIE Tiny version reduced to 0.001 yuan per thousand tokens, equivalent to 1 yuan for 1 million tokens.
In May of this year, Zhipu AI also sharply reduced the commercial prices of its large models. The call price of its entry-level GLM-3 Turbo model was cut by 80%, from 5 yuan per million tokens to 1 yuan per million tokens, making this entry-level product affordable for more enterprises and individuals.
On May 6th, DeepSeek, an AI company under the well-known domestic private equity giant Huanfang Quantitative, released the all-new second-generation MoE large model DeepSeek-V2, with the DeepSeek-V2 API priced at 1 yuan for every million tokens input and 2 yuan for output (32K context).
On May 9th, Alibaba Cloud officially released Tongyi Qianwen 2.5. According to OpenCompass evaluation results, Tongyi Qianwen 2.5 matched GPT-4 Turbo's score, and individual users can use it for free via the app, official website, and mini program.
On May 14th, Tencent's Hunyuan text-to-image large model was open-sourced directly for commercial use.
Overseas, OpenAI's recently released GPT-4o has also seen a significant price reduction, not only available for free to all users, but also halving the API call price compared to the GPT-4-turbo released in November last year, while doubling the speed. This is the third price reduction for OpenAI's large model products.
The input and output prices of Mistral AI's large model Mistral Large are currently about 20% cheaper than GPT-4 Turbo, attracting widespread attention.
Whether domestic or overseas, large models are collectively reducing prices.
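Pulling the input-side prices quoted above into one place makes the gap easier to see. This is a hedged sketch: the per-million-token figures are the article's, the USD-to-CNY exchange rate of 7.2 is an assumption not stated in the article, and output-token pricing is ignored for simplicity.

```python
# Input-side price per 1 million tokens, in yuan, as quoted in the article.
USD_TO_CNY = 7.2  # assumed exchange rate, not from the article

prices_yuan_per_1m = {
    "Doubao main model (<=32K)": 0.8,       # 0.0008 yuan / 1K tokens
    "ERNIE Tiny": 1.0,                      # 0.001 yuan / 1K tokens
    "GLM-3 Turbo": 1.0,                     # after the 80% cut
    "DeepSeek-V2 (input)": 1.0,             # 32K context, input side
    "GPT-4 Turbo (input)": 10 * USD_TO_CNY, # 0.01 USD / 1K tokens
}

# Print cheapest first, to show how far domestic pricing undercuts GPT-4 Turbo.
for model, price in sorted(prices_yuan_per_1m.items(), key=lambda kv: kv[1]):
    print(f"{model:26s} {price:6.2f} yuan / 1M tokens")
```

Under the assumed exchange rate, GPT-4 Turbo's input price works out to roughly 90 times the main Dou Bao model's.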
Reducing Costs of Large Models Increases Application Efficiency
The "price war" among various manufacturers has begun. However, just half a year ago, it was common knowledge that training large models was very costly. Why, in just half a year, have manufacturers been able to "bring down" prices and start a price war?
Tan Dai, president of Volcano Engine, believes that reducing costs is a key factor in driving large models to quickly enter the "value creation stage." For small and medium-sized enterprise customers, an important consideration for using large models is cost. Tan Dai revealed that ByteDance has many optimization methods at various technical levels such as model structure, training, and production that can achieve price reductions.
OpenAI CEO Sam Altman is proud that people can use ChatGPT without seeing ads, "One of our key missions is to provide AI products to people for free."
Indeed, low prices are helping large model companies seize market opportunities and gain a foothold, and the resulting growth in user volume in turn helps train better models. So, has the training cost of large models really decreased?
When GPT-4 was released last year, Sam Altman revealed that the training cost of OpenAI's largest model "far exceeded 50 million USD." According to the "2024 Artificial Intelligence Index Report" released by Stanford University, the training cost of OpenAI's GPT-4 is estimated to be 78 million USD.
The high cost of training large models drives up usage fees, which in turn deters many enterprise users.
However, researchers are looking for lower-cost training methods. Last year, researchers from the National University of Singapore and Tsinghua University proposed a framework called VPGTrans, which can train high-performance multimodal large models at extremely low cost. Compared to training the visual module from scratch, the VPGTrans framework can reduce the training cost of BLIP-2 FlanT5-XXL from over 19,000 RMB to less than 1,000 RMB.
Among domestic large models, researchers have also found ways to reduce costs and increase efficiency at every stage. DeepSeek-V2 cut costs by improving dataset quality and optimizing its architecture, while Baidu's AI heterogeneous computing platform "Baige" has increased throughput in training and inference scenarios by 30% and 60%, respectively.
In addition to the training process, some basic infrastructure for training large models—chips—are also becoming cheaper, such as the price reduction of Nvidia's A100 AI chip, which has directly reduced the training cost of large models by about 60%.
The most direct impact of the large model price war is the acceleration of application landing. On the Dou Bao platform, more than 8 million AI agents have been created, and the GPT Store hosts more than 3 million apps built on GPT models.
In just half a year, the era of spending money to compete for large model performance seems to be a thing of the past. Nowadays, market users are more concerned about which large model is both affordable and easy to use. This will drive the faster implementation of large model applications in scenarios and business.