According to reporters, the Doubao team is still discussing whether to integrate DeepSeek into the Doubao App.
Source: Zhang Yangyang, Cailian Press AI Daily
Image source: Generated by Wujie AI
Today, ByteDance's Doubao large model team proposed a new sparse model architecture called UltraMem, which tackles the high memory-access cost of MoE inference, improving inference speed by 2-6 times over the MoE architecture and potentially cutting inference costs by up to 83%.
Continuous Cost Reduction and Efficiency Improvement in Large Models
According to research from the Doubao large model team, under the Transformer architecture model performance scales logarithmically with parameter count and compute. As LLMs keep growing, inference costs can therefore rise dramatically and inference speed slows down.
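As a rough, generic illustration of that relationship (not the team's exact formulation), scaling behavior of this kind is often summarized as performance improving roughly linearly in the logarithm of model size,

$$\mathrm{Performance}(N) \approx a + b \log N,$$

where $N$ is the parameter count and $a$, $b$ are fitted constants. Each additional order of magnitude of parameters, and the compute needed to serve them, buys a roughly constant quality gain, which is why serving cost grows far faster than quality.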
Although the MoE (Mixture of Experts) architecture successfully decouples computation from parameters, at inference time even a small batch size ends up activating all the experts, so memory access surges and inference latency rises sharply.
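A minimal sketch of standard top-k expert routing makes the memory-access problem concrete. The numbers and the plain-NumPy routing below are hypothetical illustrations, not ByteDance's implementation:

```python
import numpy as np

# Hypothetical MoE layer configuration (illustrative numbers only).
num_experts = 64      # total experts in the layer
top_k = 2             # experts activated per token
batch_tokens = 32     # tokens processed in one inference step

rng = np.random.default_rng(0)

# Fake router logits; in a real model these come from a learned gating network.
router_logits = rng.normal(size=(batch_tokens, num_experts))

# Top-k routing: each token is sent to its k highest-scoring experts.
chosen = np.argsort(router_logits, axis=-1)[:, -top_k:]

# Per token only top_k / num_experts of the parameters are used for compute,
# but the batch as a whole must load the weights of every expert it hits.
experts_touched = np.unique(chosen)
print(f"Experts touched by a batch of {batch_tokens} tokens: "
      f"{experts_touched.size} of {num_experts}")
```

With 32 tokens each choosing 2 of 64 experts, the batch typically touches most or all experts, so the expert weights read from memory approach the full parameter count of a dense model, even though per-token compute stays sparse. That is the access pattern described above.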
The Doubao large model Foundation team at ByteDance proposed UltraMem as another sparse architecture that decouples computation from parameters, resolving the inference-time memory-access problem while maintaining model performance.
Experimental results show that, with the same parameter count and the same number of activated parameters, UltraMem outperforms MoE in model quality while running inference 2-6 times faster. Moreover, at common batch sizes, UltraMem's memory-access cost is nearly the same as that of a dense model with equivalent compute.
Clearly, large model vendors are pushing to cut costs and improve efficiency on both the training and inference sides. The core reason is that, as model scale grows, inference cost and memory-access efficiency have become the key bottlenecks to deploying large models at scale, and DeepSeek has already shown that a "low cost, high performance" breakthrough is possible.
Liu Fanping, CEO of Yancheng Smart Technology, told the "Science and Technology Innovation Board Daily" that cutting the cost of large models is more likely to come from technical and engineering breakthroughs, with architectural optimization letting latecomers leapfrog the leaders. Foundational infrastructure such as the Transformer architecture remains expensive, so research into new architectures is necessary; foundational algorithms, chiefly backpropagation, may be a bottleneck for deep learning.
In Liu Fanping's view, the high-end chip market will remain dominated by NVIDIA in the short term, but demand from inference applications is growing and domestic GPU companies also have opportunities. In the long term, genuine algorithmic breakthroughs could prove highly significant, and overall demand in the computing-power market remains to be seen.
The Pressure on Doubao Has Just Begun
During the recently concluded Spring Festival, DeepSeek shot to global popularity on the strength of its low training costs and high computational efficiency, becoming the dark horse of the AI field. Competition in the large model space, at home and abroad, has since reached a fever pitch.
DeepSeek is currently Doubao's strongest domestic rival, and on January 28 it surpassed Doubao in daily active users for the first time. DeepSeek's daily active users have since exceeded 40 million, making it the first application in the history of China's mobile internet to enter the top 50 by daily active users within a month of launch.
The Doubao large model team has kept up the pace in recent days. Two days ago it released an experimental video-generation model, "VideoWorld," which, unlike mainstream multimodal models such as Sora, DALL-E, and Midjourney, is the first in the industry to achieve world cognition without relying on a language model.
Currently, Doubao has laid out comprehensively across both the AI foundation layer and the application layer, iterating and upgrading continuously. Its AI product matrix spans multiple fields, including the AI chat assistant Doubao, Cat Box, Dream AI, Star Drawing, and Doubao MarsCode.
On February 12, Doubao concept stocks rallied sharply in the afternoon. According to Wind data, the Douyin Doubao index has risen more than 15% since the start of February. Among individual stocks, Boyan Technology hit a firm limit-up, Han's Information surged quickly to its limit, and Guanghetong, Advanced Digital Communication, and others also spiked intraday.
CITIC Securities said in an earlier research report that the expansion of the Doubao AI ecosystem will set off a new round of technology investment among the major players. The AI industry has strong network and scale effects: once leading AI applications build a user advantage, their edge in model accuracy, marginal cost, and user stickiness will steadily strengthen.
As the number of Doubao users keeps growing, the application ecosystem built on Doubao AI is expected to expand at an accelerating pace. On one hand, this will drive ByteDance's investment in AI training and inference computing infrastructure; on the other, Doubao AI's rapid growth will push other major companies to increase their own AI infrastructure investment.
However, for Doubao itself, the competition with the top performer DeepSeek may have just begun.
DeepSeek's open-source, low-cost, high-performance models are changing how many companies choose their models. Many AI applications from companies such as Huawei and Baidu have already announced DeepSeek integrations, and even ByteDance itself has connected the DeepSeek-R1 model to Feishu's multi-dimensional table feature, with Volcano Engine also providing adaptations.
According to reporters from the "Science and Technology Innovation Board Daily," the Doubao team is still debating whether to integrate DeepSeek into the Doubao App. From a user-experience standpoint, choosing the better-performing model is understandable, but setting aside its own model in favor of a competitor's would be hard to explain to shareholders, and that is before considering the added adaptation burden of integrating a new model.