From Computing Power Competition to Algorithm Innovation: The New Paradigm of AI Led by DeepSeek

Source: ChainCatcher

Author: BadBot, IOBC Capital

Just last night, DeepSeek released its V3 update on Hugging Face, DeepSeek-V3-0324: a 685-billion-parameter model with significantly improved coding ability, UI design, reasoning, and more.

At the recently concluded GTC 2025 conference, Jensen Huang spoke highly of DeepSeek, stressing that the market's earlier reading, namely that DeepSeek's efficient models would reduce demand for NVIDIA chips, was mistaken: future computing demand will only grow, not shrink.

DeepSeek is a star product of algorithmic breakthroughs, so what is its relationship to NVIDIA's computing power supply? To answer that, it is worth first discussing what computing power and algorithms each mean for the industry's development.


The Symbiotic Evolution of Computing Power and Algorithms

In the field of AI, greater computing power provides the foundation for running more complex algorithms, letting models process larger amounts of data and learn more intricate patterns; algorithmic optimization, in turn, makes more efficient use of that computing power and improves resource utilization.

The symbiotic relationship between computing power and algorithms is reshaping the AI industry landscape:

  • Technological Route Differentiation: Companies like OpenAI pursue the construction of ultra-large computing power clusters, while DeepSeek focuses on optimizing algorithm efficiency, forming different technological schools.

  • Industry Chain Restructuring: NVIDIA has become the dominant player in AI computing power through the CUDA ecosystem, while cloud service providers lower deployment thresholds through elastic computing services.

  • Resource Allocation Adjustment: Enterprises seek a balance between hardware infrastructure investment and efficient algorithm development in their R&D focus.

  • Rise of Open Source Communities: Open-source models like DeepSeek and LLaMA allow for the sharing of algorithm innovations and computing power optimization results, accelerating technological iteration and diffusion.

DeepSeek's Technological Innovations

The explosive popularity of DeepSeek is undoubtedly linked to its technological innovations, which I will explain in simple terms for better understanding.

Model Architecture Optimization

DeepSeek combines a Transformer backbone with MoE (Mixture of Experts) layers and introduces a Multi-head Latent Attention (MLA) mechanism. The architecture works like a super team: instead of every token passing through one huge dense network, the MoE layers inside the Transformer contain many smaller "expert" networks, and a router sends each token only to the few experts best suited to it, significantly improving the model's efficiency and accuracy. The MLA mechanism compresses the attention keys and values into a compact latent representation, so the model can flexibly focus on the important details of the context while using far less memory, further enhancing performance.
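To make the routing idea concrete, here is a minimal sketch of an MoE layer with top-k routing in PyTorch. All sizes and class names are made up for illustration; this is not DeepSeek's actual architecture, which uses many fine-grained experts plus shared experts alongside MLA attention.

```python
# Illustrative sketch of a Mixture-of-Experts layer with top-k routing.
# Not DeepSeek's implementation; sizes and names are invented for clarity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is a small feed-forward network (replacing one dense FFN).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                        # x: (num_tokens, d_model)
        scores = self.router(x)                  # (num_tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # mix only the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e         # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(4, 512)                          # 4 tokens
print(SimpleMoELayer()(x).shape)                 # torch.Size([4, 512])
```

The key design point is that each token activates only a small fraction of the total parameters, which is why a very large MoE model can be cheaper to run than a dense model of the same size.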

Training Method Innovation

DeepSeek has proposed an FP8 mixed-precision training framework. The framework acts like an intelligent resource allocator, dynamically choosing the appropriate numerical precision for different parts of training: operations that are sensitive to rounding keep higher precision to protect model accuracy, while the bulk of the computation runs at the low FP8 precision, saving computing resources, increasing training speed, and reducing memory usage.
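The sketch below illustrates the general mixed-precision idea only: quantize activations and weights to an FP8-like format with per-tensor scaling, but accumulate the matrix product at full precision. It numerically simulates FP8 E4M3 in plain PyTorch; it is not DeepSeek's training framework and calls no real FP8 hardware kernels.

```python
# Crude simulation of FP8-style mixed precision: low-precision operands,
# FP32 accumulation. For illustration only.
import torch

def fake_fp8_e4m3(x: torch.Tensor) -> torch.Tensor:
    """Simulate FP8 E4M3: clamp to roughly +/-448 and keep about 3 mantissa bits."""
    x = x.clamp(-448.0, 448.0)
    exp = torch.floor(torch.log2(x.abs().clamp(min=1e-12)))
    step = torch.exp2(exp - 3)                     # spacing of representable values
    return torch.round(x / step) * step

def fp8_style_matmul(a: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Per-tensor scales so the largest magnitude lands near the FP8 maximum.
    a_scale = a.abs().max() / 448.0
    w_scale = w.abs().max() / 448.0
    a_q = fake_fp8_e4m3(a / a_scale)               # "FP8" activations
    w_q = fake_fp8_e4m3(w / w_scale)               # "FP8" weights
    # Accumulate in FP32, then undo the scaling.
    return (a_q.float() @ w_q.float()) * (a_scale * w_scale)

a = torch.randn(16, 64)
w = torch.randn(64, 32)
print((a @ w - fp8_style_matmul(a, w)).abs().max())  # small quantization error
```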

Inference Efficiency Improvement

To improve inference efficiency, DeepSeek introduces Multi-Token Prediction (MTP). Traditional decoding predicts one token per step; with MTP, the model is also trained to predict several future tokens, and those extra predictions can be used to draft multiple tokens at once, significantly speeding up inference while also reducing costs.
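Here is a minimal sketch of the multi-token prediction idea: alongside the usual next-token head, extra heads predict the tokens further ahead, so one forward pass drafts several tokens. DeepSeek-V3's actual MTP module is more elaborate; this illustration uses made-up sizes.

```python
# Toy multi-token prediction heads: head 0 predicts token t+1, head 1 predicts t+2, ...
import torch
import torch.nn as nn

class MultiTokenHead(nn.Module):
    def __init__(self, d_model=512, vocab_size=32000, n_future=2):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(d_model, vocab_size) for _ in range(n_future))

    def forward(self, hidden):                    # hidden: (batch, d_model), last position
        # One forward pass yields logits for the next n_future tokens.
        return [head(hidden) for head in self.heads]

hidden = torch.randn(1, 512)                      # last hidden state from the backbone
drafts = [logits.argmax(-1) for logits in MultiTokenHead()(hidden)]
print(drafts)                                     # two drafted token ids
```

In practice drafted tokens are typically verified against the full model (as in speculative decoding), so speed is gained without changing the output distribution.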

Breakthrough in Reinforcement Learning Algorithms

DeepSeek's reinforcement learning algorithm GRPO (Group Relative Policy Optimization) streamlines the training process. Reinforcement learning acts like a coach for the model, guiding it toward better behavior through rewards and penalties. Traditional algorithms such as PPO require a separate value (critic) model, which consumes a lot of extra computation; GRPO instead samples a group of responses for each prompt and uses the group's average reward as the baseline, enhancing model performance while cutting unnecessary computation and achieving a balance between performance and cost.
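The sketch below shows only the group-relative advantage at the heart of this approach: score a group of sampled responses per prompt and normalize each reward against the group's mean and standard deviation, instead of training a critic. The surrounding clipped policy-gradient update is omitted, and the reward numbers are dummies.

```python
# Group-relative advantages (the baseline comes from the group, not a critic model).
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, group_size) raw scores for each sampled response."""
    mean = rewards.mean(dim=1, keepdim=True)       # per-prompt baseline
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)          # responses better than the group get > 0

rewards = torch.tensor([[0.1, 0.7, 0.4, 0.9],      # 4 sampled answers for prompt A
                        [0.2, 0.2, 0.8, 0.5]])     # 4 sampled answers for prompt B
print(group_relative_advantages(rewards))          # weights for each response's log-probs
```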

These innovations are not isolated technical points but form a complete technical system that reduces computing power requirements across the entire chain from training to inference. Ordinary consumer-grade graphics cards can now run powerful AI models, significantly lowering the barriers to AI application development, allowing more developers and enterprises to participate in AI innovation.


Impact on NVIDIA

Many believe that DeepSeek has bypassed the CUDA layer and thus freed itself from dependence on NVIDIA. In reality, DeepSeek optimizes performance-critical code at NVIDIA's PTX (Parallel Thread Execution) level. PTX is an intermediate representation that sits between high-level CUDA code and the actual GPU instructions; operating at this layer allows DeepSeek to achieve finer-grained performance tuning while remaining firmly inside NVIDIA's toolchain.

The impact on NVIDIA is twofold: on one hand, DeepSeek is actually more deeply bound to NVIDIA's hardware and the CUDA ecosystem, and the lowering of AI application barriers may expand the overall market size; on the other hand, DeepSeek's algorithm optimization may change the market's demand structure for high-end chips, with some AI models that previously required GPUs like the H100 now potentially running efficiently on A100 or even consumer-grade graphics cards.

Significance for China's AI Industry

DeepSeek's algorithm optimization provides a technological breakthrough path for China's AI industry. In the context of constraints on high-end chips, the idea of "software compensating for hardware" alleviates dependence on top imported chips.

Upstream, efficient algorithms reduce the pressure on computing power demands, allowing computing service providers to extend hardware usage cycles through software optimization and improve return on investment. Downstream, optimized open-source models lower the barriers to AI application development. Many small and medium-sized enterprises can develop competitive applications based on the DeepSeek model without needing large amounts of computing resources, leading to the emergence of more AI solutions in vertical fields.

Profound Impact on Web3 + AI

Decentralized AI Infrastructure

DeepSeek's algorithm optimization provides new momentum for Web3 AI infrastructure. The innovative architecture, efficient algorithms, and lower computing power requirements make decentralized AI inference possible. The MoE architecture is naturally suitable for distributed deployment, where different nodes can hold different expert networks without requiring a single node to store the complete model, significantly reducing storage and computing requirements for a single node, thus enhancing model flexibility and efficiency.
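The following is a purely conceptual sketch of why MoE suits distributed deployment: each node hosts only a few experts, and a router dispatches a token to whichever node holds the selected expert. The node names, routing rule, and "experts" are all invented for illustration; this is not a real decentralized inference protocol.

```python
# Toy expert sharding across two hypothetical nodes.
import torch
import torch.nn as nn

NUM_EXPERTS, D = 8, 64
router = nn.Linear(D, NUM_EXPERTS)

# Each "node" stores only half of the expert networks, never the full model.
nodes = {
    "node_a": {e: nn.Linear(D, D) for e in range(0, 4)},
    "node_b": {e: nn.Linear(D, D) for e in range(4, 8)},
}

def dispatch(token: torch.Tensor) -> torch.Tensor:
    expert_id = int(router(token).argmax())        # top-1 routing for simplicity
    for name, experts in nodes.items():
        if expert_id in experts:                   # in practice: an RPC to that node
            return experts[expert_id](token)
    raise RuntimeError("no node hosts this expert")

print(dispatch(torch.randn(D)).shape)              # torch.Size([64])
```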

The FP8 training framework further reduces the demand for high-end computing resources, allowing more computing resources to join the node network. This not only lowers the barriers to participating in decentralized AI computing but also enhances the overall computing capacity and efficiency of the network.

Multi-Agent System

  • Intelligent Trading Strategy Optimization: By coordinating agents that analyze real-time market data, predict short-term price movements, execute on-chain trades, and supervise the results, users can pursue higher returns (a toy pipeline along these lines is sketched after this list).

  • Automation of Smart Contract Execution: Agents monitoring smart contracts, executing smart contracts, and supervising execution results work together to automate more complex business logic.

  • Personalized Portfolio Management: AI helps users in real-time to find the best staking or liquidity provision opportunities based on their risk preferences, investment goals, and financial situations.
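To illustrate how such specialized agents could hand work to one another, here is a toy analyze-predict-execute-supervise pipeline. Every class, method, and data field is hypothetical; no real exchange, chain, or model API is being called.

```python
# Hypothetical multi-agent trading pipeline, for illustration only.
from dataclasses import dataclass

@dataclass
class MarketSnapshot:
    price: float
    volume: float

class AnalystAgent:
    def analyze(self, snap: MarketSnapshot) -> dict:
        return {"trend": "up" if snap.volume > 1_000 else "flat", "price": snap.price}

class StrategyAgent:
    def decide(self, signal: dict) -> dict:
        side = "buy" if signal["trend"] == "up" else "hold"
        return {"side": side, "limit": signal["price"] * 1.01}

class ExecutionAgent:
    def execute(self, order: dict) -> dict:
        # A real agent would sign and submit an on-chain transaction here.
        return {"status": "filled" if order["side"] == "buy" else "skipped", **order}

class SupervisorAgent:
    def review(self, result: dict) -> bool:
        return result["status"] in {"filled", "skipped"}

snap = MarketSnapshot(price=100.0, volume=1_500)
result = ExecutionAgent().execute(StrategyAgent().decide(AnalystAgent().analyze(snap)))
print(SupervisorAgent().review(result), result)
```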

"We can only see a short future, but it is enough to realize that there is much work to be done." DeepSeek is seeking breakthroughs under computing constraints through algorithm innovation, paving a differentiated development path for China's AI industry. Lowering application barriers, promoting the integration of Web3 and AI, reducing dependence on high-end chips, and empowering financial innovation—these impacts are reshaping the digital economy landscape. The future development of AI is no longer just a competition of computing power, but a competition of collaborative optimization between computing power and algorithms. In this new race, innovators like DeepSeek are redefining the rules of the game with Chinese wisdom.

Disclaimer: This article represents only the author's personal views and does not reflect the position or views of this platform. It is shared for informational purposes only and does not constitute investment advice to anyone. Any dispute between users and the author is unrelated to this platform. If any article or image on this page infringes your rights, please send proof of ownership and identity to support@aicoin.com, and the platform's staff will verify it.
