This report analyzes the technical details, root cause, and potential attack methods of a core DoS vulnerability in the TON virtual machine, and presents the fix proposed by the TonBit team.
Recently, the virtual machine system of the TON network underwent a significant security upgrade. TonBit, the security team under BitsLab, discovered and helped fix a core vulnerability that could lead to resource exhaustion in the TON virtual machine. The vulnerability lies in the recursive mechanism the virtual machine uses when handling nested Continuations; malicious contracts could abuse it to crash the system and destabilize the network.
If exploited, the vulnerability could crash every validator node without the attacker spending a single TON, directly threatening the network's availability. TonBit located the vulnerability quickly and proposed a solution that replaces recursion with iteration by adjusting the virtual machine's internal control flow mechanism, making the ecosystem more secure for TON users. The official TON team specifically thanked TonBit for its contributions to ecosystem security in its latest update announcement.
The detailed security report below covers the cause, technical details, and fix for this vulnerability. It describes how deep nesting of Continuations can be used to construct a recursive chain that triggers a resource exhaustion attack, and how a malicious contract can exhaust the host's stack space by extending the call stack. It also explains how the TonBit team removed the recursive chain design flaw and replaced it with a cooperative iterative mechanism that resolves the issue. The fix significantly improves the stability of the TON network and offers a useful reference for low-level security across the blockchain industry.
Case Study: DoS Vulnerability in TON VM and Related Mitigation Measures
Introduction
This report describes a DoS (Denial of Service) vulnerability in the TON virtual machine and the mitigation that addresses it. The vulnerability arises from the way the virtual machine handles Continuation nesting during contract execution. A malicious contract can create Continuations and nest them deeply in a specific manner, which triggers deep recursion during evaluation, exhausts the host's stack space, and stops the virtual machine. To mitigate the issue, the virtual machine's handling of Continuations and control flow was modified. Instead of performing sequential tail calls through a chain of Continuations, the virtual machine now actively iterates through the chain. This approach uses only a constant amount of host stack space and so prevents stack overflow.
Overview
According to official documentation, the TON VM is a stack-based virtual machine that uses Continuation-Passing Style (CPS) as its control flow mechanism for internal processes and smart contracts. The control flow register is accessible to contracts, providing flexibility.
Continuations in TVM can theoretically be divided into three categories:
OrdCont (i.e., vmc_std), which contains the TON ASM fragments to be executed and is a first-class object in TVM. Contracts can explicitly create and pass them at runtime to achieve arbitrary control flow.
Extraordinary Continuations, which typically contain OrdCont as components and are created through explicit iterative primitives and special implicit operations to handle corresponding control flow mechanisms.
ArgContExt, an additional category that wraps other Continuations to preserve control data.
During contract execution, the virtual machine enters a main loop, decoding one word of the contract fragment at a time and dispatching the corresponding operation to the appropriate handler. Ordinary handlers return immediately after executing the corresponding operation.
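The main loop described above can be pictured with a toy interpreter. This is only a minimal sketch: the tuple-based instruction format, the `HANDLERS` table, and the `run` function are hypothetical simplifications for illustration, not the TVM's actual decoding or handler interface.

```python
from typing import Callable, Dict, List, Optional, Tuple


def op_pushint(stack: List[int], value: int) -> None:
    # Ordinary handler: push a constant and return immediately.
    stack.append(value)


def op_add(stack: List[int]) -> None:
    # Ordinary handler: pop two operands, push their sum, return immediately.
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


# Hypothetical dispatch table mapping an opcode name to its handler.
HANDLERS: Dict[str, Callable[..., None]] = {
    "PUSHINT": op_pushint,
    "ADD": op_add,
}


def run(code: List[Tuple], stack: Optional[List[int]] = None) -> List[int]:
    """Decode one instruction at a time and dispatch it to its handler."""
    stack = [] if stack is None else stack
    pc = 0
    while pc < len(code):
        opcode, *operands = code[pc]
        HANDLERS[opcode](stack, *operands)
        pc += 1
    return stack


result = run([("PUSHINT", 2), ("PUSHINT", 3), ("ADD",)])
```

Each handler here finishes its work and hands control straight back to the loop, which is the behavior the text attributes to ordinary handlers.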
In contrast, iterative instructions create an extraordinary Continuation that uses the provided Continuations as components, and then jump to that extraordinary Continuation in the appropriate context. The extraordinary Continuation implements its logic during the jump and conditionally jumps to one of its components. Figure 1 illustrates this process for the WHILE instruction (with possible exits omitted).
Figure 1: Extraordinary Continuation Logic
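The two-phase logic of a WHILE-style extraordinary Continuation can be approximated as follows. This is a simplified model, assuming that components are plain Python callables and that the return continuation lives in a `c0` field; the `State` class, the `AFTER` sentinel, and the `run_while` driver are hypothetical, and details of the real implementation are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

AFTER = object()  # hypothetical sentinel standing in for the continuation after the loop


@dataclass
class State:
    stack: List[bool] = field(default_factory=list)
    counter: int = 0
    c0: Optional["WhileCont"] = None  # models the return continuation register


Component = Callable[[State], None]


@dataclass(frozen=True)
class WhileCont:
    cond: Component
    body: Component
    chkcond: bool  # True: the condition result is on the stack and must be checked

    def jump(self, st: State):
        """Decide which component runs next and where it returns to."""
        if self.chkcond:
            if not st.stack.pop():
                return AFTER  # condition was false: leave the loop
            # Run the body, then come back to evaluate the condition again.
            st.c0 = WhileCont(self.cond, self.body, chkcond=False)
            return self.body
        # Run the condition, then come back to check its result.
        st.c0 = WhileCont(self.cond, self.body, chkcond=True)
        return self.cond


def run_while(cond: Component, body: Component, st: State) -> State:
    # Illustrative driver only: run a component, then continue at c0.
    cont = WhileCont(cond, body, chkcond=False)
    while True:
        target = cont.jump(st)
        if target is AFTER:
            return st
        target(st)
        cont = st.c0


def cond(st: State) -> None:
    st.stack.append(st.counter < 3)


def body(st: State) -> None:
    st.counter += 1


final = run_while(cond, body, State())
```

The loop alternates between a "run the condition" state and a "check the result" state, and each jump installs the opposite state as the place its component returns to.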
Root Cause
In the vulnerable version of the virtual machine, these jumps lead to consecutive dynamic tail calls, requiring the host stack to maintain a stack frame for each jump (as shown in Figure 2).
WhileCont is taken as the example; other parts are omitted for brevity.
Figure 2: Triple Jump Recursion for Deep Nesting
Ideally, this would not pose a problem, because components are typically OrdCont instances. A jump to an OrdCont only saves the current context and then instructs the virtual machine to execute the fragment it holds before the remaining contract fragments, without introducing further recursion. However, extraordinary Continuations are designed to let their components access the c0 register in TVM, which holds the return continuation (i.e., the set_c0 branch mentioned above). Contracts can therefore abuse this feature to perform deep recursion (described later). Eliminating the recursion directly in the jump of extraordinary Continuations is clearer and easier than changing the implementation of this conventional functionality.
By repeatedly using the obtained extraordinary Continuation to construct a higher-level extraordinary Continuation, a deeply nested Continuation can be created through iteration. These deeply nested Continuations may exhaust the available stack space of the host during evaluation, causing the operating system to issue a SIGSEGV signal and terminate the virtual machine process.
Figure 3 provides a proof of concept (PoC) for the nesting process.
Figure 3: Nesting Process
We see that in each iteration, the body expands a WhileCont{chkcond=true}. By executing the cc generated and saved in the previous iteration, a call stack similar to this is obtained:
It can be seen that the stack space has a linear dependency on the nesting level (i.e., the number of iterations), indicating that it may lead to stack space exhaustion.
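The linear relationship can be reproduced with a toy model. In this hedged sketch, `Wrapper` is a hypothetical stand-in for one level of nested extraordinary Continuation, Python call frames stand in for host stack frames, and Python's recursion limit plays the role of the host stack size; none of these names come from the TVM source.

```python
import sys


class Leaf:
    """Innermost continuation: records how deep the call chain got."""

    def jump(self, depth: int, stats: dict) -> int:
        stats["max_depth"] = max(stats["max_depth"], depth)
        return 0


class Wrapper:
    """One nesting level: its jump calls the inner jump and waits for it."""

    def __init__(self, inner):
        self.inner = inner

    def jump(self, depth: int, stats: dict) -> int:
        return self.inner.jump(depth + 1, stats)


def nest(levels: int):
    cont = Leaf()
    for _ in range(levels):
        cont = Wrapper(cont)
    return cont


def max_depth(levels: int) -> int:
    stats = {"max_depth": 0}
    nest(levels).jump(1, stats)
    return stats["max_depth"]


def overflows(levels: int) -> bool:
    """True if recursive evaluation exceeds the interpreter's frame budget."""
    try:
        max_depth(levels)
        return False
    except RecursionError:
        return True


shallow = max_depth(10)
deep_fails = overflows(sys.getrecursionlimit() + 100)
```

In this model the recorded depth is always the nesting level plus one, and nesting beyond the frame budget fails. Python raises a catchable `RecursionError`, whereas a native process that overruns its stack is killed by the operating system.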
Exploitation in Real Environments
On a live blockchain, gas limits make it quite difficult to construct a malicious contract. Because the nesting process has linear complexity (the design of TVM effectively prevents cheaper constructions through self-reference), building a practical malicious contract is not easy. Specifically, one layer of nesting generates a call sequence that consumes three host stack frames (320 bytes) in the debug binary and two (256 bytes, with the last two calls inlined into one) in the release binary. For validator nodes running on modern POSIX operating systems, the default stack size is typically 8 MiB, which is enough for over 30,000 layers of nesting in the release binary. Although it is still possible to construct a contract that exhausts the stack space, doing so is much harder than the examples in the previous section suggest.
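The nesting budget follows from dividing the stack size by the per-layer cost quoted above. The short calculation below is a sketch that uses only those figures; it assumes the whole 8 MiB is available to the nested jumps, which overstates the real budget because the process already occupies part of its stack.

```python
STACK_BYTES = 8 * 1024 * 1024    # default stack size cited in the text (8 MiB)
RELEASE_BYTES_PER_LAYER = 256    # two frames per layer in the release binary
DEBUG_BYTES_PER_LAYER = 320      # three frames per layer in the debug binary


def max_layers(stack_bytes: int, bytes_per_layer: int) -> int:
    """Upper bound on nesting layers before the stack is exhausted."""
    return stack_bytes // bytes_per_layer


release_layers = max_layers(STACK_BYTES, RELEASE_BYTES_PER_LAYER)  # 32768
debug_layers = max_layers(STACK_BYTES, DEBUG_BYTES_PER_LAYER)      # 26214
```

The release figure of 32,768 layers matches the "over 30,000" estimate, and an attacker would need a contract that builds at least that many nesting levels within the gas limit.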
Mitigation Measures
The patch modifies the behavior of jumps in the case of Continuation nesting. We can see that the signature of Continuation jumps has changed.
UntilCont is taken as the example; other parts are omitted for brevity.
VmState::jump is no longer called to jump to the next Continuation, so it is no longer necessary to execute the triple jump recursively on each Continuation and wait for return values to propagate back. A Continuation jump now only resolves the next level of Continuation and returns control to the virtual machine.
The virtual machine cooperatively resolves each level of Continuation in a loop until it encounters a NullRef, which indicates that the chain has been fully resolved (as implemented in OrdCont or ExcQuitCont). During this iteration, only one Continuation jump occupies the host stack at any time, so stack usage remains constant.
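The iterative resolution can be sketched as a trampoline. This is a minimal model under stated assumptions: `None` stands in for the NullRef, `Wrapper` and `Quit` are hypothetical stand-ins for nested and terminal Continuations, and `resolve_chain` is an illustrative driver rather than the patched virtual machine's actual code.

```python
from typing import Optional


class Quit:
    """Terminal continuation: nothing further to resolve."""

    def jump(self, state: dict) -> Optional[object]:
        state["exit_code"] = 0
        return None  # models the NullRef that ends the chain


class Wrapper:
    """One nesting level: returns the next continuation instead of calling it."""

    def __init__(self, inner):
        self.inner = inner

    def jump(self, state: dict) -> Optional[object]:
        state["jumps"] += 1
        return self.inner


def resolve_chain(cont, state: dict) -> dict:
    # Only one jump call is active at a time, so stack use stays constant
    # regardless of how deeply the continuations are nested.
    while cont is not None:
        cont = cont.jump(state)
    return state


def nest(levels: int):
    cont = Quit()
    for _ in range(levels):
        cont = Wrapper(cont)
    return cont


outcome = resolve_chain(nest(100_000), {"jumps": 0, "exit_code": None})
```

A chain of 100,000 levels is far deeper than Python's default recursion limit would allow for a recursive evaluation, yet the loop resolves it without growing the call stack.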
Conclusion
For services requiring high availability, the use of recursion may become a potential attack vector. Enforcing termination of recursion can be challenging when user-defined logic is involved. This DoS vulnerability demonstrates an extreme case of normal functionality being inadvertently abused under resource-constrained (or other limiting) conditions. If recursion depends on user input, similar issues may arise, which is quite common in the control flow primitives of the virtual machine.
By changing the virtual machine's recursive jump mechanism to iterative processing, TonBit proposed a fix and helped repair a core vulnerability that could have paralyzed the network, giving the TON ecosystem a more robust security guarantee. The incident reflects TonBit's depth of experience in low-level blockchain security and its role as the official Security Assurance Provider (SAP) for TON.
As an indispensable security partner in the TON ecosystem, TonBit has always been at the forefront of protecting the stability of blockchain networks and the security of user assets. From vulnerability discovery to solution design, TonBit has laid a solid foundation for the long-term development of the TON network with its strong technical capabilities and profound understanding of blockchain development. At the same time, the TonBit team continues to make efforts in areas such as network security architecture, user data protection, and enhancing the security of blockchain application scenarios. In the future, TonBit will continue to drive security technology advancements through innovation, providing continuous support and guarantees for the healthy development of the TON ecosystem and the entire blockchain industry. The discovery of this vulnerability and the assistance in its repair have received high recognition from the official TON team, further consolidating TonBit's industry position in blockchain security and demonstrating its firm commitment to promoting the development of decentralized ecosystems.
TonBit Official Website: https://www.tonbit.xyz/
TonBit Official Twitter: https://x.com/tonbit_
Telegram: https://t.me/BitsLabHQ
LinkedIn: https://www.linkedin.com/company/tonbit-team/
Blog: https://www.tonbit.xyz/#blogs
For audit requests on Telegram, contact: @starchou