Current blockchain implementations are too expensive and too slow to serve as a general-purpose Web3 computing platform. Successful systems are using real-time off-chain data to find market fit.
Author: Pieter Humphrey, DataStax
Translation: Shanooba, Golden Finance
Currently, Web3 is in a tricky situation, and not only because high-profile bad actors have cast a long shadow over the blockchain ecosystem. Three major problems are difficult to overcome without abandoning the principles that initially attracted people to blockchain:
- The cost of on-chain storage and write operations is prohibitively high compared to similar products in Web 2.0.
- To ensure the security promised by blockchain-based systems, on-chain storage and write operations are slow by design. As more nodes join the network and write traffic increases, performance deteriorates further, because a majority of nodes (more than 50%) must reach consensus on the validity of newly written data.
- The length (size) of any given blockchain ledger grows significantly with usage, exceeding the limits of most database infrastructure on the market today.
Operational databases, analytical databases, and distributed ledgers are all valid but distinct types of database management systems. The emergence of various peer-to-peer blockchain networks causes confusion because they are not just "databases"; many of them are also "servers" that can host (serve) Internet applications (or "dApps") written by any capable developer.
Most new technologies go through a phase of overgeneralization until the right product or market fit is found. All three challenges share the same root: the "right tool, wrong job" problem. For example, most IT professionals would not use an operational database as an analytical database, or vice versa. Using a distributed ledger as an operational or analytical database (e.g., in a dApp deployed to a blockchain network) is a particularly bad match, as explained further below.
Of course, the blockchain community is researching innovative ways to address performance issues without compromising security, but this takes time. Ethereum has recently made some changes in this regard. It can be said that trust must be placed somewhere. Blockchain shifts this trust out of the traditional Web 2.0 model, but it does not fundamentally eliminate this requirement—at least not today.
Off-chain real-time data provides a direct path for Web3 to find product/market fit. However, this approach places some trust in Web 2.0 systems, which hold the dApp's operational and analytical data. Nevertheless, the most successful dApps and blockchain-based services on the market have made this trade-off, using the right tool for the right job to leverage each technology's strengths.
Before delving deeper into how and why Web3 can make progress using real-time data, let's first consider the future prospects of Web3, notwithstanding the three major challenges just identified.
What will continue to drive Web3 forward?
At such a moment, it is important to remember that blockchain ≠ cryptocurrency. Cryptocurrency is one application of blockchain's concepts and foundational technology building blocks. The same applies to NFTs and the broader Web3 concept. The core concept of blockchain, an immutable public record of transactions, positions, and who owns what, remains an interesting contrast to the current financial system, where such ledgers reside in private databases and can be accessed only by institutions and regulators under institutional and legal rules. There is real-world value and significance for specific use cases. What are they?
According to McKinsey, the largest Web3 lending platforms issued $200 billion in loans in 2021. Loans, deposits, remittances, asset swaps, trade finance, and insurance have become viable use cases. Although peer-to-peer, gaming, social, and online media applications are at an early stage, they have shown significant activity.
Digital identity services, as well as supply chain and logistics management, remain clear possibilities. Speculative metaverse use cases are attracting real investment dollars, with companies like Facebook pivoting, rebranding as Meta, and going all-in.
Private blockchain systems on closed, protected networks (e.g., Hyperledger Fabric) may not be what blockchain's creators envisioned, but they can now serve more general use cases for specific industries and institutions (at the cost of no longer being a Web3 system open to the public). NFTs (non-fungible tokens), the concept of unique, indivisible, tamper-proof tokens, have real potential business value as digital representations of both real-world assets and ephemeral, online-only assets.
These are all safe, publicly discussed speculations about what is possible, but none of them is a solved problem yet. How to link the real world to digital NFTs, legally and in some cases physically, is still being explored extensively. Web3 provider Alchemy reported a 143% increase in smart contract deployments this quarter compared with the same quarter in 2021.
Although, as with any new idea, significant issues remain, the combination of investment capital, developer attention, and institutional interest is compelling and supplies the energy that drives blockchain forward. As the core technologies mature, more Web3 value will be created. As more value is generated, new opportunities will arise, sparking interest in addressing regulation, law, data privacy, and better developer and end-user experiences.
Considerations for on-chain data for Web3 developers
The challenges faced by proof-of-work-based blockchain products extend to their underlying architecture. Operational databases are well-suited for fast, efficient data storage and retrieval. Analytical databases are well-suited for fast, open-ended queries and exploration. Non-relational databases provide different levels of operational or analytical functionality at scale without sacrificing performance and availability.
Blockchain-based systems provide a secure, immutable ledger, but at the cost of performance. Attempting to use a secure, append-only immutable ledger as an operational, analytical, or non-relational database will lead to the following issues:
Unacceptable performance
The Web 2.0 technology stack has set the expectation of fast, responsive digital experiences for most people in the world, whether on a tablet, phone, or desktop/laptop; users do not expect to wait anywhere from two minutes to six hours. Most popular blockchain implementations rely on slow proof-of-work algorithms to protect write access to blockchain data storage, and on slow peer-to-peer consensus to keep data consistent across nodes.
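To make the cost of a proof-of-work write concrete, here is a minimal toy sketch. It is not any real network's mining algorithm: the `mine` and `verify` functions, the string-based block format, and the "leading zero hex digits" difficulty rule are all simplifying assumptions chosen for illustration.

```python
import hashlib
import time


def mine(block_data: str, difficulty: int) -> tuple[int, str]:
    """Return the first nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a proposed nonce takes a single hash, unlike finding one."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # Each extra zero digit multiplies the expected number of hashes by 16.
    for difficulty in (1, 2, 3, 4):
        start = time.perf_counter()
        nonce, digest = mine("example block", difficulty)
        elapsed = time.perf_counter() - start
        print(f"difficulty={difficulty} hashes={nonce + 1} time={elapsed:.4f}s")
```

The asymmetry is the point: a write has to search for a valid nonce, while every other node can check the result with one hash. Raising the difficulty makes writes slower on purpose, which is the trade-off the paragraph above describes.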
Data volume leading to production interruptions
Blockchain is not just a "big data" problem; it's a "huge" problem. The data set is massive, and it only gets larger with increased usage. Few operational or analytical databases can operate at this level, and even fewer can truly scale linearly at it, which greatly narrows the choices.
Contradictory and inaccurate data
The peer-to-peer, eventually consistent, proof-of-work design of blockchain makes it secure but can produce inconsistent data, making it unsuitable as an operational or analytical database for Web3 applications. Because these inconsistencies do not surface as error messages or fault codes, writing error-handling code to detect, interpret, or resolve them is time-consuming at best and impossible at worst. Debugging them in production, or at any other critical moment, is a nightmare for everyone involved. Downstream, technical support cannot give answers to frustrated users, and developers cannot give answers to technical support. Negative app store reviews follow.
Unacceptable storage/utilization costs
The cost of on-chain operations is high: storing 1 GB of data on the Ethereum blockchain costs thousands of dollars at the very least, and the figure rises with gas and ETH prices.
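A back-of-the-envelope estimate shows how such a figure can be derived. This is a rough sketch under stated assumptions: it counts only the roughly 20,000 gas that Ethereum charges to write a new 32-byte contract storage slot, ignores all other transaction overhead, and uses a gas price of 20 gwei and an ETH price of $2,000 purely as placeholders, not as current market data.

```python
GAS_PER_NEW_SLOT = 20_000  # approximate gas to write one new 32-byte storage slot
SLOT_BYTES = 32
GWEI_PER_ETH = 1_000_000_000


def storage_cost_usd(num_bytes: int, gas_price_gwei: float, eth_price_usd: float) -> float:
    """Estimate the USD cost of writing `num_bytes` into contract storage."""
    slots = -(-num_bytes // SLOT_BYTES)  # ceiling division: partial slots cost a full slot
    gas = slots * GAS_PER_NEW_SLOT
    eth = gas * gas_price_gwei / GWEI_PER_ETH
    return eth * eth_price_usd


if __name__ == "__main__":
    # Placeholder prices; substitute real values to get a current estimate.
    for label, size in (("1 KiB", 1024), ("1 MiB", 1024**2), ("1 GiB", 1024**3)):
        print(f"{label}: ${storage_cost_usd(size, 20, 2000):,.2f}")
```

With these placeholder inputs a single kilobyte already costs about $25.60, and the cost grows linearly with size, so the article's figure for a gigabyte is, if anything, conservative.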
Other Considerations
Indexing or synchronizing blockchain data off-chain is not straightforward because this data is not human-readable. Blockchain data needs to be decoded, enriched, restructured, and data-modeled through third-party data services before it can be easily used by developers.
Solution: Real-time off-chain data synchronization
Popular blockchain network implementations will need time to address the performance issues in their designs. In the meantime, off-chain processing is the primary technique successful IT professionals use to combine existing database technologies with the advantages of blockchain, using each technology for the purpose it was designed for. In simple terms, dApps should read data from off-chain databases and write back to the chain only the minimum level of detail required to record the final result of a transaction.
By synchronizing the real-time state of the blockchain to an operational or analytical database, you can ensure the data your dApp depends on is accurate and current, so the dApp runs quickly. Then, once your dApp and off-chain database have done as much preprocessing as possible, submit the final result back to the chain.
Static and binary assets can use systems like IPFS, but for the same reasons it is wise to use off-chain object storage (e.g., S3) wherever possible. In practice, therefore, an off-chain database holding a continuously synchronized clone of the chain state should be the read/write target for as many operational and analytical workloads as possible.
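The synchronization idea can be sketched in a few lines. Everything here is a hypothetical stand-in: `FakeChain` plays the role of a node endpoint, `OffChainStore` plays the role of an operational database table, and the `confirmations` parameter is an assumed safety margin so that only blocks buried a few levels deep are copied.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    number: int
    transactions: list[dict]


@dataclass
class FakeChain:
    """Stand-in for a blockchain node; a real dApp would query a node instead."""
    blocks: list[Block] = field(default_factory=list)

    def head(self) -> int:
        return len(self.blocks) - 1

    def get_block(self, number: int) -> Block:
        return self.blocks[number]


@dataclass
class OffChainStore:
    """Stand-in for an off-chain database table keyed by block number."""
    blocks: dict[int, Block] = field(default_factory=dict)
    last_synced: int = -1

    def sync(self, chain: FakeChain, confirmations: int = 2) -> int:
        """Copy blocks at least `confirmations` deep; return how many were added."""
        target = chain.head() - confirmations
        added = 0
        for number in range(self.last_synced + 1, target + 1):
            self.blocks[number] = chain.get_block(number)
            self.last_synced = number
            added += 1
        return added

    def transactions_for(self, address: str) -> list[dict]:
        """Read path: answered entirely from the off-chain copy."""
        return [tx for block in self.blocks.values() for tx in block.transactions
                if address in (tx["from"], tx["to"])]


if __name__ == "__main__":
    chain = FakeChain([Block(i, [{"from": "a", "to": "b", "value": i}]) for i in range(5)])
    store = OffChainStore()
    print("blocks synced:", store.sync(chain))
    print("transactions for a:", len(store.transactions_for("a")))
```

In a real deployment `sync` would run continuously, and the dApp's queries would go to the store rather than to the chain. Only the final result of a user's activity would be written back on-chain.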
But as discussed earlier, the massive volume of data (especially over time) will break most data infrastructures. Apache Cassandra is one of the most capable operational database systems at this level of capacity, scale, and performance.
With the right data model, applications can get the sub-second speeds expected of an in-memory cache like Redis together with the persistence of a database management system (DBMS). What if a non-relational data service could provide both historical data and always up-to-date (real-time) off-chain data?
During the indexing process, the raw data is automatically decoded. This changes the developer experience from working with blockchain data in its raw hexadecimal form to working with human-readable data.
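As an illustration of what such decoding involves, the sketch below turns a raw ERC-20 `Transfer` event log into a readable record using only the standard library. The topic hash is the well-known keccak-256 of `Transfer(address,address,uint256)`. The function name `decode_erc20_transfer`, the sample addresses, and the amount are made up for this example, and it does not represent how any particular indexing service works.

```python
# Well-known keccak-256 hash of the signature "Transfer(address,address,uint256)".
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_erc20_transfer(log: dict, decimals: int = 18) -> dict:
    """Turn a raw ERC-20 Transfer log into a human-readable record."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        raise ValueError("not an ERC-20 Transfer log")
    return {
        "event": "Transfer",
        # Indexed addresses are left-padded to 32 bytes; the last 40 hex digits are the address.
        "from": "0x" + topics[1][-40:],
        "to": "0x" + topics[2][-40:],
        # The amount is a 32-byte big-endian unsigned integer; a float is used for display only.
        "amount": int(log["data"], 16) / 10**decimals,
    }


# Made-up example log; the addresses and amount are illustrative only.
raw_log = {
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "0" * 24 + "11" * 20,
        "0x" + "0" * 24 + "22" * 20,
    ],
    "data": "0x" + format(1_500_000_000_000_000_000, "064x"),
}

if __name__ == "__main__":
    print("raw data:", raw_log["data"])
    print("decoded: ", decode_erc20_transfer(raw_log))
```

The raw `data` field is a 64-digit hexadecimal string, while the decoded record reads as a transfer of 1.5 tokens between two addresses. That difference is what automatic decoding spares developers from handling themselves.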
Then, Web3 developers typically need to reorganize and enrich chain data using third-party data services such as Etherscan, Whatsabi, NFT metadata sources, etc., before it is useful for even the simplest queries. If the enriched data is subsequently modeled into queryable database tables, developers have the full power of a standard DBMS query language (without having to learn blockchain analytics APIs).
Let's look at an example:
Developer Intent: Search for five entries from block group 134
Actual query code:
System response:
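The query and response themselves are not reproduced here. As a hypothetical stand-in, the sketch below shows the general idea of fetching five entries from block group 134 with an ordinary query language. It uses Python's built-in `sqlite3` module in place of the article's data service. The `blocks` table, its columns, the sample rows, and the rule that a block group spans 100,000 block numbers are all assumptions made for illustration.

```python
import sqlite3

GROUP_SIZE = 100_000  # assumed: each block group covers 100,000 block numbers


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table of made-up block rows, partitioned by block group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE blocks (
               block_group  INTEGER,
               block_number INTEGER,
               miner        TEXT,
               tx_count     INTEGER,
               PRIMARY KEY (block_group, block_number)
           )"""
    )
    rows = [(n // GROUP_SIZE, n, f"0xminer{n % 3}", n % 200)
            for n in range(13_399_995, 13_400_010)]
    conn.executemany("INSERT INTO blocks VALUES (?, ?, ?, ?)", rows)
    return conn


def five_entries(conn: sqlite3.Connection, block_group: int) -> list[tuple]:
    """Fetch the first five blocks of one block group with a plain SQL query."""
    return conn.execute(
        "SELECT block_number, miner, tx_count FROM blocks "
        "WHERE block_group = ? ORDER BY block_number LIMIT 5",
        (block_group,),
    ).fetchall()


if __name__ == "__main__":
    for row in five_entries(build_demo_db(), 134):
        print(row)
```

The developer intent maps to a single `WHERE ... LIMIT 5` clause, with no blockchain-specific API involved.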
So, what does this look like in practice? To bring it to life, take a look at these two live example applications that use such an off-chain data service. The application source code should look familiar to Web3 developers; it is written using the popular Web3.js library.
NFT Explorer
- Search for every NFT created within seconds
- Extract the transfer history of an NFT in a single API call
NFT Explorer is built with React and Next.js and gives users a complete view of NFTs as they are minted or transferred on the Ethereum blockchain in real time.
Blockchain Explorer
- Pull historical gas prices by block number
- Pull ERC20 transfer amounts by block number
Like NFT Explorer, this blockchain data explorer pulls all of its blockchain data from an off-chain store and gives users a real-time view of the latest mined blocks and the latest Ethereum transactions.
Offering all of this as a managed cloud service helps overcome the traditional reflex to reach for a relational DBMS for the sake of usability and time to market. Building such a service on top of Cassandra uniquely provides the ability to colocate this data with your Web3 application in any region, or across multiple regions, without sharding. Cassandra's built-in replication has been battle-tested in the most extreme internet-scale production environments for over a decade.
Advantages for Web3 Applications and Developers
By minimizing dApp size, on-chain data storage, and blockchain writes, and by moving everything else off-chain, operational costs for most use cases fall back to Web 2.0 levels. Performance on users' chosen devices returns to acceptable, expected levels. dApp developers can then design appropriate "wait time" dialogs, screens, and alerts to set user expectations when a write is submitted to a blockchain-based system.
The biggest, thorniest data consistency issues are resolved, because most of the dApp's operational data lives in a fast, reliable off-chain database. This not only saves hours of frustrating (and possibly fruitless) debugging time but also avoids production errors that may be unsolvable.
Because off-chain systems such as non-relational databases can handle large volumes of data, your dApp will keep meeting its expected uptime and response times as the blockchain grows, without an expensive system redesign or complete rewrite months into production. According to the latest Stack Overflow Developer Survey, working with Cassandra, arguably the most reliable, scalable, and fastest non-relational database, is also associated with some of the highest-paying jobs.
Benefits for Enterprises
Broken, slow, or inaccurate applications can cause irreparable losses of users, revenue, and investor confidence. But let's have the conversation we all want to have: what exciting things might real-time synchronization of blockchain state to off-chain, non-relational infrastructure make possible?
- Analytics for dApps: Integrating dApps with off-chain analytical databases opens up the full range of "Web 2.0" options and use cases.
- Fraud detection and prevention: Build dApps that can expel bad actors or flag and block abuse to protect your user community and your business.
- An authoritative source for digital asset exchanges: NFT exchanges need accurate, up-to-date market data to facilitate optimal trading, selling, and exchanging. This prevents the buyer's remorse users feel when they see an item they just bought listed at a lower price a few minutes later, along with resource-intensive refund processes and negative user reviews.
- Location-based functionality: Understanding current location is the foundation of many mobile applications today. Bring it to your dApp!
- Internet of Things applications: Only non-relational databases can handle the speed and volume of machine-generated data from software or hardware without compromise.
- Data sovereignty: For compliance, regulatory, or legal reasons, place a synchronized replica of the blockchain state alongside your dApp, wherever in the world it is deployed.
Blockchain transaction confirmation time is determined by the protocol and cannot be shortened without paying higher gas/transaction fees or using accelerator services. By preprocessing as much as possible off-chain, you can minimize the size and frequency of the final transactions you submit. This reduces the cost of on-chain writes for any use case and improves dApp speed.
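One simple form of such preprocessing is netting: many small transfers are recorded off-chain, and only the net balance changes are settled on-chain. The sketch below is illustrative only. The `net_transfers` function and the account names are hypothetical, and a real system would also need signatures, dispute handling, and an actual settlement transaction.

```python
from collections import defaultdict


def net_transfers(transfers: list[tuple[str, str, int]]) -> dict[str, int]:
    """Collapse (sender, recipient, amount) transfers into one net change per account."""
    net: dict[str, int] = defaultdict(int)
    for sender, recipient, amount in transfers:
        net[sender] -= amount
        net[recipient] += amount
    # Accounts whose transfers cancel out need no on-chain update at all.
    return {account: delta for account, delta in net.items() if delta != 0}


if __name__ == "__main__":
    off_chain_activity = [
        ("alice", "bob", 5),
        ("bob", "alice", 3),
        ("alice", "carol", 2),
        ("carol", "bob", 2),
    ]
    # Four off-chain transfers reduce to two net balance changes.
    print(net_transfers(off_chain_activity))
```

Here four transfers reduce to a single net movement of 4 units from alice to bob, so one on-chain write replaces four.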
Try it as a service
This focus on real-time data goes beyond blockchain. It is an area in which the industry has been innovating for over a decade. But technologies like blockchain help demonstrate the importance of real-time data as part of data architectures and business models.
While we wait for quantum cryptography as a service, the proliferation of atomic clocks, and new innovations in distributed consensus algorithms, real-time data is available today at Web 2.0 cost structures. Real-time data will remain a core, fundamental element of any future blockchain implementation.