Deconstructing the AI Framework: From Intelligent Agents to Decentralized Exploration

Simplifying the agent construction process and providing complex, composable features will still be an advantage in the future, giving rise to a Web3 creator economy more interesting than the GPT Store.

Author: YBB Capital Researcher Zeke


Introduction

In previous articles, we explored the current state of AI memes and the future development of AI agents. However, the rapid narrative turnover and dramatic evolution of the AI agent track can be somewhat overwhelming. In the two months since "Truth Terminal" kicked off Agent Summer, the narratives combining AI and crypto have shifted almost every week. Recently, market attention has swung back to "framework" projects driven by technological narratives, and several dark horses have emerged in this niche track, each surpassing a market cap of a hundred million or even a billion dollars. These projects have also given rise to a new asset issuance paradigm: projects issue tokens based on GitHub repositories, and agents built on these frameworks can in turn issue their own tokens. With frameworks as the foundation and agents on top, the structure resembles an asset issuance platform, but in fact a unique infrastructure model exclusive to the AI era is emerging. How should we view this new trend? This article starts with an introduction to frameworks and, combined with my own reflections, interprets what AI frameworks mean for crypto.

1. What is a Framework?

By definition, an AI framework is a foundational development tool or platform that integrates pre-built modules, libraries, and tools to simplify the process of building complex AI models. These frameworks typically also include functionality for data processing, model training, and inference. In simple terms, you can think of a framework as the operating system of the AI era, analogous to Windows or Linux on the desktop, or iOS and Android on mobile devices. Each framework has its own strengths and weaknesses, and developers can choose freely based on their specific needs.

Although the term "AI framework" is still a nascent concept in the crypto field, such frameworks have nearly 14 years of development history, dating back to the inception of Theano in 2010. In the traditional AI community, both academia and industry have mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate, each with its own strengths in different scenarios.

The framework projects emerging in crypto are built in response to the massive demand for agents sparked by this wave of AI enthusiasm, and they have since branched out into other crypto tracks, ultimately forming AI frameworks in various niche fields. Let's take a look at a few mainstream frameworks currently in the space to elaborate on this point.

1.1 Eliza


First, take ai16z's Eliza as an example: a multi-agent simulation framework designed specifically for creating, deploying, and managing autonomous AI agents. It is developed in TypeScript, which gives it good compatibility and makes API integration easier.

According to the official documentation, Eliza primarily targets social media scenarios through multi-platform integration. The framework provides a fully functional Discord integration with voice channel support, automated accounts for X/Twitter, Telegram integration, and direct API access. For media content processing, it supports reading and analyzing PDF documents, extracting and summarizing linked content, audio transcription, video content processing, image analysis and description, and conversation summarization.

The current use cases supported by Eliza mainly fall into four categories:

  1. AI assistant applications: customer support agents, community managers, personal assistants;
  2. Social media roles: automated content creators, interactive bots, brand representatives;
  3. Knowledge workers: research assistants, content analysts, document processors;
  4. Interactive roles: role-playing characters, educational tutors, entertainment bots.

The models currently supported by Eliza include the following (a minimal configuration sketch follows the list):

  1. Local inference with open-source models, such as Llama 3, Qwen1.5, and BERT;
  2. Cloud inference via OpenAI's API;
  3. A default configuration of Nous Hermes Llama 3.1;
  4. Integration with Claude for complex queries.
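
To make the above more concrete, here is a minimal, hypothetical sketch of what configuring an Eliza-style agent could look like in TypeScript. The `CharacterConfig` type and its field names are illustrative assumptions based on the description above, not the framework's official schema.

```typescript
// Hypothetical sketch of an Eliza-style character definition (illustrative only;
// field names approximate the ideas in the docs, not the exact official schema).
interface CharacterConfig {
  name: string;                                    // display name of the agent
  bio: string[];                                   // background lines that shape the persona
  clients: ("discord" | "twitter" | "telegram")[]; // platforms the agent runs on
  modelProvider: "openai" | "anthropic" | "local"; // where inference happens
  settings?: { voice?: boolean };                  // optional capabilities, e.g. voice channels
}

// A community-manager style agent, one of the four use-case categories above.
const communityManager: CharacterConfig = {
  name: "HelperBot",
  bio: ["Answers questions about the project", "Summarizes long discussions"],
  clients: ["discord", "telegram"],
  modelProvider: "local",                          // e.g. a locally hosted Llama 3 model
  settings: { voice: true },
};

// In a real deployment the framework would load this configuration and start the
// corresponding platform clients; here we only log it to keep the sketch self-contained.
console.log(`Starting agent ${communityManager.name} on ${communityManager.clients.join(", ")}`);
```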

1.2 G.A.M.E

G.A.M.E (Generative Autonomous Multimodal Entities) is a multimodal AI framework launched by Virtual for the automated generation and management of agents, primarily targeting the design of intelligent NPCs in games. A distinctive feature of this framework is that it can be used by people with little or no coding background: judging from its trial interface, users only need to adjust parameters to take part in agent design.


In terms of architecture, G.A.M.E is built around a modular design in which multiple subsystems work together, as detailed in the diagram below.

(Figure: G.A.M.E modular architecture diagram)

  1. Agent Prompting Interface: The interface through which developers interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, agent ID, and user ID;
  2. Perception Subsystem: The perception subsystem is responsible for receiving input information, synthesizing it, and sending it to the strategic planning engine. It also handles responses from the dialogue processing module;
  3. Strategic Planning Engine: The strategic planning engine is the core part of the entire framework, divided into a high-level planner and low-level policy. The high-level planner is responsible for setting long-term goals and plans, while the low-level policy translates these plans into specific action steps;
  4. World Context: The world context contains data such as environmental information, world state, and game state, which helps the agent understand the current situation;
  5. Dialogue Processing Module: The dialogue processing module is responsible for handling messages and responses, generating dialogue or reactions as output;
  6. On Chain Wallet Operator: The on-chain wallet operator may involve applications of blockchain technology, with specific functions not clearly defined;
  7. Learning Module: The learning module learns from feedback and updates the agent's knowledge base;
  8. Working Memory: Working memory stores the agent's recent actions, results, and current plans, among other short-term information;
  9. Long Term Memory Processor: The long-term memory processor extracts important information from the agent's actions and working memory and ranks it by importance, recency, and relevance;
  10. Agent Repository: The agent repository stores attributes such as the agent's goals, reflections, experiences, and personality;
  11. Action Planner: The action planner generates specific action plans based on the low-level policy;
  12. Plan Executor: The plan executor is responsible for executing the action plans generated by the action planner.

Workflow: Developers initialize the agent through the agent prompting interface; the perception subsystem receives input and passes it to the strategic planning engine, which draws on information from the memory system, world context, and agent repository to formulate and execute action plans. The learning module continuously monitors the results of the agent's actions and adjusts the agent's behavior accordingly.
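
As a rough illustration of this workflow, below is a minimal, hypothetical sketch of a single perceive-plan-act-learn tick in TypeScript. All type and function names are assumptions made for illustration and do not correspond to Virtual's actual API.

```typescript
// Conceptual sketch of the G.A.M.E-style loop described above. All names are
// hypothetical illustrations of the subsystems, not Virtual's real interfaces.
type Observation = { text: string; worldState: Record<string, unknown> };
type Plan = { goal: string; steps: string[] };
type ActionResult = { step: string; success: boolean };

interface AgentState {
  workingMemory: ActionResult[]; // recent actions and results (short-term)
  longTermFacts: string[];       // distilled, ranked memories (long-term)
}

// High-level planner: sets a goal and plan from the current observation.
function planHighLevel(obs: Observation): Plan {
  return { goal: `respond to: ${obs.text}`, steps: ["analyze input", "reply"] };
}

// Low-level policy / action planner + executor: turns plan steps into executed actions.
function executePlan(plan: Plan): ActionResult[] {
  return plan.steps.map((step) => ({ step, success: true }));
}

// Learning module: folds results back into memory so later plans can improve.
function learn(state: AgentState, results: ActionResult[]): AgentState {
  const workingMemory = [...state.workingMemory, ...results].slice(-20);
  const longTermFacts = results.some((r) => !r.success)
    ? [...state.longTermFacts, "a recent goal failed and needs review"]
    : state.longTermFacts;
  return { workingMemory, longTermFacts };
}

// One tick of the loop: perceive -> plan -> act -> learn.
let state: AgentState = { workingMemory: [], longTermFacts: [] };
const observation: Observation = { text: "player greets the NPC", worldState: {} };
const plan = planHighLevel(observation);
state = learn(state, executePlan(plan));
console.log(state);
```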

Application Scenarios: Judging from the overall technical architecture, this framework focuses primarily on the decision-making, feedback, perception, and personality of agents in virtual environments. Beyond gaming, it is also applicable to the Metaverse, and a large number of projects have already built on this framework, as shown in the ecosystem list published by Virtual.

1.3 Rig


Rig is an open-source tool written in Rust, designed to simplify the development of applications using large language models (LLMs). It provides a unified operating interface, allowing developers to easily interact with multiple LLM service providers (such as OpenAI and Anthropic) and various vector databases (like MongoDB and Neo4j).

Core Features:

Unified Interface: Regardless of which LLM provider or vector storage is used, Rig offers a consistent access method, greatly reducing the complexity of integration work;

Modular Architecture: The framework employs a modular design, including key components such as the "Provider Abstraction Layer," "Vector Storage Interface," and "Intelligent Agent System," ensuring system flexibility and scalability;

Type Safety: Utilizing Rust's features, it achieves type-safe embedding operations, ensuring code quality and runtime safety.

Efficient Performance: Supports asynchronous programming models, optimizing concurrent processing capabilities; built-in logging and monitoring features assist in maintenance and troubleshooting.

Workflow: When a user request enters the Rig system, it first passes through the "Provider Abstraction Layer," which standardizes the differences between providers and ensures consistent error handling. In the core layer, the intelligent agent can then call various tools or query vector storage to obtain the required information. Finally, through mechanisms such as Retrieval-Augmented Generation (RAG), the system combines document retrieval with contextual understanding to generate precise, meaningful responses, which are returned to the user.
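
Rig itself is a Rust crate, but to keep all sketches in this article in one language, here is a conceptual TypeScript sketch of the flow just described: a provider abstraction, a vector-store interface, and a simple RAG step. The interfaces and names are assumptions for illustration only and do not reflect Rig's actual API.

```typescript
// Conceptual sketch of the described request flow (hypothetical interfaces, not Rig's API).
interface CompletionProvider {            // stands in for the "Provider Abstraction Layer"
  complete(prompt: string): Promise<string>;
}

interface VectorStore {                   // stands in for the "Vector Storage Interface"
  search(query: string, topK: number): Promise<string[]>;
}

// Retrieval-Augmented Generation: fetch relevant documents, then let the model
// answer with that context prepended to the user's question.
async function answerWithRag(
  provider: CompletionProvider,
  store: VectorStore,
  question: string,
): Promise<string> {
  const docs = await store.search(question, 3);
  const prompt = `Context:\n${docs.join("\n")}\n\nQuestion: ${question}`;
  return provider.complete(prompt);
}

// Toy in-memory implementations so the sketch runs without any external service.
const provider: CompletionProvider = {
  complete: async (p) => `Answer based on ${p.length} characters of prompt`,
};
const store: VectorStore = {
  search: async () => ["doc about frameworks", "doc about agents", "doc about RAG"],
};

answerWithRag(provider, store, "What does the provider abstraction layer do?")
  .then(console.log);
```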

Application Scenarios: Rig is not only suitable for building question-answering systems that require quick and accurate responses but can also be used to create efficient document search tools, context-aware chatbots or virtual assistants, and even support content creation by automatically generating text or other forms of content based on existing data patterns.

1.4 ZerePy


ZerePy is an open-source framework based on Python, designed to simplify the deployment and management of AI agents on the X (formerly Twitter) platform. It is derived from the Zerebro project, inheriting its core functionalities but designed in a more modular and extensible way. Its goal is to enable developers to easily create personalized AI agents and automate various tasks and content creation on X.

ZerePy provides a command-line interface (CLI) that allows users to manage and control their deployed AI agents. Its core architecture is based on a modular design, allowing developers to flexibly integrate different functional modules, such as:

● LLM Integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, allowing developers to choose the model that best fits their application scenario. This enables agents to generate high-quality text content;

● X Platform Integration: The framework directly integrates with the X platform's API, allowing agents to perform actions such as posting, replying, liking, and retweeting;

● Modular Connection System: This system allows developers to easily add support for other social platforms or services, expanding the framework's functionality;

● Memory System (Future Planning): Although the current version may not fully implement this, ZerePy's design goals include integrating a memory system that enables agents to remember previous interactions and contextual information, thereby generating more coherent and personalized content.

While both ZerePy and ai16z's Eliza aim to build and manage AI agents, they differ slightly in architecture and goals. Eliza focuses more on multi-agent simulation and broader AI research, while ZerePy is dedicated to simplifying the deployment of AI agents on a specific social platform (X), leaning more toward practical application.
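
ZerePy itself is written in Python, but sticking with one language for the sketches in this article, the following hypothetical TypeScript snippet illustrates the modular connection idea described above: an agent that combines an LLM backend with pluggable platform connections. All names are illustrative assumptions, not ZerePy's real API.

```typescript
// Conceptual sketch of a ZerePy-style modular connection system (hypothetical names only).
interface Connection {                      // a pluggable platform/service module
  name: string;
  post(content: string): Promise<void>;
}

interface LanguageModel {                   // an LLM backend such as OpenAI or Anthropic
  generate(prompt: string): Promise<string>;
}

// The agent ties a model to one or more connections and runs simple tasks.
class SocialAgent {
  constructor(private model: LanguageModel, private connections: Connection[]) {}

  async runTask(task: string): Promise<void> {
    const content = await this.model.generate(task);
    // Fan the generated content out to every registered connection (e.g. the X platform).
    await Promise.all(this.connections.map((c) => c.post(content)));
  }
}

// Toy stand-ins so the sketch is self-contained and runnable.
const fakeModel: LanguageModel = { generate: async (p) => `Post about: ${p}` };
const xConnection: Connection = {
  name: "x",
  post: async (content) => console.log(`[x] ${content}`),
};

new SocialAgent(fakeModel, [xConnection]).runTask("daily market recap");
```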

2. A Replica of the BTC Ecosystem

In terms of development path, AI agents share many similarities with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's path can be roughly summarized as: BRC20 → multi-protocol competition among Atomicals, Runes, and others → BTC L2 → BTCFi centered on Babylon. AI agents, building on the mature traditional AI technology stack, have developed even faster, but their overall trajectory is indeed similar, which I summarize as: GOAT/ACT → competition among social and analytical agents → competition among AI agent frameworks. From a trend perspective, infrastructure projects focused on the decentralization and security of agents are likely to inherit this wave of framework enthusiasm and become the main theme of the next stage.

Will this track, like the BTC ecosystem, slide into homogenization and bubble formation? I believe not. First, the AI agent narrative is not aimed at recreating the history of smart contract chains. Second, regardless of whether existing AI framework projects are genuinely capable, stuck at the slideware (PPT) stage, or merely copying and pasting, they at least offer a new approach to infrastructure development. Many articles compare AI frameworks to asset issuance platforms, with agents likened to assets. In my view, compared with memecoin launchpads and inscription protocols, AI frameworks are more like future public chains, while agents are more like future DApps.

In today's crypto landscape, we have thousands of public chains and tens of thousands of DApps. Among general-purpose chains we have BTC, Ethereum, and various heterogeneous chains, while application chains take more diverse forms, such as gaming chains, storage chains, and DEX chains. Public chains map onto AI frameworks, and the two are quite similar, while DApps map well onto agents.

Crypto in the AI era is very likely to evolve toward this form, and future debates will shift from EVM versus heterogeneous chains to disputes between frameworks. The more pressing question now is how to decentralize, or "chainify," these frameworks. I believe subsequent AI infrastructure projects will develop on this basis, and the other question is what significance it holds to do this on a blockchain.

3. The Significance of On-Chain

Whatever blockchain is combined with, it ultimately faces one question: is the combination meaningful? In last year's article, I criticized GameFi's misplaced priorities and the premature build-out of infrastructure, and in previous articles on AI I also expressed skepticism about the current practical applications of AI x Crypto combinations. After all, the narrative pull of traditional projects has been weakening, and the few traditional projects that performed well last year generally needed fundamentals strong enough to match or exceed their token prices. What can AI do for crypto? Ideas I raised before, such as agents acting on behalf of users, the Metaverse, and agents as employees, are relatively mundane but in-demand concepts. Yet these needs do not require being fully on-chain, and from a business-logic perspective they cannot form a closed loop. The agent browser mentioned in the last issue can indeed generate demand for data labeling and inference compute, but the link between the two is still not tight enough, and inference compute remains dominated by centralized providers.


Reconsidering the success of DeFi: the reason DeFi could carve out a niche from traditional finance is its higher accessibility, better efficiency, lower costs, and trustless security that removes reliance on centralized intermediaries. Thinking along these lines, I believe there are several reasons that may support putting agents on-chain.

  1. Can putting agents on-chain lower usage costs and thereby deliver higher accessibility and more choice? Ultimately, this would allow ordinary users to share in the AI "rental rights" that today belong exclusively to Web2 giants;

  2. Security: by the simplest definition of an agent, an AI that deserves the name should be able to interact with the virtual or real world. If an agent can intervene in reality or in my wallet, then a blockchain-based security solution becomes a necessity;

  3. Can agents enable financial mechanics unique to blockchain? For example, like LPs in AMMs, ordinary people could participate in automated market making; for agents that require compute or data labeling, users who are bullish could invest in the protocol with stablecoins ("U"); alternatively, agents built for different application scenarios could give rise to new financial mechanics;

  4. DeFi currently lacks perfect interoperability. If blockchain-based agents can achieve transparent and traceable reasoning, they may be more attractive than the agent browsers provided by traditional internet giants mentioned in the previous article.

4. Creativity?

Framework projects will also offer entrepreneurial opportunities similar to the GPT Store in the future. Although releasing an agent through a framework is still quite complex for ordinary users today, I believe that simplifying the agent construction process while offering complex, composable features will remain an advantage, giving rise to a Web3 creator economy more interesting than the GPT Store.

The current GPT Store still leans toward practical utility in traditional fields, most popular apps are created by traditional Web2 companies, and revenue is monopolized by a handful of creators. According to OpenAI's official explanation, this strategy only provides funding support, in the form of a certain amount of subsidies, to a select few outstanding developers in the U.S.

From a demand perspective, Web3 still has many gaps that need to be filled, and in terms of the economic system, it can make the unfair policies of Web2 giants more equitable. In addition, we can naturally introduce community economics to make agents more complete. The agent creator economy will be an opportunity for ordinary people to participate, and future AI memes will be far more intelligent and interesting than the agents issued on GOAT or Clanker.

References:

  1. History and Trends of AI Frameworks

  2. Bybit: AI Rig Complex (ARC): AI Agent Framework

  3. Deep Value Memetics: Horizontal Comparison of Four Major Crypto × AI Frameworks: Adoption Status, Advantages and Disadvantages, Growth Potential

  4. Official Eliza Documentation

  5. Official Virtual Documentation
