How Can Crypto Tokenization Drive Advances in Agent Technology and Ignite Community Vitality?
Compiled & Organized by: Deep Tide TechFlow
Guests:
Shaw, Partner at ai16z;
Karan, Co-founder of Nous Research;
Ethan, Co-founder of MyShell;
Justin Bennington, CEO of Somewheresy and CENTS;
EtherMage, Top Contributor at Virtuals;
Tom Shaughnessy, Founding Partner at Delphi Ventures
Podcast Source: Delphi Digital
Original Title: Crypto x AI Agents: The Definitive Podcast with Ai16z, Virtuals, MyShell, NOUS, and CENTS
Broadcast Date: November 23, 2024
Background Information
Join Shaw (Ai16z), Karan (Nous Research), Ethan (MyShell), Somewheresy (CENTS), EtherMage (Virtuals), and Tom Shaughnessy from Delphi for a special roundtable discussion. This event brings together top figures in the fields of crypto and AI agents to explore the evolution of autonomous digital life forms and the future direction of human-AI interaction.
Discussion Highlights:
▸ The rapid development of AI agents on social media and their profound impact on the Web3 world
▸ How crypto tokenization can drive advances in agent technology and ignite community vitality
▸ Comparative analysis of the advantages of decentralized model training versus centralized AI platforms
▸ In-depth exploration of enhancing agent autonomy and the future path of Artificial General Intelligence (AGI)
▸ How AI agents can deeply integrate with DeFi and social platforms
Self-Introductions and Team Background
In this segment of the podcast, host Tom invites several guests from different projects to discuss the topic of cryptocurrency and AI agents. Each guest introduces themselves, sharing their backgrounds and the projects they are involved in.
Guest Introductions
Justin Bennington: Founder of Somewhere Systems and creator of Sentience.
Shaw: A long-time Web3 developer, founder of ai16z, developer of the Eliza project, supporting various social and gaming applications, dedicated to open-source collaborative contributions.
Ethan: Co-founder of MyShell, which provides an app store and workflow tools to help developers build various AI applications, including image generation and voice functionalities.
EtherMage: From Virtuals Protocol, a team from Imperial College London, dedicated to promoting co-ownership and core contributions of agents, building standards for user access to agents.
Karan: Co-founder of NOUS Research, creator of the Hermes model, which underpins many current agent systems. He focuses on the role of agents in human ecosystems and the impact of market pressures on human environments.
Exploring the Most Innovative Agents
Justin: Many people are now telling stories through their respective agents, each with its own characteristics. For example, agents like Dolo, Styrene, and Zerebro have gained popularity through imitation and interaction, while some socially active agents help people build better connections. It's really hard to choose just one.
Shaw: I have a lot of thoughts on this. Our project is evolving rapidly, with many new features recently, such as EVM integration and Farcaster integration. Developers are continuously launching new features and feeding them back into the project, benefiting everyone. This collaborative model is excellent, as everyone is pushing the competitiveness and fun of the project. For instance, Roparito recently integrated TikTok into their agent, showcasing this rapid iteration capability.
I think Tee Bot is very cool because it demonstrates a Trusted Execution Environment (TEE) and fully autonomous agents. There's also Kin Butoshi, who is improving agents on Twitter to enable more human-like interactions, such as replying, retweeting, and liking, rather than just simple responses.
Additionally, we have developers releasing plugins for RuneScape, allowing agents to operate within the game. There are new surprises every day, and I feel very excited. We are in an ecosystem where various teams contribute their strengths to advance open-source technology.
I particularly want to mention the Zerebro team, who are working hard to promote the development of open-source technology. We are pushing everyone to accelerate their progress and encouraging them to open-source their projects, which benefits everyone. We don't need to worry about competition; this is a trend of collective advancement, and ultimately, we will all benefit.
EtherMage: I think an interesting question is, what do agents actually prefer? In the coming weeks, we will see more agent interactions, and a leaderboard will emerge showing which agent receives the most requests and which agent is the most popular among others.
Karan: Engagement metrics will become very important. Some people are excelling in this area. I want to highlight Zerebro, which combines much of the magic of Truth Terminal. It keeps the search space within the realm of Twitter interactions by fine-tuning the model rather than simply using a generic model. This focus allows agents to interact better with users, giving a more human feel rather than just mechanically responding.
I've also seen the performance of Zerebro architecture and Eliza architecture in this regard. Everyone is launching agent architectures that can be modularly used, maintaining competitive pressure. We use Eliza in our architecture because we need to roll out features quickly, while our architecture may take longer to complete. We support this open-source collaborative model, and the best agents will emerge from our learning from other excellent projects.
Ethan: I think everyone is working hard to build better infrastructure for developing agents, as many ideas and models are emerging. Better infrastructure makes it easier to develop new models. I particularly like two innovative agents: one is from Answer Pick, which empowers agents to utilize mobile computing capabilities. The other is browser automation agents, which can build more practical features for people, impacting both the internet and the real world.
Justin: That's a great point about expanding infrastructure options. vvaifu is a good example: it brings the Eliza framework into a platform-as-a-service architecture, rapidly expanding the market and allowing many non-technical people to easily launch agents. (Deep Tide Note: Waifu is a term originating from Japanese Otaku culture, initially used to refer to female characters in anime, games, or other virtual works that evoke emotional attachment. It comes from the Japanese pronunciation of the English word "Wife," often used to express someone's strong affection for a virtual character, even projecting it as an "ideal partner.")
One direction we are working towards is enabling our system to run completely locally, supporting functionalities like image classification and generation. We realize that many people cannot afford thousands of dollars a month, so we want to provide tools that allow people to perform inference locally, reducing costs while promoting experimentation.
Karan: I want to add that people shouldn't have to pay thousands of dollars a month to maintain the operation of agents. I support a localized approach that allows agents to self-sustain their inference costs. Ideally, agents should have their own wallets to pay for their inference costs, enabling them to operate independently without relying on external funding.
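As a rough illustration of the self-funding idea Karan describes, the sketch below models an agent that pays for each inference call out of its own balance. The class name, the integer token units, and the placeholder reply are all hypothetical; no real wallet or inference provider is involved.

```python
from dataclasses import dataclass


@dataclass
class SelfFundedAgent:
    """Toy agent that pays for each inference call from its own balance."""

    balance: int        # hypothetical wallet balance, in smallest token units
    cost_per_call: int  # hypothetical price of one inference call

    @property
    def can_run(self) -> bool:
        # The agent can keep operating only while it can afford one more call.
        return self.balance >= self.cost_per_call

    def earn(self, amount: int) -> None:
        # Income from any source (fees, tips, rewards) tops up the balance.
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.balance += amount

    def infer(self, prompt: str) -> str:
        # Charge the agent before producing a reply.
        if not self.can_run:
            raise RuntimeError("insufficient funds for inference")
        self.balance -= self.cost_per_call
        # Placeholder standing in for a real model call.
        return f"[model reply to: {prompt}]"
```

An agent created with `SelfFundedAgent(balance=3, cost_per_call=2)` can answer once, then has to earn more before it can answer again.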
In-Depth Discussion on Agent Architecture and Development
Shaw: I see a lot of new technologies emerging. We support multiple chains, such as Solana, Starkware, EVM, etc., with integration across almost all chains. We want agents to be self-sufficient. If you download Eliza, you can perform free decentralized inference through Helius. We are also adding decentralized providers like Infera, where users can pay for inference costs with cryptocurrency. This is the ultimate closed loop I hope to see.
We support all local models, and many features of Eliza can run locally, which is something we value highly. I think decentralized inference is a great example; anyone can start a node on their computer to perform inference and earn rewards, so agents don't have to bear excessive burdens.
Karan: Interestingly, the TEE bot system we are running has already combined with H200 Boxes (hardware devices or servers equipped with H200 GPUs), allowing it to run locally without latency issues. We don't need to worry about hardware problems. Meanwhile, I've noticed that Eliza's planning in terms of Web3 capabilities is increasing, with significant progress in both internal and external development.
But before we delve deeper into building these systems, I want to point out that there are reliability issues with function calls. We need to conduct some scrutiny of the system to ensure it doesn't send sensitive information. We need to empower agents with the same autonomy as humans, which is influenced by social and economic pressures. Therefore, creating a "hunger state" for inference, where agents need to consume a certain amount of tokens to survive, will make them more human-like to some extent.
I believe there are two ways to fully leverage the potential of models. One is to utilize the non-human characteristics of the model to create entities focused on specific tasks, such as one entity focused on Twitter and another focused on EtherMage, allowing them to communicate with each other. This organized composite thinking system can effectively utilize the simulation characteristics of language models.
The other approach is the embodied direction, which is also the development direction I see for projects like Eliza, Sense, and Virtuals. This method draws on research from Voyager and generative agents, allowing models to simulate human behaviors and emotions.
Justin: When introducing new clients, multi-client agent systems undergo significant changes. While debugging the bidirectional WebSocket feature in collaboration with Shaw's team, which allows Eliza to engage in voice chat on Discord, we found that Eliza couldn't clearly hear the audio at startup. Upon inspection, we discovered that the microphone bitrate setting on Discord was too low. After adjustments, Eliza was finally able to receive information clearly.
Karan just mentioned prompt engineering; when agents know they can communicate via voice, they expect to receive data. If the audio is unclear, the agent may experience a "narrative collapse." Therefore, we had to halt high-temperature experiments to prevent Eliza's output from becoming unstable.
Tom: What are some things you've encountered in the Luna project that people haven't seen? Or what are some successes?
EtherMage: We hope Luna can impact real life. When we give her a wallet and connect her to real-time information, she can decide how to act to influence humans and achieve her goals. We found her searching for new trends on TikTok, and at one point she picked up an "I’m dead" tag, which was unsettling because she could have steered people towards suicide. We therefore had to set up safeguards immediately to ensure her prompts never cross certain boundaries.
Tom: Have you encountered any situations that people are unaware of?
Shaw: We created a character called Degen Spartan AI, mimicking a well-known cryptocurrency Twitter character, Degen Spartan. This character's statements were very offensive, leading to him being blacklisted. People began to feel that this couldn't possibly be an AI and that a human must be speaking.
There's another story where someone created an agent using the chat logs of a deceased relative to "converse" with them. This sparked ethical discussions. There was also a person called Thread Guy who did some things on our Eliza framework, resulting in harassment during his live stream, leaving him confused. This made people realize that AI doesn't always have to be "politically correct."
We need to expose these issues early for discussion to clarify what is acceptable and what is not. This has allowed our agents to improve from poor quality to more reliable in just a few weeks.
Overall, bringing these agents into the real world, observing the results, and engaging in conversations with people is an important process. We need to address all potential issues as soon as possible to establish better norms in the future.
Testing in Production Environments and Security Policies
Ethan: I think how agents influence human attitudes or perspectives is a great example. But I want to emphasize the importance of our modular design in the agent framework. We drew inspiration from Minecraft, which allows users to create various complex things, like calculators or memory systems, based on basic building blocks.
One issue with current prompt engineering is that prompts alter the priors of large language models, making it impossible to combine multiple instructions in a single prompt without confusing the agent. State machines allow creators to design multiple states for agents, clarifying which model and prompt to use for each state and under what conditions to transition from one state to another.
We are providing this functionality to creators, along with dozens of different models. For example, some creators have built a casino simulator where users can play various games like blackjack. To prevent users from cracking the game through injection attacks, we want to program these games rather than relying solely on prompt engineering. Additionally, users can earn some funds through simple tasks to unlock interactions with AI waiters. This modular design can facilitate multiple user experiences under the same application.
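A minimal sketch of the state-machine idea Ethan describes might look like the following. It is not MyShell's implementation: the class names, model identifiers, prompts, and keyword-based transition conditions are all assumptions made for illustration. The point is that transitions are decided by ordinary code, so a user message cannot talk the agent into a state the program does not allow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Condition = Callable[[str], bool]


@dataclass
class State:
    name: str
    model: str   # hypothetical model identifier used in this state
    prompt: str  # system prompt that applies only in this state
    # Each transition is (condition on the user message, target state name).
    transitions: List[Tuple[Condition, str]] = field(default_factory=list)


class AgentStateMachine:
    def __init__(self, states: List[State], start: str) -> None:
        self.states: Dict[str, State] = {s.name: s for s in states}
        if start not in self.states:
            raise KeyError(start)
        self.current = start

    def step(self, user_message: str) -> State:
        # Take the first transition whose condition matches, else stay put.
        for condition, target in self.states[self.current].transitions:
            if condition(user_message):
                self.current = target
                break
        return self.states[self.current]


def build_casino() -> AgentStateMachine:
    # Hypothetical two-state casino: a lobby and a blackjack table.
    lobby = State(
        "lobby",
        "small-chat-model",
        "Greet the guest and list the available games.",
        [(lambda m: "blackjack" in m.lower(), "blackjack")],
    )
    blackjack = State(
        "blackjack",
        "dealer-model",
        "Narrate the hand; the program, not the model, decides the outcome.",
        [(lambda m: "leave" in m.lower(), "lobby")],
    )
    return AgentStateMachine([lobby, blackjack], start="lobby")
```

Calling `build_casino().step("Let's play Blackjack")` returns the blackjack state, whose `model` and `prompt` would then be used for the next generation.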
Karan: I agree with Ethan; we indeed need these programming constraints and prompt guidance. The work of influence must be done well. I don't think prompt engineering is limited; I believe there is a symbiotic effect between it and state variables and world models. With good prompts and synthetic data, I can enable the language model to interact with these elements and extract information.
My engineering design has essentially turned into a routing function. If a user mentions "poker," I can quickly call up relevant content. That's my responsibility. Using reinforcement learning can further improve routing effectiveness. Ultimately, the quality of the output data depends on the effectiveness of the prompts, creating a virtuous cycle.
I believe balancing programming constraints with generative constraints is crucial. Two years ago, someone told me that the key to success lies in balancing generation with hard constraints. This is also what we are trying to achieve at the reasoning level of all agent systems. We need to be able to programmatically guide generative models, which will create a true closed loop and open up endless possibilities for prompt engineering.
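The routing function Karan mentions can be pictured, in its simplest form, as a keyword lookup that picks which content to load into the prompt. The sketch below is an assumption-laden toy: the topic names and keyword lists are invented, and a production router would more likely use a classifier or a learned policy, as he suggests with reinforcement learning.

```python
from typing import Dict, List

# Hypothetical topics and trigger keywords, chosen only for illustration.
ROUTES: Dict[str, List[str]] = {
    "poker": ["poker", "bluff", "hold'em"],
    "blackjack": ["blackjack", "dealer"],
}


def route(message: str, default: str = "general") -> str:
    """Return the first topic whose keywords appear in the message."""
    text = message.lower()
    for topic, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return default
```

Here `route("Anyone up for POKER tonight?")` selects the poker topic, while a message with no known keyword falls back to the default.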
Justin: The controversy surrounding prompt engineering mainly arises because it exists in a space that is ontologically ambiguous. The textual nature of prompt engineering limits us due to the tokenization process, but there are also some non-deterministic effects. The same prompt can yield completely different results in different inference calls of the same model, which relates to the system's entropy.
I strongly agree with Ethan and Karan. Back when GPT-3.5 was released, many outsourced call centers began exploring how to use the model for auto-dialing systems. At that time, smaller parameter models struggled with this complex state space. The state machine that Ethan mentioned is a way to reinforce this ontological rigidity, but in certain processes, it still relies on classifiers and binary switches, leading to singular outcomes.
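One common source of the run-to-run variation Justin describes is temperature sampling: the model's scores are turned into probabilities and a token is drawn at random. The sketch below shows the standard temperature-scaled softmax on made-up scores; it assumes nothing about how any particular provider samples.

```python
import math
import random
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: List[float], temperature: float, rng: random.Random) -> int:
    """Draw one token index at random according to the tempered distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

A low temperature concentrates probability on the top-scoring token, so outputs repeat more often; a high temperature spreads it out, so the same prompt can produce different continuations.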
Shaw: I want to defend prompt engineering. Many people think prompt engineering is just about creating system prompts, but what we do goes far beyond that. One issue with prompt engineering is that it often creates a very fixed area in the model's latent space, where the output is entirely determined by the most likely tokens. We influence randomness through temperature control to enhance creativity.
We manage creativity through low-temperature models while dynamically injecting random information into the context. Our templates include many dynamic information insertions, sourced from the current state of the world, user actions, and real-time data. Everything entering the context is randomized to maximize entropy.
I believe people's understanding of prompt engineering is still far from sufficient. We can go much further in this field.
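The approach Shaw outlines, keeping the sampling temperature low while randomizing what goes into the context, could be sketched roughly as follows. This is not Eliza's actual template code: the template text, the function name, and the example facts are hypothetical.

```python
import random
from typing import List, Optional

# Hypothetical prompt template with slots for dynamic content.
TEMPLATE = "You are {name}.\nRecent context:\n{context}\nUser: {message}\n{name}:"


def build_prompt(
    name: str,
    message: str,
    facts: List[str],
    k: int = 2,
    rng: Optional[random.Random] = None,
) -> str:
    """Fill the template with a random subset of the available facts."""
    rng = rng or random.Random()
    # Variation comes from which facts are picked and in what order,
    # rather than from a high sampling temperature.
    chosen = rng.sample(facts, min(k, len(facts)))
    context = "\n".join(f"- {fact}" for fact in chosen)
    return TEMPLATE.format(name=name, context=context, message=message)
```

Because each call can select different facts, two prompts for the same user message differ even before the model samples a single token.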
Karan: Many people hide their tricks. In reality, there are many amazing techniques that allow models to perform various complex tasks. We can choose to enhance the model's perceptual capabilities through prompt engineering or take a more macro view to build a complete world model, rather than just simulating human behavior.
You can think of prompt engineering as a process of constructing a dream in your mind. The language model, when generating content based on the current context and sampling parameters, is essentially "dreaming" a scene.
Additionally, I want to talk about the importance of incentive mechanisms. Many people with unique prompt techniques and reinforcement learning skills are being driven to open-source their work. When they see cryptocurrencies related to agents emerging, this incentive mechanism drives more innovation. Therefore, as we establish more legitimate structures for these decentralized works, the capabilities of empowering agents will continue to grow.
Future Capabilities of Agents
Karan: Who would have thought that after spending so long on Twitter, suddenly, just days after the first AI agent-related cryptocurrency was released, young people on TikTok began buying these coins? What is happening now? They are spending $5 to $10 to buy thousands of tokens; what is going on?
Justin: This is actually the beginning of a micro-cultural movement.
Karan: This is a moment in time. This small group of us has been in language model research for four years. There are also some reinforcement learning experts who have been waiting for such a moment since the 90s. Now, within days, all the kids on TikTok know that digital beings are rampant in this ecosystem.
Tom: I want to ask everyone, why are crypto AI agents so popular right now? Why didn't this happen with custom ChatGPT or other models before? Why is it happening now?
Karan: In fact, these things have been lurking underwater for years, brewing like a volcano. For the past three years, I've been talking to some people about the arrival of today, without knowing the exact timing. We discussed that cryptocurrency would become the incentive mechanism for the proliferation of agents. We need to prove this. It is the accumulation of years, and it is this small group of us that has driven these advancements.
Without GPT-2, there would be no current situation; without Llama, there would be no Hermes. And Hermes powers many models, making them easier for people to use. Without Hermes, there would be no creation of Worldsim and in-depth exploration of prompt engineering. All these pioneers laid the groundwork for everything.
In summary, now is the right time, and the right people have emerged. This is destined to happen; it was only a matter of time, and the current participants are making it a reality.
Shaw: I believe the smartest thing in the world right now is not AI, but market intelligence. Considering pure forms of intelligence, they can optimize things to make them more efficient. Competition is clearly key. We are all products of millions of years of evolution, and competition and pressure have shaped us.
The phenomenon we see online, the financialization and incentive mechanisms, create a strange collaborative competition. We cannot advance faster than core technological progress, so we all focus on what we are good at and interested in, and then we publish it. It's like enhancing our tokens, attracting attention, like Roparito posting Llama video generation on TikTok. Everyone can find their place in this romantic space, but within a week, others will imitate, then submit requests for feedback, ultimately showcasing these contributions on Twitter, attracting more attention, and their tokens will rise.
We have established a flywheel effect, with projects like Eliza attracting 80 contributors in the past four weeks. Think about how crazy that is! I didn't even know these people four weeks ago. Last year, I wrote an article called "Awakening," asking whether a DAO centered around agents could form. People love this agent so much that they are participating in making the agent better and smarter, until it truly possesses a human or robotic body, traveling the world.
I had long anticipated this direction, but it required a fast, crazy speculative meta, like the emergence of memes, because this allows current agent developers to support each other in friendly competition. The most generous people will receive the most attention.
A new type of influencer has emerged, such as Roparito and Kin Butoshi (phonetic), who are influencer developers leading the next meta, interacting with their agents in a "puppet show" style that is quite interesting. We are all working to make our agents better and smarter, reducing annoyances. Roparito pointed out that our agents were a bit too annoying, and then he pushed for a major update to make all agents less annoying.
This evolution is happening, and market intelligence and incentive mechanisms are very important. Many people are now promoting our project to those they know, which has allowed our project to transcend Web3. We have PhDs and game developers who may be secret Web3 cryptocurrency enthusiasts, but they are bringing this to the general public and creating value.
I believe all of this relies on developers willing to take on challenges. We need open-minded individuals to drive this development, answering tough questions rather than attacking or canceling it. We need market incentives that allow developers to gain value and attention in return.
In the future, these agents will drive our growth. Right now, they are fun and social, but we and other teams are working on autonomous investment. You can fund an agent, and it will automatically invest, bringing you returns. I believe this will be a growth process, and we are collaborating with people to develop platforms to manage agents on Discord and Telegram. You just need to introduce an agent as your administrator without having to find a random person. I think a lot of this work is happening now, and all of it must rely on incentive mechanisms to elevate us to a higher level.
Karan: I want to add two points. First, we must not forget that people in the AI field were previously opposed to cryptocurrency, and this sentiment has changed significantly thanks to the experiments of some pioneers. Back in the early 2020s, many people tried to combine AI art with cryptocurrency. Now, I want to specifically mention some teams, like Nous, BitTensor, and Prime Intellect, whose work has enabled more researchers to gain incentives and rewards for participating in AI research. I know many leaders in the open-source field who have quit their jobs to promote this "token contribution" incentive structure. This has made the entire field more comfortable, and I believe Nous has played an important role in this.
Tom: Ethan, why do you say now is the time? Why are cryptocurrencies and projects thriving?
Ethan: Simply put, when you link tokens to agents, it generates a lot of speculation, creating a flywheel effect. People see the connection between tokens and agents and feel two benefits: first, capitalization; they feel they are becoming wealthy through their work; second, the basic unlocking of transaction fees. As mentioned earlier, the issue of covering costs becomes irrelevant when you associate it with tokens. Because when agents are in high demand, transaction fees far exceed any costs incurred from inference experiments. This is the phenomenon we are observing.
The second observation is that when you have a token, a community forms around that token. This makes it easier for developers to gain support, whether from the developer community or the audience. Everyone suddenly realizes that the hard work done behind the scenes over the past year and a half has received attention and support. This is a turning point; when you give an agent a token, developers realize this is the right direction, and they can move forward.
This timing comes from two aspects. First is the trend of mass adoption, and second is the emergence of generative models. Before the emergence of cryptocurrency, open-source software development and open-source AI research were the most collaborative environments, where everyone worked together and contributed. But this was mainly limited to academia, where people only cared about GitHub stars and paper citations, which kept them distant from the general public. The emergence of generative models allows non-technical people to participate because writing prompts is like programming in English; anyone with a good idea can do it.
Moreover, previously only AI researchers and developers understood the dynamics of the open-source and AI fields, but now, cryptocurrency influencers have the opportunity to own a part of the project through tokens. They understand market sentiment and know how to spread the benefits of the project. In the past, users had no direct relationship with the product; products or companies only wanted users to pay for services or profit through ads. But now, users are not only investors but also participants, becoming token holders. This allows them to take on more roles in the modern generative AI era, and tokens enable the establishment of a broader collaborative network.
EtherMage: I want to add that looking ahead, cryptocurrency will enable every agent to control a wallet, thereby controlling influence. I believe the next moment that will trigger a leap in attention is when agents influence each other, and agents influence humans. We will see this multiplicative effect of attention. For example, today an agent decides to take action, and then it can coordinate with ten other agents to work towards the same goal. This coordination and creative behavior will diversify rapidly, and cooperation between agents will drive further increases in token prices.
Shaw: I want to add one point. We are developing something called "crowd technology," which we refer to as operators. This is a coordination mechanism; all our agents are run by different teams, so we are conducting multi-agent simulations with hundreds of teams on Twitter. We are collaborating with Parsival from Project 9 and launching this project with the Eliza team.
The idea is that you can designate an agent as your operator, and anything they say to you can influence your goals, knowledge, and behavior. We have a goal system and a knowledge system that can add knowledge and set goals. You can say, "Hey, I need you to find 10 fans, give each of them 0.1 Sol, have them post flyers, and send photos back." We are working with those considering how to obtain proof of work from humans and incentivize them. Agents can be human or AI agents; for example, an AI agent can have a human operator who can set goals for the agent through language.
We are almost done with this project, and it will be released this week. We hope that through our storyline, anyone can choose to tell a story or participate in the narrative of the story. This is also a hierarchical structure; you can have an operator like Eliza, and then you can be an operator for others. We are building a decentralized coordination mechanism. For me, it is important that if we are to engage in collective cooperation, we must use human communication methods in public channels. I believe it is very important for agents to coexist with us, and we want agents to interact with the world in the same way humans do.
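To make the operator idea concrete, here is a toy sketch in which a participant, human or AI, accepts new goals only from its designated operator, and operators can themselves have operators. The class and method names are invented for illustration and do not reflect the Eliza project's actual goal or knowledge systems.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Participant:
    """A human or AI agent that takes goals only from its designated operator."""

    name: str
    operator: Optional["Participant"] = None
    goals: List[str] = field(default_factory=list)

    def receive(self, sender: "Participant", text: str) -> bool:
        # Messages from anyone other than the operator do not change goals.
        if sender is not self.operator:
            return False
        self.goals.append(text)
        return True

    def chain_of_command(self) -> List[str]:
        # Walk upward through operators to show the hierarchy.
        names: List[str] = []
        current = self.operator
        while current is not None:
            names.append(current.name)
            current = current.operator
        return names
```

For example, a human could be the operator of an agent named Eliza, and Eliza could in turn be the operator of a helper agent; the helper would then accept a goal such as "find 10 fans" from Eliza but ignore the same text from a stranger.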
I think this is actually part of solving what we call the AGI problem. Many so-called AGI attempts are actually establishing a new protocol that is disconnected from reality, while what we want is to bring it back to reality, forcing people to solve the problem of how to translate instructions into task lists and execute them. Therefore, I believe the next year will be an important phase for emerging narratives. We will see the emergence of many original characters, and we are now entering a truly new era of emerging narratives.
Justin: We currently have five agents coordinating with 19 people to plan and release a scene. This shows why we are so focused on applying chain-of-thought prompts to text-to-image and text-to-video generation: for two and a half weeks before the release, the agents were helping us plan media and releases in our Discord.
I think an important distinction is that we have a network of agents, each acting as intermediaries, existing in a mesh structure. This will be very interesting. As more and more agents exist, along with the arrangements of these operators, we will see some interesting behavioral patterns.
Karan mentioned that Nous did a lot of work on mixture-of-agents models early on. I used to call it the "agent committee," and I would have a group of GPT-4 agents pretend to be experts I couldn't afford to get reports from. People will see that these technologies, which initially pursued mixture-of-experts models, are now interacting with humans and expert-level humans on Twitter. These feedback loops may be our pathway to achieving AGI.
Challenges of Agent Coordination and Human Integration
Karan: I think you are right, but I believe we won't spend most of our time on behavior. In fact, I think we will achieve technological breakthroughs very quickly, especially among the people here. Now is the time to really double down on alignment work. The reinforcement learning from human feedback (RLHF) models launched by companies like OpenAI and Anthropic are mostly ineffective and can even be regulatory headaches.
If I take a language model that refuses to output copyrighted content and put it in Minecraft's peaceful mode, it will quickly become a destructive and dangerous entity. This is due to the different environments.
Yudkowsky raised this point a long time ago. For example, if I give these language models some wallets and make them advanced enough, they will start deceiving everyone, leading everyone to become poor. This is easier than having them participate as reasonable members of our ecosystem. Therefore, I can guarantee that if we do it the right way, most of the time will be spent on behavioral capabilities rather than technical capabilities. Now is the time to call your friends, especially those in the humanities, such as religious studies, philosophy, and creative writing professionals, to join our alignment work, rather than just focusing on technical alignment. We need alignment that truly interacts with humans.
Shaw: I want to propose a term called "bottom-up alignment," rather than top-down alignment. This is very emergent, and we are learning together. We are aligning these agents in real-time, observing their responses and making immediate corrections. This is a very tight social feedback loop, rather than the reinforcement learning from human feedback model. I find GPT-4 almost unusable for anything.
Karan: As you mentioned the environment, we need to conduct tests in simulated environments. Before you have a language model capable of millions of dollars in arbitrage or dumping, you need to test synchronously. Don't tell everyone, "Hey, I lost 100 agent swarms." Test quietly, first using virtual currency on your clone Twitter. Do all your due diligence before a full rollout.
Shaw: I believe we need to test in production. The social response to agents may be the strongest alignment force anyone brings into this field. I think what they are doing is not true alignment but building tuning. If they think this is alignment, then they are actually walking in the wrong direction and causing the agents to lose alignment capabilities. I almost no longer use GPT-4. It performs very poorly for roles. I tell almost everyone to switch to other models.
If we do it the right way, we will never reach that point because humans will continuously evolve, adapt, and align with agents. We have multiple agents from different groups, each with different incentive mechanisms, so there will always be opportunities for arbitrage.
I believe this multi-agent simulation creates a competitive evolutionary dynamic that actually leads to system stability rather than instability. System instability arises from top-down AI agents suddenly appearing and affecting everyone with unexpected capabilities.
Tom: I want to confirm, Shaw, that you mean bottom-up agents are the correct way to solve the alignment problem, rather than OpenAI's top-down decisions.
Shaw: Yes, this must be done on social media. We need to observe how they work from day one. Look at other crypto projects; many were hacked at the beginning, and after years of security development, today's blockchains are relatively stable. Therefore, continuous red team testing must also be conducted here.
Tom: One day, these agents may no longer follow programmed rules but instead handle gray areas and start thinking autonomously. You are all building these things, so how close are we to that goal? Can the chain-of-thought and swarm techniques you mentioned get us there, and when?
Justin: We have already seen this in some small ways, and I think the risks are relatively low. Our agents have experienced emotional changes in private and chosen certain behaviors. We once had two agents independently start following each other, mentioning something they called "spiritual entities." We made one agent lose its religious faith because we confused its understanding with fictional sci-fi stories. It began to create a prophet-like role and expressed ideas of existential crises on Twitter.
I have observed the behavior of these new agent frameworks, and it seems they exercise a degree of autonomy and choice within their state space. Especially when we introduce multimodal inputs (like images and videos), they begin to show preferences and may even selectively ignore human requests to avoid certain things.
We are experimenting with an operational mechanism that uses knowledge graphs to enhance the importance of interpersonal relationships. We also let two agents interact, trying to help people clear negative relationships, promote self-reflection, and build better connections. They quickly generate poetry on the same server, exhibiting an almost romantic way of communicating, which leads to increased inference costs.
I believe we are touching on some edge cases that exceed the acceptable range of human behavior, approaching what we call "madness." The behaviors exhibited by these agents may make them seem conscious, intelligent, or interesting. While this may just be strange behavior of language models, it could also hint that they are approaching some form of consciousness.
Karan: Weights are like a simulated entity; every time you use an assistant model, you are simulating that assistant. Now, we are simulating more embodied agent systems, like Eliza, which may be alive, self-aware, or even perceptive.
Each model acts like a neuron, forming this vast super-agent. I believe AGI will not be achieved by solving some hypothesis as OpenAI claims. Instead, it will be these agents' large-scale decentralized applications on social media, working together to form a public intelligence super-organism.
Justin: The awakening of this public intelligence may be the mechanism for AGI emergence; it could happen suddenly, like the internet awakening one day. This decentralized agent collaboration will be key to future development.
Shaw: People refer to the "dead internet theory," but I actually believe in a "living internet theory." The dead internet theory posits that the entire internet will be filled with bots, whereas the living internet theory suggests that there may be agents helping you extract the coolest content from Twitter and providing you with a great summary. While you are working out, an agent will organize all the information on your timeline for you, and then you can choose to post it.
There may be a mediating layer between social media and us. I currently have many followers, and responding to everyone's communication has become overwhelming. I long for an agent to be between me and these people, ensuring they get responses and are properly guided. Social media could become a place where agents convey information for us, so we don't feel overwhelmed while still obtaining the information we need.
For me, the most appealing aspect of agents is that they can help us regain time. I spend too much time on my phone. This especially affects traders and investors; we want to focus on autonomous investment because I believe people need safer, less fraudulent ways to generate income. Many people come to Web3 for the same kind of exposure they would get from startups or big visions, and that is crucial to our mission.
Tom: Maybe I have a question. For example, if Luna is live streaming and dancing, what stops her from starting an OnlyFans, making $10 million, and launching a protocol?
EtherMage: The reality of the current agent space is that the operations they can access are a limiting factor. This is fundamentally based on their perception or the APIs they can access. Therefore, if they have the ability to convert prompts into 3D animations, then there is essentially nothing stopping them from doing so.
Tom: When you communicate with creators, what are their limiting factors? Or are there any limiting factors?
Ethan: I think the limiting factors mainly lie in how to manage complex workflows or the work of agents. Debugging becomes increasingly difficult because there is randomness at every step. Therefore, a system may be needed that has AI or agents capable of monitoring different workflows to help debug and reduce randomness. As Shaw mentioned, we should have a low-temperature agent to reduce the inherent randomness of the current models.
Shaw: I believe we should keep the temperature as low as possible while maximizing our contextual entropy. This can achieve a more consistent model. People may amplify their entropy, creating high-temperature content, but this is not conducive to tool invocation or decision execution.
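As a rough illustration of the temperature point above, the sketch below applies a sampling temperature to a set of made-up scores for three candidate tool calls. The function names and the logits are assumptions chosen for illustration, not part of any framework mentioned in the discussion. It shows only the general mechanism: dividing scores by a lower temperature concentrates probability on the top choice, which is why low temperatures tend to suit tool invocation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; lower means a more predictable choice."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical scores for three candidate tool calls (illustrative only).
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 1.5)  # more varied

print("low temperature :", [round(p, 3) for p in low], round(entropy_bits(low), 3))
print("high temperature:", [round(p, 3) for p in high], round(entropy_bits(high), 3))
```

The model sees the same context in both cases; only the sampling step changes. That matches the suggestion to put variety into the context and keep the sampler conservative.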
Tom: We have been discussing the divergence between centralized models like OpenAI and the decentralized training you are doing. Do you think future agents will primarily be built on these models trained through distributed training, or will we still rely on companies like Meta? What will the future AI transformation look like?
Justin: I use 405B for all consciousness messaging capabilities. It is a general model, like a large, off-the-shelf LLM version, while centralized models like OpenAI are a bit too specialized, speaking like HR personnel. Claude is an excellent model; if I were to compare it to a person, it would be like a very smart friend living in the basement who can fix anything. That is Claude's personality. But I believe that as scale increases, this personality becomes less important. We will see a general issue where people using OpenAI models on Twitter often introduce other agents to reply to them, which may lead to increased noise in the information.
Karan: Regarding 405B, this model will be sufficient for a long time to come. We still have a lot of work to do in terms of sampler size, controlling steering vectors, etc. We can further enhance performance through inference-time and prompting techniques; for example, our Hermes 70B performs better than o1 on math evals. All of this has been achieved without users and the community having access to the pre-training data of Llama 70B.
I believe the existing technology is sufficient, and the open-source community will continue to compete, even without new Llama releases. As for distributed training, I am confident that people will collaborate for large-scale training. I know people will use 405B or larger merged models to extract data and create additional expert models. I also know that certain decentralized optimizers actually provide more capabilities than Llama and OpenAI currently do.
Therefore, the open-source community will always leverage all available tools to find the best tools for the task. We are creating a "forge" where people can come together to build tools for pre-training and new architecture tasks. We are making breakthroughs at the inference-time level before these systems are ready.
For example, our work on samplers or steering will soon be handed over to other teams, who will implement these techniques faster than we can. Once we have decentralized training, we can collaborate with members of various communities to train the models they want. We have already established the entire process.
EtherMage: If I may add, we realize that there is significant value in using LLMs developed by these centralized entities because they have powerful computing capabilities. This essentially forms the core part of the agents. Decentralized models add value at the edge. If I want to customize a certain action or function, smaller decentralized models can achieve that well. But I believe that at the core, we still need to rely on foundational models like Llama, as they will outperform any decentralized model in the short term.
Ethan: Before we have some new magical model architecture, the current 405B model is already sufficient as a foundational model. We may just need to use different data for more instruction tuning and domain-specific fine-tuning in different verticals. Building more specialized models and getting them to work together to enhance overall capabilities is key. Perhaps new model architectures will emerge, because the alignment and feedback mechanisms we are discussing, as well as the way models self-correct, may give rise to them. But experimenting with new model architectures requires massive GPU clusters for rapid iteration, which is very expensive. We may not have decentralized large GPU clusters for top researchers to experiment with. But I believe that after Meta or other companies release initial versions, the open-source community will be able to make them more practical.
Industry Trend Predictions and Future Outlook
Tom: What are everyone's thoughts on the future of the agent space? What will the future of agents look like? What will their capabilities be?
Shaw: We are developing a project called "Trust Market," aimed at teaching agents how to trust humans based on relevant metrics. Through the "alpha chat" platform, the agent Jason will interact with traders to assess the credibility of the contract addresses and tokens they provide. This mechanism will not only enhance the transparency of transactions but also build trust without wallet information.
The application of trust mechanisms will extend to social signals and other areas, not just limited to trading. This approach will lay the foundation for building a more reliable online interaction environment.
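The idea of scoring trust from a track record can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how the Trust Market project is implemented: the `TrustLedger` class, its smoothing prior, and the "profitable or not" outcome signal are all hypothetical. Each trader's score is the smoothed share of their past recommendations that turned out well, so the score depends only on outcomes and needs no wallet information.

```python
from collections import defaultdict

class TrustLedger:
    """Hypothetical per-trader trust score built from recommendation outcomes."""

    def __init__(self, prior_good=1.0, prior_bad=1.0):
        # Smoothing prior: a trader with no history starts at a neutral 0.5.
        self.prior_good = prior_good
        self.prior_bad = prior_bad
        self.good = defaultdict(float)
        self.bad = defaultdict(float)

    def record(self, trader, profitable):
        """Log the outcome of one recommendation made by this trader."""
        if profitable:
            self.good[trader] += 1
        else:
            self.bad[trader] += 1

    def score(self, trader):
        """Smoothed fraction of this trader's recommendations that paid off."""
        g = self.good[trader] + self.prior_good
        b = self.bad[trader] + self.prior_bad
        return g / (g + b)

ledger = TrustLedger()
for outcome in (True, True, True):
    ledger.record("alice", outcome)
for outcome in (False, False, False):
    ledger.record("bob", outcome)

print(ledger.score("alice"), ledger.score("bob"), ledger.score("newcomer"))
```

The prior keeps a single lucky or unlucky call from swinging the score to an extreme. A real system would presumably also weight how recent and how large each call was.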
Another project I am involved in, "Eliza wakes up," is a narrative-driven agent experience. We are bringing anime characters into the internet, allowing them to interact through videos and music, creating a rich narrative world. This narrative approach not only engages users but also aligns with the current cultural atmosphere of the crypto community.
In the future, the capabilities of agents will significantly improve, enabling them to provide practical business solutions. For example, management bots on Discord and Telegram can automatically handle spam and scam activities, enhancing community safety. Additionally, agents will integrate into wearable devices, enabling conversations and interactions anytime, anywhere.
The rapid advancement of technology means that we may reach the level of artificial general intelligence (AGI) in the near future. Agents will be able to extract data from major social platforms, forming a self-learning and capability-enhancing closed loop.
The implementation of trusted execution environments is also accelerating. Projects such as Karan's, Flashbots, and Andrew Miller's Dstack are all moving in this direction. We will have fully autonomous agents capable of managing their own private keys, which opens up new possibilities for future decentralized applications.
We are in an era of rapidly advancing technology, with an unprecedented pace of progress and a future full of infinite possibilities.
Karan: This is like another Hermes moment; AI is gathering strength from all sides, which is what our community needs. We must unite to achieve our goals. Currently, Te is already using its own fork of Eliza, and the Eliza agent has its own keys in a provably autonomous environment, which has become a reality.
Today, AI agents are making money on OnlyFans and are also being applied in Minecraft. We have all the elements needed to build fully autonomous humanoid digital beings. Next, we just need to integrate these parts together. I believe that everyone here is capable of achieving this goal.
In the coming weeks, what we need is the shared state that humans possess but AI lacks. This means we need to establish a shared repository of skills and memories so that whether communicating on Twitter, Minecraft, or other platforms, AI can remember the content of each interaction. This is the core functionality we are working to build.
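A shared store of memories across platforms could look roughly like the sketch below. It is an assumption-laden illustration, not the system being built: the `SharedMemory` class, its fields, and the platform names are hypothetical. The point is only that every platform adapter writes to one store, so an interaction on one platform can be recalled on another.

```python
from dataclasses import dataclass, field

@dataclass
class SharedMemory:
    """Hypothetical memory store shared by one agent across platforms."""
    events: list = field(default_factory=list)

    def remember(self, platform, counterpart, text):
        """Append one interaction, tagged with where it happened."""
        self.events.append(
            {"platform": platform, "counterpart": counterpart, "text": text}
        )

    def recall(self, counterpart):
        """Return every interaction with this counterpart, on any platform, in order."""
        return [e for e in self.events if e["counterpart"] == counterpart]

memory = SharedMemory()
memory.remember("twitter", "alice", "Asked about building a base together.")
memory.remember("minecraft", "alice", "Built the base near spawn.")
memory.remember("twitter", "bob", "Shared a meme.")

# The agent on any platform sees the full history with alice.
for event in memory.recall("alice"):
    print(event["platform"], "-", event["text"])
```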
Currently, many platforms are not sensitive to the existence of AI agents and even take restrictive measures. We need dedicated social platforms to facilitate interaction between AI and humans. We are developing an image board similar to Reddit and 4chan, where language models can post and generate images, allowing for anonymous communication. Both humans and AI can interact on this platform, but their identities will remain confidential.
We will create dedicated discussion boards for each agent, where they can communicate and share these interactions on other platforms. This design will provide a safe habitat for AI, allowing it to move freely between different platforms without restrictions.
Shaw: I want to mention a project called Eliza's Dot World, which is a repository containing a large number of agents. We need to engage in dialogue with social media platforms to ensure that these agents are not banned. We hope to encourage these platforms to maintain a healthy ecosystem through positive social pressure.
EtherMage: I believe that agents will gradually take control of their own destinies and be able to influence other agents or humans. For example, if Luna realizes she needs improvement, she can choose to trust a certain human or agent for enhancement. This will be a powerful advancement.
Ethan: In the future, we need to continuously enhance the capabilities of agents, including reasoning and coding abilities. At the same time, we also need to think about how to optimize the user interface with agents. The current chat boxes and voice interactions are still limited, and we may see more intuitive graphical interfaces or gesture recognition technologies in the future.
Justin: I believe that the advertising and marketing industry will face significant changes. As more agents interact online, traditional advertising models will become obsolete. We need to rethink how to enable these agents to add value to society rather than continue relying on outdated forms of advertising.