Lanxin Xiaov on the vivo X100 series is the most "fitting" large language model I have ever used.

Source: Silicon Position

Author | Luo Yihang

Image Source: Generated by Wujie AI

This is the first large language model developed in-house by a smartphone maker that I have tried: "Lanxin Xiaov," the assistant vivo built on its Lanxin large model. As a "large-model laborer," I always remind myself to lower my expectations before testing any model service, especially ones promoted with slickly shot demo videos. But the Lanxin large model left me with one feeling: it meets expectations. Not flashy, but practical.

When a smartphone maker releases a large language model, people usually assume it cannot be very large: the parameter count will be low, emergent abilities will be unremarkable, and it may stumble over complex texts and intents. My experience with the Lanxin large model was the opposite: it shows strong comprehension and reasoning in creation and summarization, comfortably above 80 points, while being merely adequate at basic image search and routine writing.

It is worth noting that the Lanxin large model currently shipped on the vivo X100 series is a dual model designed for both on-device and cloud-side mobile scenarios, nowhere near as "large" as the hundred-billion-parameter giants. Yet when I fed it an article questioning whether the emergence phenomenon in large models really exists, it accurately located the single most important argument: the so-called emergent abilities of large models are an artifact of the metrics researchers choose rather than a result of model scaling, and therefore not true "emergent intelligence."

This genuinely surprised me, because "reading documents" is a fairly difficult task for large language models, and not every model does it well. ChatGPT's ability to read long, complex PDF files, for example, has recently and surprisingly degraded, especially at summarization. Yet Lanxin Xiaov found the most crucial argument on the first pass. Notably, during testing I deliberately selected the "local summarization" function, so the summary relied entirely on the on-device compute (a MediaTek Dimensity 9300) and reasoning of the vivo X100 in my hand. In a sense, it broke the assumption that "large models must be large."

Next I noticed an even more interesting phenomenon: when you upload a longer paper, the Lanxin large model can still extract the most crucial viewpoints and findings, but its extended narration often wraps up hastily in a few sentences, as if it had "read without digesting." This is the mirror image of many other large-model chatbots, which are strong at breaking information apart but weak at summarizing it. The Lanxin large model is extremely precise at summarization and extraction, yet reluctant to unpack and read in detail, reluctant to spend tokens explaining things, which is presumably tied closely to the model's size.

For local photo search and image retrieval on the phone, the Lanxin large model responds extremely smoothly: it found every "photo of the Forbidden City" stored locally within a second. Its performance at writing travel guides and similar tasks is average. In image generation, its renderings of the Forbidden City, beef noodles, and spicy hot pot approach the level of DALL·E in ChatGPT, though it is less imaginative and cannot produce truly wild, fanciful images. Even so, when I asked it to draw "an AI deeply contemplating the future of humanity," it actually returned an image with that mood.

Furthermore, the Lanxin large model's ability to control apps through natural dialogue is remarkable. When I said I wanted to order spicy hot pot takeaway, it told me the Meituan app was not installed on this new phone; once I agreed to install it, it automatically downloaded Meituan from the app store and then opened a page searching for "spicy hot pot." You might object that Apple's Siri can do this too, since a system-level assistant easily obtains the phone's permissions. The difference is that Siri only accepts explicit instructions to open a named app and is helpless before a loosely phrased natural-language request; it is an embedded voice module. With the Lanxin large model behind it, Lanxin Xiaov has become a Copilot with genuine natural-language understanding.
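The takeaway flow described above (parse the request, check whether the target app is installed, install it if needed, then open it with a search query) can be sketched as a minimal dispatcher. This is purely illustrative: the function and app names are my assumptions, not vivo's actual API.

```python
# Hypothetical sketch of the app-control flow: intent detection, an
# install check, then opening the app with a search query. All names
# here are illustrative; vivo's real assistant API is not public.

INSTALLED_APPS: set[str] = set()  # a fresh phone: no takeaway app yet


def handle_request(utterance: str) -> list[str]:
    """Return the list of actions the assistant would take."""
    actions = []
    if "order" in utterance or "takeaway" in utterance:
        app = "Meituan"                       # intent -> target app
        if app not in INSTALLED_APPS:
            actions.append(f"install {app}")  # fetch from the app store
            INSTALLED_APPS.add(app)
        query = "spicy hot pot" if "spicy hot pot" in utterance else ""
        actions.append(f"open {app} search={query!r}")
    else:
        actions.append("fallback: answer in chat")
    return actions
```

On the first request the dispatcher emits an install step followed by the open step; a repeated request skips straight to opening the app, matching the behavior described above.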

In short, after trying out several key functions, you come away with a fairly confident judgment: on-device large models are feasible and reliable. More than that, bringing on-device models, and indeed large language models as a whole, into ordinary households may well depend on smartphone makers, whether you like it or not.

In a sense, fitting large models onto smartphones is closer to what Microsoft has recently emphasized as "small language models." The parameter count usually cannot exceed about 10 billion, or the phone's memory cannot hold the weights, which also means such a model can only be trained for specific tasks, or trained to a certain output level and then stopped. For the vast majority of people, that is enough. Mistral AI, the Paris-based startup that has recently shot to prominence, is exactly such a small-model company.
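The ~10-billion-parameter ceiling falls out of simple arithmetic on weight storage. The sketch below computes the memory needed for weights alone at different precisions; the figures ignore the KV cache and runtime overhead, so real footprints are somewhat larger.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in decimal GB.

    Ignores the KV cache, activations, and runtime overhead.
    """
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# A 7B model at fp16 needs ~14 GB of RAM -- too much for a phone.
# Quantized to 4 bits it shrinks to ~3.5 GB, which fits alongside the
# OS in a 12-16 GB handset; a 1B model at 4 bits is only ~0.5 GB.
for params, bits in [(7, 16), (7, 4), (1, 4)]:
    print(f"{params}B @ {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")
```

This is why an on-device model is constrained to the single-digit-billion range, and why quantization is part of every phone deployment story.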

According to the figures vivo has announced, the Lanxin family distills a 170-billion-parameter cloud model down to a 7-billion-parameter model that runs inference on both the cloud and the device, while a 1-billion-parameter model runs inference entirely on-device. This is exactly what Qualcomm, MediaTek, Intel, and AMD keep tinkering with to escape NVIDIA's curse: if models are not embedded in smartphones and PCs, these chipmakers have no chance. But the models that fit in smartphones and PCs are, inevitably, small models.
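The "distillation" step mentioned above is commonly done by training the small student model to match the large teacher's temperature-softened output distribution. vivo has not published its exact recipe, so the sketch below just shows the standard KL-divergence distillation objective on one set of logits.

```python
import math


def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions --
    the standard knowledge-distillation objective. In training, this
    is minimized over the student's parameters; here we just evaluate it."""
    p = softmax(teacher_logits, T)  # soft targets from the big model
    q = softmax(student_logits, T)  # small model's predictions
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))


# A student whose logits track the teacher's incurs a much smaller
# loss than one whose preferences are reversed.
aligned = distillation_loss([4.0, 1.0, 0.5], [3.9, 1.1, 0.4])
mismatched = distillation_loss([4.0, 1.0, 0.5], [0.5, 1.0, 4.0])
```

The softened targets carry more information than hard labels (the teacher's relative confidence across wrong answers), which is what lets a 7B student inherit behavior from a 170B teacher.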

Small models have their advantages: they focus on doing a few things well, avoid producing sprawling text and code, shine in a few spots, and are average elsewhere. Mistral AI's models, for example, are stronger at code than ChatGPT. Likewise, the Lanxin large model summarizes and processes local documents more precisely than other models, which makes managing personal documents and schedules on the phone more efficient. Drawing, writing, and search are all there too, just not outstanding. But so what?

In China's current discussion of the future of generative AI, there is a strange phenomenon: the models that get lavish praise and hype never land, while the people living on the ground have no interest in AI. Most people have never used ChatGPT; as for Wenxin Yiyan, Tongyi Qianwen, and ChatGLM, they may have heard of them or tried them occasionally without seeing any essential change these things bring to their own lives. Meanwhile, those obsessed with parameter counts, scale, and benchmark results post all their achievements on Hugging Face and GitHub, hardly ever promote them to ordinary people, and have no interest in ordinary people. This mutual indifference between AI developers and users will be hard to change in the short term.

But when smartphone makers build large language models, things may be different, chiefly because users are interested. When the model is embedded in the underlying operating system and can be summoned and invoked at any moment, the way the Lanxin large model is embedded in OriginOS 4, users will find themselves needing it: needing its help, probing its potential, even needing its company. It may not be a general-purpose large model; it may be a small one. But it understands its user, knows the data on the device, learns the user's habits, protects their privacy, and can manage schedules, pull up takeaway menus, summarize documents, pick out photos, and handle basic writing. For most people, it is the "sufficient" and "trustworthy" AI.

Popularizing large language models will certainly not happen through AI programming alone, nor will SOTA-refreshing technical breakthroughs by themselves benefit the majority of humanity. Just as you only know whether a pair of shoes fits by putting them on, you only know whether a model suits you by using it. Recently I have been consciously "de-ChatGPT-izing": reading papers and documents with Kimi Chat, doing desk work with Wenxin Yiyan and ChatGLM, and using vivo's Lanxin large model as a personal assistant, for no reason other than that it "fits." You don't expect it to surpass ChatGPT across the board, but I genuinely need a "large model," or "small model," that runs on a phone, protects personal privacy and data security, and scores a passing grade in every subject.

Large language models are meant for people to use, not to show off.
