89% of students use ChatGPT for homework, and a student of Chinese descent developed an AI tool for catching AI-written text.

Image Source: Generated by Wujie AI

As large language models have spread, AI has become a cheating tool for many students. According to reports, as many as 89% of American students admit to using ChatGPT to complete homework, 48% use it for tests, 53% use it to write papers, and 22% use it to generate paper outlines.

To curb the abuse of AI, a young developer of Chinese descent has built an AI-text detection tool called GPTZero, designed specifically to identify whether a piece of text was written by AI or by a human.

The developer is Edward Tian, a senior at Princeton University who majors in computer science and minors in journalism. As ChatGPT became popular, he increasingly felt that "when the text is not written by humans, humans should have the right to know." So he stayed up late coding over the winter break and built GPTZero.

GPTZero was warmly welcomed by teachers as soon as it was released, with over 30,000 people trying it out within a week. GPTZero was later updated to identify mixed "human + AI" text and to support batch file import, and it reached 400,000 visits within a month.

As a result, he also became a public enemy of students, and some even called him a "snitch's lackey."

According to the official website, "GPTZero is the gold standard for detecting artificial intelligence and can detect ChatGPT, GPT4, Bard, LLaMa, and other artificial intelligence models after training."

Specific Usage:

Step 1: Open the GPTZero official website. No registration or VPN is needed, and it's free!

(https://gptzero.me/)

Step 2: Copy and paste 250 to 5,000 characters of text into the input box (longer texts require a paid plan) and click "Check Origin" to see the result; or click "Upload file" to upload a file for detection. Supported formats include pdf, doc, docx, and txt.

Note that the free version of GPTZero only supports text of no more than 5,000 characters. If you want to run plagiarism scans, batch-process more files, and so on, you need to pay.

Step 3: GPTZero analyzes the text and provides a detection score. The higher the score, the more likely the text was generated by AI. GPTZero also highlights the sentences it thinks may have been generated by AI.
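The length limits in Step 2 are easy to check before pasting. The Python sketch below is only an illustration: the function name check_free_tier and the returned labels are hypothetical, and the only facts it relies on are the 250 and 5,000 character limits quoted above.

```python
# Limits quoted in Step 2 for the free version of GPTZero.
MIN_CHARS = 250
MAX_FREE_CHARS = 5000


def check_free_tier(text):
    """Return a label saying whether text fits the free-version limits.

    The labels are hypothetical and exist only for this illustration.
    """
    length = len(text)
    if length < MIN_CHARS:
        return "too_short"
    if length > MAX_FREE_CHARS:
        return "needs_paid_plan"
    return "ok"


if __name__ == "__main__":
    sample = "An example sentence that is far too short to be checked."
    print(len(sample), check_free_tier(sample))
```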

OK, let's evaluate how well it works.

Evaluating GPTZero: Excellent Performance!

Round 1: Successfully Identifying Content Generated by GPT-4 and Bard

The editor first used GPT-4 to generate a long paragraph of text, and then fed it to GPTZero:

GPTZero judged that there was a 99% probability that the text was written by AI. It also highlighted the sentences that may have been written by AI: according to GPTZero, all 8 sentences in the text were AI-generated.

In this round, GPTZero got it right!

To rule out a fluke, the editor also tested it with Bard.

The news that "Apple was downgraded twice in a week, and its market value evaporated by $165.1 billion in the first three trading days of 2024" has attracted attention. The editor had Bard generate a news article on it, and then copied and pasted the article into GPTZero.

GPTZero judged that there was a 91% probability that it was written by AI, and highlighted the entire article to indicate that all 15 sentences may have been generated by AI.

GPTZero answered correctly again.

It is this good at recognizing English, but what about Chinese?

The editor generated a piece of content using Wenxin Yiyan and copied it into GPTZero:

This time, GPTZero "failed." The text was clearly written by AI, but GPTZero judged that only 4 of the 7 sentences might have been AI-generated, and concluded that the text was co-written by humans and AI.

It seems that GPTZero is stronger at recognizing English text.

Round 2: Identifying Content Written by Humans, Occasionally Failing

The editor found a headline story from The Washington Post, "Trump is promising to reduce inflation. His plans may reignite it," and selected the first 2 paragraphs:

GPTZero concluded that the article was written by a person, with a 0% probability of being written by AI. The answer was correct.

The editor also found a news report on the BBC official website: "US budget: Spending deal reached as shutdown deadline looms".

When the editor pasted the entire text into GPTZero, GPTZero indicated that the article was written by humans, with all 24 sentences being human-written. This answer is correct.

However, when the editor pasted only the last 5 paragraphs of this article, the result indicated that it was co-written by humans and AI, with 3 out of 5 sentences being AI-generated. That is plainly wrong.

What's the Magic Behind GPTZero?

GPTZero claims an accuracy of 85% for AI text and a high accuracy of 99% for human text. How does it achieve this? It relies on two key indicators: "Perplexity" and "Burstiness".

"Perplexity" measures how random, or unpredictable, the sentences in a text are. Compared with varied human expression, AI trained on vast amounts of text has settled into predictable generation patterns. When GPTZero receives text it is not familiar with, it becomes "perplexed," and high perplexity points toward a human author.

For text several hundred words long, GPTZero calculates the "total perplexity of the text," the "average perplexity of all sentences," and the "perplexity of each sentence," and combines them into a composite score. When this score is greater than 85, the text is likely to be human-written.

Another key indicator is burstiness, which measures how suddenly a sentence or word appears in the text, reflecting variation in sentence length and structure.
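To make the three perplexity figures concrete, here is a minimal Python sketch. It is not GPTZero's implementation: a real detector would score text with a neural language model, while this sketch assumes a tiny add-one-smoothed unigram model built from a reference text. The names UnigramModel, perplexity, and perplexity_report are hypothetical. The article does not say how GPTZero combines the three figures into its composite score, so the sketch only reports them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


class UnigramModel:
    """Toy stand-in for a language model (an assumption of this sketch)."""

    def __init__(self, reference_text):
        tokens = tokenize(reference_text)
        self.counts = Counter(tokens)
        self.total = len(tokens)
        self.vocab_size = len(self.counts) + 1  # one extra slot for unseen words

    def prob(self, word):
        # Add-one smoothing so unseen words keep a non-zero probability.
        return (self.counts[word] + 1) / (self.total + self.vocab_size)


def perplexity(model, text):
    """exp of the average negative log-probability per token."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no word tokens")
    log_sum = sum(math.log(model.prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def perplexity_report(model, text):
    """Return the total, average per-sentence, and per-sentence perplexity."""
    total = perplexity(model, text)
    sentences = [s for s in split_sentences(text) if tokenize(s)]
    per_sentence = [perplexity(model, s) for s in sentences]
    return {
        "total": total,
        "average": sum(per_sentence) / len(per_sentence),
        "per_sentence": per_sentence,
    }


if __name__ == "__main__":
    model = UnigramModel("the cat sat on the mat. the dog sat on the rug.")
    print(perplexity_report(model, "The cat sat. Quantum zebra harmonics."))
```

In this toy setup, a sentence made of words the model has seen often gets a lower perplexity than a sentence of unseen words, which is the same intuition the article describes: familiar, predictable text scores low, and surprising text scores high.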

Humans tend to have a more dynamic writing style, resulting in relatively varied text structures. AI, on the other hand, tends to use a more consistent structure to generate text. Additionally, large models use the same rules to predict the next word, resulting in low burstiness.
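The idea of burstiness as variation in sentence length can be sketched in a few lines of Python. The article does not give GPTZero's formula, so this sketch assumes one simple proxy: the coefficient of variation (standard deviation divided by mean) of sentence lengths in words. The function names are hypothetical.

```python
import re
import statistics


def sentence_lengths(text):
    """Return the length in words of each sentence in the text."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [len(s.split()) for s in sentences if s.split()]


def burstiness(text):
    """Coefficient of variation of sentence lengths (an assumed proxy).

    0.0 means every sentence has the same length; larger values mean
    the lengths vary more from sentence to sentence.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return statistics.pstdev(lengths) / statistics.mean(lengths)


if __name__ == "__main__":
    uniform = "The cat sat down. The dog ran off. The bird flew up."
    varied = (
        "Stop. The storm rolled in over the hills before anyone "
        "had time to close the windows. We ran."
    )
    print(burstiness(uniform), burstiness(varied))
```

Under this proxy, text with evenly sized sentences scores 0, while text that mixes very short and very long sentences scores higher, matching the article's claim that human writing tends to be burstier.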

However, GPTZero can also "make mistakes".

For example, when an internet user fed the US Constitution to GPTZero, it claimed that the US Constitution was generated by AI.

Many students have also suffered: papers they worked hard to write were flagged by GPTZero as AI-generated, causing them great distress.

GPTZero developer Tian also admits that GPTZero is not 100% accurate and may produce false positives or false negatives, because the perplexity and burstiness indicators cannot fully capture the complexity and style of human or AI writing.
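A little arithmetic shows why even the claimed accuracy leaves room for such errors. The Python sketch below assumes that "99% accuracy for human text" means 99% of human-written texts are labeled human, and that "85% accuracy for AI text" means 85% of AI texts are labeled AI. The essay counts are hypothetical and chosen only for illustration.

```python
def expected_errors(n_human, n_ai, human_accuracy=0.99, ai_accuracy=0.85):
    """Expected misclassifications under the accuracy figures GPTZero claims.

    Returns (false_positives, false_negatives):
    - false positives: human-written texts flagged as AI
    - false negatives: AI-written texts passed as human
    """
    false_positives = n_human * (1 - human_accuracy)
    false_negatives = n_ai * (1 - ai_accuracy)
    return false_positives, false_negatives


if __name__ == "__main__":
    # Hypothetical batch: 200 human-written essays and 100 AI-written ones.
    fp, fn = expected_errors(n_human=200, n_ai=100)
    print(f"human essays wrongly flagged: about {fp:.0f}")
    print(f"AI essays missed: about {fn:.0f}")
```

With these assumed numbers, roughly 2 honest students would be wrongly flagged and roughly 15 AI-written essays would slip through, which is why a detection score is better treated as a hint than as proof.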

Other Similar Applications

In addition to GPTZero, there are other similar AI detection tools on the market.

1. AI Text Classifier

OpenAI developed its own text detection application called AI Text Classifier. However, it was unreliable for short texts (less than 1000 characters) and occasionally mislabeled even longer texts, so it was discontinued on July 20, 2023, because of its low accuracy.

2. AI Content Detector

Link:

https://writer.com/ai-content-detector/

Like the detection tools above, it analyzes text that users provide. It also supports scanning entire web pages via URL, but it handles only up to 1500 words at a time. It is free to use.

3. Copyleaks AI Content Detector

Link:

https://copyleaks.com/

Copyleaks can differentiate between AI-generated content and human-written content, with an accuracy of over 99%, and supports multiple languages. Users can use it to check different types of content, such as articles, posts, academic papers, and comments, to ensure the originality of the content.

4. Winston

Link:

https://gowinston.ai/

Winston AI is an AI content detection tool that can help check content generated by ChatGPT, GPT-4, Bard, Bing Chat, Claude, and other large language models, with an official claim of 99.6% accuracy. It is free for up to 2000 characters and is aimed at writers, educators, and online publishers.

5. Content at Scale

Link:

https://contentatscale.ai/ai-content-detector/

Content at Scale, launched in September 2022, is the world's fastest-growing AI writing platform for SEO marketers. It can detect content from ChatGPT, GPT-4, and Bard.

GPTZero link: https://gptzero.me/
