What is AI drift and why does it make ChatGPT less intelligent?

Whether you have experienced it yourself while using ChatGPT or only read about it, the rumors are true: ChatGPT is getting progressively dumber.

This phenomenon is especially perplexing because generative AI models are refined over time, in part using feedback from users, which should make them more capable as that feedback accumulates.

The answer may lie in a concept called “drift.”

“Drift” refers to large language models (LLMs) behaving in unexpected or unpredictable ways that stray from their original behavior. This may happen because attempts to improve some parts of a complicated AI model cause other parts to perform worse.

Researchers from the University of California, Berkeley, and Stanford University conducted a study to evaluate drift and examine how OpenAI’s popular LLMs, GPT-3.5 (the model behind ChatGPT) and GPT-4 (the model behind Bing Chat and ChatGPT Plus), changed over time.

The study compared how well both LLMs could solve math problems, answer sensitive questions, answer opinion surveys, answer multi-hop knowledge-intensive questions, generate code, answer US Medical License exam questions, and complete visual reasoning tasks, first in March and then in June 2023.

According to the study’s results, GPT-4’s March version outperformed the June version in many instances. The most glaring gap was in basic math prompts, where the March version outperformed the June version in both examples the study presents, (a) and (b).

GPT-4 also worsened at code generation, answering medical exam questions, and answering opinion surveys. All of these instances can be attributed to the drift phenomenon.
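A comparison of this kind, in which two snapshots of a model are scored on the same fixed questions, can be sketched roughly as follows. This is only an illustration, not the researchers’ code: the task names, the answers, and the helper functions `accuracy` and `accuracy_change` are hypothetical, and the numbers are made up rather than taken from the study.

```python
from typing import Dict, List


def accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions that match the reference answers exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one question")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


def accuracy_change(
    old: Dict[str, List[str]],
    new: Dict[str, List[str]],
    refs: Dict[str, List[str]],
) -> Dict[str, float]:
    """Per-task accuracy of the newer snapshot minus that of the older one."""
    return {
        task: accuracy(new[task], refs[task]) - accuracy(old[task], refs[task])
        for task in refs
    }


# Made-up answers for two hypothetical snapshots of the same model.
references = {"math": ["yes", "no", "yes", "yes"], "code_generation": ["pass", "pass"]}
march = {"math": ["yes", "no", "yes", "no"], "code_generation": ["pass", "fail"]}
june = {"math": ["no", "no", "no", "no"], "code_generation": ["fail", "fail"]}

changes = accuracy_change(march, june, references)
for task, delta in changes.items():
    # A negative value means the newer snapshot scored worse on that task.
    print(f"{task}: {delta:+.2f}")
```

Keeping the questions and the scoring rule fixed is what makes the two snapshots comparable; any difference in the scores then reflects a change in the model rather than in the test.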

Regarding the drift, one of the researchers, James Zou, told the Wall Street Journal, “We had the suspicion it could happen here, but we were very surprised at how fast the drift is happening.”

Despite the deteriorating intelligence, there were also some instances of improvement in both GPT-4 and GPT-3.5.

As a result, the researchers encourage users to keep using LLMs, but to do so with caution and to evaluate them continually.

What’s happening to ChatGPT’s intelligence?

Whether you’ve personally encountered the issue or heard about it from others, the rumors surrounding ChatGPT’s deteriorating intelligence are indeed true. It’s a puzzling situation, given that generative AI models are refined over time, in part using feedback from users, which should theoretically make them smarter with more usage. However, there’s a possible explanation for this phenomenon, known as “drift.”

What is “drift”? It refers to large language models (LLMs), such as GPT-3.5 and GPT-4, behaving unexpectedly or unpredictably and deviating from their original behavior. AI models are complex enough that improving them is a delicate balance: attempts to improve certain aspects may inadvertently cause other parts to decline.

To better understand drift, researchers from the University of California, Berkeley, and Stanford University conducted a study of how ChatGPT’s LLMs changed over time. Specifically, they examined GPT-3.5, the LLM behind ChatGPT, and GPT-4, which powers Bing Chat and ChatGPT Plus.

The study involved comparing the two LLMs’ capabilities in various tasks, ranging from solving math problems to answering opinion surveys. The researchers conducted the assessments in both March and June to observe any changes.

The results of the study show a decline in GPT-4’s performance from its March version to its June version. The most significant differences were observed in basic math prompts, where the March version outperformed the June version in both examples the study presents, (a) and (b). Moreover, GPT-4’s performance worsened in code generation, answering medical exam questions, and responding to opinion surveys. All these instances can be attributed to the drift phenomenon.

James Zou, one of the researchers involved in the study, expressed his surprise at the rapid pace of the drift, stating, “We had the suspicion it could happen here, but we were very surprised at how fast the drift is happening.” This accelerated decline in intelligence raises concerns for users relying on AI language models for various tasks.

While the findings highlight the challenges posed by drift, it’s worth noting that there were also instances of improvement in both GPT-4 and GPT-3.5. This indicates that the changes are not uniformly negative and that the models can still improve in places.

Nonetheless, users are advised to exercise caution when using LLMs like ChatGPT, to acknowledge their limitations, and to evaluate their outputs consistently. AI language models offer great potential, but they require ongoing monitoring and fine-tuning to perform well. So, despite the drift, the researchers encourage users to continue using LLMs, armed with the knowledge to make informed decisions about their outputs.
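One way to act on that advice is to keep a small, fixed set of prompts with known answers and re-run it whenever the model may have changed. The sketch below assumes a hypothetical `ask` function that sends a prompt to the model and returns its reply; here it is replaced by two stand-in functions so the example runs on its own. The prompts, the 0.05 tolerance, and all function names are illustrative assumptions.

```python
from typing import Callable, List, Tuple

# Hypothetical pinned test suite: (prompt, expected answer) pairs.
SUITE: List[Tuple[str, str]] = [
    ("What is 2 + 2?", "4"),
    ("What is 10 / 5?", "2"),
    ("What is 3 * 3?", "9"),
    ("What is 7 - 4?", "3"),
]


def run_suite(ask: Callable[[str], str], suite: List[Tuple[str, str]]) -> float:
    """Return the fraction of prompts that `ask` answers correctly."""
    correct = sum(ask(prompt).strip() == expected for prompt, expected in suite)
    return correct / len(suite)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when accuracy fell by more than `tolerance` from the baseline."""
    return (baseline - current) > tolerance


# Stand-ins for calls to a real model; a real check would query the model here.
def earlier_model(prompt: str) -> str:
    return dict(SUITE)[prompt]  # answers every prompt in the suite correctly


def later_model(prompt: str) -> str:
    return "4"  # always gives the same reply, so only the first prompt is right


baseline = run_suite(earlier_model, SUITE)
current = run_suite(later_model, SUITE)

if has_regressed(baseline, current):
    print(f"Possible drift: accuracy fell from {baseline:.2f} to {current:.2f}")
```

The tolerance is there because model replies can vary from run to run; a small dip may be noise, while a large one is worth investigating before relying on the model for that task.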