New Study: ChatGPT is Getting Dumber
- By Paul Mah
- August 08, 2023
ChatGPT usage appears to be losing some momentum, even as users complain about a dip in output quality from the generative AI chatbot that has taken the world by storm.
After months of dizzying growth, traffic dropped by almost 10% last month compared to the previous month, according to SimilarWeb.
The dip comes as paid users complain that the output of the top-end GPT-4 model is getting worse, even if they agree that it generates output faster than before. GPT-4 is available only to paid users.
The complaints can be traced back to May, when Roblox product lead Peter Yang wrote on Twitter that “the writing quality has gone down in my opinion”.
As reported by Business Insider, another user, Christi Kennedy, wrote on an OpenAI developer forum that: "If you aren't actually pushing it with what it could do previously, you wouldn't notice. Yet if you are really using it fully, you see it is obviously much dumber."
This is perplexing as generative AI models are supposed to get smarter by leveraging a continuous stream of user input to train and improve.
A possible explanation would be that OpenAI is creating several smaller GPT-4 models that act like the original, large GPT-4 model, but are less expensive to run. There might be models specializing in different areas, with a coordinator sending queries to the right model.
The technical reason for this? Better, cheaper, and faster responses. This distributed approach isn’t a new idea, and it dovetails with purported “leaks” that circulated a couple of months ago, claiming that GPT-4 is actually made up of multiple smaller models.
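To make the coordinator idea concrete, here is a minimal Python sketch of how a router might send each query to a specialized model. It is purely illustrative and does not describe how OpenAI's systems work: the specialist functions, the keyword table, and the names `route` and `answer` are all hypothetical, and a real coordinator would more likely use a learned classifier than keyword matching.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for smaller specialist models.
def math_model(query: str) -> str:
    return f"[math specialist] {query}"

def code_model(query: str) -> str:
    return f"[code specialist] {query}"

def general_model(query: str) -> str:
    return f"[general model] {query}"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "math": math_model,
    "code": code_model,
    "general": general_model,
}

# Assumed keyword table; chosen only to keep the example short.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "math": ("sum", "prime", "equation"),
    "code": ("python", "function", "bug"),
}

def route(query: str) -> str:
    """Return the name of the specialist that should handle the query."""
    text = query.lower()
    for name, words in KEYWORDS.items():
        if any(word in text for word in words):
            return name
    return "general"  # fall back when no specialist matches

def answer(query: str) -> str:
    """Coordinator: pick a specialist, then forward the query to it."""
    return SPECIALISTS[route(query)](query)
```

Calling `answer("Is 97 a prime number?")` would be handled by the math stand-in, while a query that matches no keyword falls through to the general model.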
For now, the poorer performance appears to be confirmed by a new research paper that tracked the performance of GPT-3.5 and GPT-4 on a predetermined list of tasks ranging from math problems and coding to knowledge-intensive questions.
In the report “How is ChatGPT’s behavior changing over time?”, published last week, the authors evaluated the behavior of the March 2023 and June 2023 versions of GPT-3.5 and GPT-4.
“We find that the performance and behavior of both GPT-3.5 and GPT-4 varied significantly across these two releases and that their performance on some tasks have gotten substantially worse over time, while they have improved on other problems.”
The authors concluded that their findings demonstrate that the behavior of both GPT-3.5 and GPT-4 has varied significantly over a relatively short period of time. They recommend that users or companies that rely on LLM services in their workflows implement monitoring to track the models' behavior over time.
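As a rough illustration of what such monitoring could look like, the Python sketch below scores a model on a fixed set of questions and flags a drop against an earlier baseline. It is a simplified sketch, not the paper's method: the benchmark questions, the 5-point tolerance, and the two stub "snapshots" are assumptions made up for this example, and in practice the `ask` callable would wrap a call to whichever LLM service is in use.

```python
from typing import Callable, List, Tuple

# Hypothetical fixed benchmark: (prompt, expected answer) pairs.
BENCHMARK: List[Tuple[str, str]] = [
    ("Is 97 a prime number? Answer yes or no.", "yes"),
    ("Is 15 a prime number? Answer yes or no.", "no"),
    ("What is 12 * 12?", "144"),
    ("What is the capital of France?", "paris"),
]

def accuracy(ask: Callable[[str], str], benchmark: List[Tuple[str, str]]) -> float:
    """Fraction of prompts answered with exactly the expected text."""
    correct = sum(
        1 for prompt, expected in benchmark
        if ask(prompt).strip().lower() == expected
    )
    return correct / len(benchmark)

def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when accuracy fell by more than the assumed tolerance."""
    return baseline - current > tolerance

# Stub snapshots standing in for two releases of the same service.
def march_snapshot(prompt: str) -> str:
    return dict(BENCHMARK)[prompt]  # answers every question correctly

def june_snapshot(prompt: str) -> str:
    # Invented behavior change: answers "no" to every prime question.
    return "no" if "prime" in prompt else dict(BENCHMARK)[prompt]

baseline_score = accuracy(march_snapshot, BENCHMARK)
current_score = accuracy(june_snapshot, BENCHMARK)
if has_regressed(baseline_score, current_score):
    print(f"Accuracy dropped from {baseline_score:.0%} to {current_score:.0%}")
```

Running the same fixed benchmark on a schedule and storing the scores gives a team a simple record of when a service's behavior shifted.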
The paper can be accessed here (pdf).
Paul Mah is the editor of DSAITrends. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose. You can reach him at [email protected].
Image credit: iStockphoto/francescoch