UCLA researchers find GPT-3 solves 80% of reasoning problems, surpassing humans’ 60%
According to a recent study, ChatGPT demonstrates problem-solving abilities that match or exceed those of undergraduate students. Researchers at the University of California, Los Angeles found that GPT-3, the model underlying the chatbot, performed at a level comparable to US college undergraduates on reasoning problems similar to those found in intelligence tests or exams like the SAT. The psychologists converted complex arrays of shapes into a text format the model could process, confirmed that the questions were entirely new to the model, and then tested its ability to predict the next item in each sequence.
The researchers gave the same problems to 40 UCLA undergraduates and to GPT-3. The model correctly solved 80% of the problems, well above the human participants’ average score of just under 60%.
The researchers also gave the model SAT “analogy” questions, which present pairs of words linked in some way, that they believed were not available on the internet and therefore absent from its extensive training data. When the AI’s performance was compared with college applicants’ SAT scores, it outperformed the average human score.
However, in another test, the researchers asked both the model and the student volunteers to match a passage of prose with a different short story conveying the same meaning. On this test, GPT-3 did not perform as well as the students, although the study, published in Nature Human Behaviour, found that GPT-4, the improved successor to GPT-3, outperformed its predecessor on the same task.
According to the study, GPT-3 demonstrated a remarkably strong ability to recognize patterns and infer relationships, often reaching or exceeding human capabilities in various scenarios.
The lead author of the study, Taylor Webb, clarified that the model behind ChatGPT has not reached the level of artificial general intelligence or human-level intelligence. It faced challenges with tasks involving social interactions, mathematical reasoning, and problems requiring an understanding of physical space, like determining the best tools for transferring sweets between bowls. Nevertheless, the technology has shown significant progress.
Webb, a postdoctoral researcher in psychology at UCLA, emphasized that GPT-3 is not yet fully equipped with general human-level intelligence, but it has undoubtedly shown progress in a specific domain.
The UCLA researchers acknowledged that due to limited access to the internal mechanisms of GPT-3, developed by OpenAI in San Francisco, they were unable to ascertain how the model’s reasoning abilities function and whether it resembles human thinking or presents a novel form of intelligence.
Keith Holyoak, a psychology professor at UCLA, remarked that GPT-3 might exhibit thinking patterns similar to humans, but it diverges significantly in its learning process since humans do not acquire knowledge by ingesting the entire internet. The researchers are eager to determine whether the model operates similarly to humans or represents a genuinely new form of artificial intelligence, which would be remarkable in its own right.