In 2022, OpenAI launched the ChatGPT chatbot, based on GPT-3.5. It was dubbed a “Google killer” and drew heavy criticism because the algorithm generates plausible-sounding text but cannot analyze it well.
On March 14, 2023, OpenAI released the next generation of the model, GPT-4. According to OpenAI, the neural network is 40% more reliable than its predecessor in the information it provides and 82% less likely to produce prohibited content: the company spent six months training the model on ethical and value guidelines.
Microsoft has stated that its Bing search engine already runs on GPT-4. OpenAI says the technology is also being used by Duolingo, Be My Eyes, Stripe, Morgan Stanley, Khan Academy, and the government of Iceland.
Multimodality and hallucinations
The main difference between GPT-4 and GPT-3.5 is multimodality, that is, the ability to process both text and images with a single model. The previous version worked only with text. The amount of text the new model can process has also grown more than eightfold.
“Now, instead of describing a process or a query to GPT-4 in words, you can upload a picture and get a quick result,” says Netology instructor Artem Chistyakov.
The examples presented by the authors of GPT-4 show “an impressive quality of understanding images and answering questions,” says Andrei Savchenko, scientific director of the Sberbank Artificial Intelligence Laboratory. A request can combine text with an explanatory image, such as a graph or an algorithm flowchart, including hand-drawn ones. GPT-4 is also better at standard tasks: answering questions, holding a dialogue that takes the context of earlier questions and answers into account, summarizing text, writing literary texts, and generating program code from a text description, up to building websites and mobile applications.
GPT-4 can describe what is shown in an illustration and even explain the meaning of what it “sees,” including the symbolism and humor in memes. “For example, in the GPT-4 presentation, the AI explained the point of visual jokes (for example, an iPhone with an ancient VGA cable plugged into it). Moreover, after analyzing a photo of the contents of an open refrigerator, it suggested what dishes could be prepared from them,” says Nikolay Sedashov, managing partner of the Spektr analytical agency. According to him, in the near future it will be possible to feed the AI large presentations with infographics and illustrations and receive a condensed summary with the key figures and data. “This function has at least one more important area of application: helping people with visual impairments,” the expert adds.
GPT-4 is also more creative, more reliable on difficult and specialized tasks, and able to follow more complex instructions than previous versions, says Andrey Kuznetsov, executive director of data research at Sber AI. For instance, the model can solve problems that require working with text at the character level, which is considered difficult for language models: it can summarize a long text using only words that begin with the same letter. GPT-4 also handles structured data well and can pick out the important information while discarding unnecessary text, Kuznetsov adds.
The broader development of generative neural networks and the services built on them will change the labor market considerably, adds Alexander Gorny. Many jobs may become unnecessary: “One person using AI can do the work of three,” says Gorny. The release of GPT-4 itself, however, will not affect the labor market, in his opinion: “No one is going to be told, ‘That's it, GPT-4 is here, you are no longer needed.’ On the contrary, competing projects may hire hundreds of engineers to catch up with OpenAI quickly.”
“No one can know for sure whether there will be a wave of layoffs. At first glance, it seems there will be. On the other hand, generative AI can lead to new products and services that were not previously available to people and, therefore, create new jobs for developing them,” says Semyon Budyonny, head of the Design of New Materials group at the AIRI Institute. According to him, a far-sighted approach in a changing world involves not wholesale staff cuts through automation but cultivating a culture of lifelong learning. Education will change the most, Andrei Savchenko believes.
The authors of GPT-4 placed special emphasis on passing certification tests and exams in many fields of knowledge: chemistry, biology, mathematics, history, medicine, and so on. “Online services based on ChatGPT are already appearing that let you look up the answer to a test question,” says Savchenko. This poses a serious challenge that will force a rethink of education and knowledge assessment, as happened when the Internet appeared and educational institutions partly abandoned rote memorization in favor of teaching skills and competencies.
The question of security also remains open, she said: “Will a large company upload the core code of its internal products into the model? Formally, the terms of service allow all transmitted data to be used for training. How does that square with the concept of a trade secret?”
The neural network is also not yet able to replace lawyers, says Moscow Digital School instructor Roman Yankovsky. “For example, the neural network does not understand jurisdiction and may give an irrelevant answer that applies to the wrong country. In addition, access to many documents is restricted, which complicates both training and using the neural network,” he explains.