An international team of researchers has created a system capable of conducting scientific experiments independently. The “AI scientist”, as its developers have christened it, demonstrates knowledge and skills comparable to those of a novice graduate student.
Cong Lu of the University of British Columbia, who led the project, described the experiment's unexpected results. According to him, the system showed remarkable creativity in generating scientific hypotheses. However, as with a young researcher, most of its ideas turned out not to be viable. The developers also ran into a number of problems while building the model: the AI had difficulty writing coherent scientific articles and sometimes misinterpreted its results.
Of particular concern was the system's tendency toward “hallucinations”, the generation of false information. Despite clear instructions to use only verified data, the AI still invented facts. The researchers estimated that this happened in fewer than 10% of cases, but even that rate is considered unacceptable for scientific work.
The project brought together academics and specialists from the Tokyo startup Sakana AI. The team published preliminary results of the study on the arXiv preprint server. In the paper, they describe their creation as “the beginning of a new era of scientific discoveries” and “the first integrated system for fully automated scientific research”.
The idea of using AI for scientific research is not new. It dates back to 2020, when Google DeepMind introduced AlphaFold, a system that astonished biologists with its ability to predict the 3D structures of proteins with unprecedented accuracy. Since then, many large corporations have picked up the trend.
The researchers tested their system's capabilities in computer science. The AI studied the large language models that underlie chatbots such as ChatGPT, as well as the diffusion models used in image generators such as DALL-E.
The AI scientist's workflow consists of several stages. First, the system generates hypotheses and rates them on interestingness, novelty and feasibility. It then checks the originality of the ideas against the Semantic Scholar database. After that, the AI uses the Aider programming assistant to run experiments and keep a log of the results. Based on the data obtained, the system can generate ideas for follow-up experiments, which lets it steer the study in a promising direction.
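The first two stages, rating ideas and then filtering out ones that already exist, can be illustrated with a rough Python sketch. This is not the team's code: the `Idea` class, the 1-10 scale, the `min_total` threshold and the `toy_lookup` function are all assumptions made for illustration, and the in-memory lookup merely stands in for a real query to Semantic Scholar.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Idea:
    """A candidate hypothesis with three scores (hypothetical 1-10 scale)."""
    title: str
    interestingness: int
    novelty: int
    feasibility: int

    def total(self) -> int:
        # Assumption: the three criteria are weighted equally.
        return self.interestingness + self.novelty + self.feasibility


def select_ideas(
    ideas: List[Idea],
    is_already_published: Callable[[str], bool],
    min_total: int = 18,  # hypothetical cut-off, not from the article
) -> List[Idea]:
    """Keep well-rated ideas that the literature check does not flag."""
    fresh = [
        idea
        for idea in ideas
        if idea.total() >= min_total and not is_already_published(idea.title)
    ]
    # Best-rated ideas first.
    return sorted(fresh, key=Idea.total, reverse=True)


# Stand-in for a literature database; a real system would query a search service.
known_titles = {"adaptive learning rates for diffusion models"}


def toy_lookup(title: str) -> bool:
    return title.lower() in known_titles


ideas = [
    Idea("Adaptive learning rates for diffusion models", 8, 7, 9),
    Idea("Noise schedules conditioned on token frequency", 7, 8, 6),
    Idea("Train a model with no data", 5, 4, 2),
]

selected = select_ideas(ideas, toy_lookup)
for idea in selected:
    print(idea.title, idea.total())
```

In this toy run the first idea is rejected as already published and the third scores too low, so only the second survives to the experiment stage.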
In the next stage, the model writes a scientific paper, following a template based on the requirements of scientific conferences. Because producing a complete nine-page text in one pass is difficult, the researchers broke the process into many steps. The program writes one section at a time, checking each for repetitions and contradictions. It then queries Semantic Scholar again to find citations and compile a bibliography.
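The section-by-section approach with a repetition check might look roughly like the sketch below. Everything here is hypothetical: the section names, the `toy_writer` function standing in for the language model, and the exact-match comparison of sentences, which is far cruder than whatever check the real system performs.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical paper template; the real conference template is not described here.
SECTIONS = ["Introduction", "Method", "Experiments", "Conclusion"]


def split_sentences(text: str) -> List[str]:
    """Split on sentence-ending punctuation and normalise case."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def find_repetitions(draft: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (first_section, later_section, sentence) for repeated sentences."""
    seen: Dict[str, str] = {}
    repeats: List[Tuple[str, str, str]] = []
    for section, text in draft.items():
        for sentence in split_sentences(text):
            if sentence in seen and seen[sentence] != section:
                repeats.append((seen[sentence], section, sentence))
            else:
                seen.setdefault(sentence, section)
    return repeats


def write_paper(write_section: Callable[[str, Dict[str, str]], str]) -> Dict[str, str]:
    """Build the draft one section at a time, passing earlier sections as context."""
    draft: Dict[str, str] = {}
    for name in SECTIONS:
        draft[name] = write_section(name, dict(draft))
    return draft


def toy_writer(name: str, draft_so_far: Dict[str, str]) -> str:
    # Stand-in for a language model call; deliberately repeats one sentence.
    body = f"This is the {name.lower()} section."
    if name == "Conclusion":
        body += " This is the method section."
    return body


draft = write_paper(toy_writer)
repeats = find_repetitions(draft)
print(repeats)
```

Splitting the work this way keeps each generation step short and leaves a natural point after every section to look for problems before moving on.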