Microsoft introduces Kosmos-1, a general-purpose neural network that can solve IQ tests and mathematical equations

Microsoft has introduced Kosmos-1, a neural network that combines various input modes (text, audio, images and video) and is intended to serve as a basis for creating general artificial intelligence. The researchers call the system a “multimodal large language model” (MLLM). The model can:

  • Analyze images;
  • Solve visual puzzles;
  • Recognize text;
  • Pass visual IQ tests with an accuracy of 22-26%;
  • Understand natural-language instructions.

Examples: 1-2: visual explanation; 3-4: answering a question; 5: answering a question about a web page; 6: a simple mathematical equation; 7-8: number recognition.

Microsoft trained Kosmos-1 on data from the web, including excerpts from The Pile (an 800 GB English-language text dataset) and the Common Crawl web archive.

After training, the researchers evaluated the abilities of Kosmos-1 in several tests, namely:

  • Language understanding;
  • Text generation;
  • Text classification without optical character recognition (OCR);
  • Image captioning;
  • Visual question answering;
  • Web page question answering;
  • Image classification.

In many of these tests, Kosmos-1 reportedly outperformed current models.

On the Raven IQ test, however, Kosmos-1 answered correctly in only 22% of cases (26% after fine-tuning).

Examples: 1-2: image captioning; 3-6: answering visual questions; 7-8: recognizing text in an image; 9-11: holding a dialogue.

The researchers plan to increase the size of the model and to integrate speech capabilities. In addition, Kosmos-1 is expected to be opened to developers soon.

Based on media reports.