OpenAI, a company regarded worldwide as a success story in artificial intelligence (AI), recently drew the attention of the global media over the scandalous dismissal and subsequent reinstatement of its CEO, Sam Altman.
The situation raised many questions, and some have compared events at the company to a script from a series like “Game of Thrones”. One possible reason for the scandal is Altman’s attention to other projects, Worldcoin in particular.
However, another theory, tied to the letter “Q”, has drawn the most interest. According to unofficial sources, OpenAI’s chief technology officer, Mira Murati, pointed to an important discovery known as “Q Star” or “Q*” as the main cause of the conflict, which played out without the participation of the chairman of the board of directors, Greg Brockman. In protest, Brockman left the company.
“Q*” may refer to one of two different concepts in AI: Q-learning, or the Q* algorithm from the Maryland Refutation Proof Procedure System (MRPPS).
Theory 1: Q-learning
Q-learning is a reinforcement learning method in which an AI learns by trial and error. This approach lets the AI find optimal solutions on its own, without relying on human intervention, in contrast to OpenAI’s current approach, known as reinforcement learning from human feedback (RLHF).
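To make the trial-and-error idea concrete, here is a minimal sketch of textbook tabular Q-learning on a toy corridor. It says nothing about what OpenAI actually built. The environment, the function names (`train_q_table`, `greedy_policy`) and all parameter values are assumptions made up for illustration.

```python
import random

def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor of n_states cells.

    The agent starts in cell 0 and receives a reward of 1.0 only when it
    reaches the rightmost cell. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                # Explore: try a random action.
                action = rng.randrange(2)
            else:
                # Exploit: take the best-known action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in range(2) if q[state][a] == best])

            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Return the highest-valued action index for every state."""
    return [max(range(2), key=lambda a: row[a]) for row in q]

if __name__ == "__main__":
    table = train_q_table()
    print(greedy_policy(table)[:-1])  # learned actions for the non-goal cells
```

No human labels the moves as good or bad here. The only signal is the reward at the end of the corridor, and the table of values is built up purely from the agent's own attempts.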
Back in May, OpenAI published an article stating that they had “taught the model to achieve a new level in solving mathematical tasks, rewarding each correct step in reasoning, and not just rewarding the correct final answer.” If they used Q-learning or a similar method to achieve that goal, it would open up an entirely new range of problems and situations that ChatGPT could handle naturally.
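The difference between rewarding every reasoning step and rewarding only the final answer can be shown with a toy comparison. This is only an illustration of the idea in the quote, not OpenAI's training code. The function names and the example solution are hypothetical.

```python
def final_answer_reward(final_correct):
    """Single reward that looks only at whether the final answer is right."""
    return 1.0 if final_correct else 0.0

def step_rewards(steps_correct):
    """One reward per reasoning step: 1.0 for a correct step, 0.0 otherwise."""
    return [1.0 if ok else 0.0 for ok in steps_correct]

# Hypothetical solution: the second step contains an error,
# yet the final answer happens to come out right.
steps_correct = [True, False, True]
final_correct = True

print(final_answer_reward(final_correct))  # the flawed step goes unnoticed
print(step_rewards(steps_correct))         # the flawed step is penalized
```

With the final-answer signal alone, the flawed solution looks as good as a fully correct one. The per-step signal tells the model exactly where the reasoning went wrong.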
Theory 2: Q* algorithm from MRPPS
The Q* algorithm is part of the MRPPS system and is a sophisticated method for theorem proving in AI, particularly in question-answering systems. It combines semantic and syntactic information to solve complex problems.
If “Q*” is linked to the Q* algorithm from MRPPS, this could mean significant progress in the deductive and problem-solving abilities of AI.
Thus, while Q-learning is about teaching AI to learn from interaction with its environment, the Q* algorithm is aimed more at improving the deductive abilities of AI. Understanding this difference is key to grasping the potential consequences of OpenAI’s “Q*”. Both approaches have huge potential for the development of AI, but their applications and consequences differ significantly.
Of course, all of this is only speculation: OpenAI has not explained the concept and has neither confirmed nor denied the rumors that Q*, whatever it is, actually exists.