Race condition in the Linux kernel garbage collector that can lead to privilege escalation

Jann Horn of Google Project Zero, who previously uncovered the Spectre and Meltdown vulnerabilities, has published a technique for exploiting a vulnerability (CVE-2021-4083) in the Linux kernel garbage collector. The vulnerability is caused by a race condition in the cleanup of file descriptors for UNIX sockets and could allow a local unprivileged user to execute code at the kernel level.

The problem is interesting because the time window in which the race condition appears was considered too small for building real exploits. The author of the study showed that even vulnerabilities initially viewed with such skepticism can become a source of real attacks if the exploit writer has the necessary skills and time. Jann Horn demonstrated how, with delicate manipulation, the race condition that arises when close() and fget() are called simultaneously can be turned into a fully exploitable use-after-free vulnerability, resulting in access to an already freed data structure inside the kernel.

The race condition arises while a file descriptor is being closed, when close() and fget() are called at the same time. The close() call can complete before fget() has finished, which misleads the garbage collector: according to the refcount counter, the file structure has no external references, yet it is still attached to the file descriptor. In other words, the garbage collector concludes that it has exclusive access to the structure, while in fact, for a short period of time, the entry remaining in the file descriptor table still points to the structure being freed.
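The interleaving described above can be replayed in a small toy model. The following Python sketch is not kernel code and makes no claim about the actual implementation: the File class, the step names, descriptor number 3 and the "refcount equals in-flight count" test are simplifying assumptions chosen to illustrate how a reference taken between a pointer load and a teardown decision can end up pointing at an object the collector already considers its own.

```python
from dataclasses import dataclass


@dataclass
class File:
    """Toy stand-in for a kernel file structure (hypothetical model)."""
    refcount: int            # all references, including in-flight ones
    inflight: int            # references held by descriptors queued in a socket
    torn_down: bool = False  # set once the collector has cleaned the object up


def gc_thinks_exclusive(f):
    # Model assumption: the collector treats the object as unreachable
    # when no references exist besides the in-flight ones.
    return f.refcount == f.inflight


def run(order):
    """Replay one interleaving of the named steps and report the outcome."""
    f = File(refcount=2, inflight=1)  # one ref from the fd table, one in flight
    fd_table = {3: f}                 # fd 3 is a hypothetical descriptor number
    seen = None                       # pointer fget() loaded from the table
    result = None                     # what fget() hands back to its caller

    for step in order:
        if step == "fget_load":       # fget(): read the pointer from the fd table
            seen = fd_table.get(3)
        elif step == "close":         # close(): unlink the fd, drop its reference
            if fd_table.pop(3, None) is not None:
                f.refcount -= 1
        elif step == "gc":            # collector: tear down if apparently unreachable
            if gc_thinks_exclusive(f):
                f.torn_down = True
        elif step == "fget_ref":      # fget(): take a reference on the loaded pointer
            if seen is not None and seen.refcount > 0:
                seen.refcount += 1
                result = seen

    returned = result is not None
    return {"returned": returned,
            "use_after_teardown": returned and result.torn_down}


if __name__ == "__main__":
    print(run(["fget_load", "fget_ref", "close", "gc"]))  # fget finishes first
    print(run(["fget_load", "close", "gc", "fget_ref"]))  # close and gc land in the window
```

In this model only the second ordering, where close() and the collector both run between the two halves of fget(), hands the caller an object that has already been torn down. The real window between those two halves is tiny, which is why the techniques described below focus on stretching it.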

Several tricks were used to increase the likelihood of hitting the race condition, which raised the probability of successful exploitation to 30% once system-specific optimizations were applied. For example, to lengthen the access to the structure holding the file descriptors by several hundred nanoseconds, the data was evicted from the processor cache by polluting the cache with activity on another CPU core, so that the structure had to be fetched from memory rather than from the fast CPU cache.

The second important technique was to use interrupts generated by a hardware timer to lengthen the race window. The timing was chosen so that the interrupt handler fired while the race condition was occurring and suspended execution of the code for some time. To delay the return of control even further, about 50 thousand waitqueue entries were created with epoll, all of which had to be walked in the interrupt handler.
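The way epoll can multiply wait queue entries on a single object can be sketched from user space. The example below is a harmless illustration under stated assumptions, not the published exploit: it uses an ordinary pipe as the watched object (the article does not say which object was used), the function names are hypothetical, and the counts are kept small to stay within default descriptor limits. It relies on the Linux-only select.epoll interface and on the fact that each duplicated descriptor can be registered separately with each epoll instance.

```python
import os
import select


def build_wait_entries(n_dups, n_epolls):
    """Register n_dups duplicates of one pipe end with n_epolls epoll instances."""
    read_end, write_end = os.pipe()
    dups = [os.dup(read_end) for _ in range(n_dups)]
    epolls = [select.epoll() for _ in range(n_epolls)]
    registrations = 0
    for ep in epolls:
        for fd in dups:
            # Each (epoll instance, descriptor) pair is a separate registration
            # on the same underlying pipe.
            ep.register(fd, select.EPOLLIN)
            registrations += 1
    return read_end, write_end, dups, epolls, registrations


def demo(n_dups=8, n_epolls=4):
    """Return (registrations made, ready events reported after one write)."""
    read_end, write_end, dups, epolls, registrations = build_wait_entries(
        n_dups, n_epolls)
    try:
        os.write(write_end, b"x")  # a single event on the shared pipe
        # Every registration now reports the pipe as readable.
        woken = sum(len(ep.poll(0)) for ep in epolls)
    finally:
        for ep in epolls:
            ep.close()
        for fd in dups + [read_end, write_end]:
            os.close(fd)
    return registrations, woken


if __name__ == "__main__":
    print(demo())
```

The number of registrations grows as the product of the two parameters, so a large total can be reached with comparatively few descriptors. In this sketch the wake-up is triggered by a write() call, whereas in the attack described above the walk over the entries happens inside the timer interrupt handler, which is what keeps the interrupted code suspended for longer.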
