Linux 7.0 Bug Halves PostgreSQL Performance

An Amazon engineer has identified a regression in the Linux 7.0 kernel, which is scheduled for release on April 13. A change to the task scheduler's settings significantly reduced throughput and responsiveness when running the PostgreSQL DBMS on ARM64 systems. The result of the pgbench “simple-update” test fell from 98565 to 50751 under the 7.0 kernel, a drop of about 48%.
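As a quick check on the "halves" in the headline, the relative drop can be computed from the two pgbench figures quoted above. This is a minimal sketch; the function name `relative_drop` is hypothetical and only the two numbers come from the report.

```python
def relative_drop(before, after):
    """Return the fractional decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before


# pgbench "simple-update" results quoted in the article.
baseline = 98565   # before the scheduler change
regressed = 50751  # under the 7.0 kernel

drop = relative_drop(baseline, regressed)
print(f"throughput fell by {drop:.1%}")
print(f"remaining throughput: {regressed / baseline:.1%} of baseline")
```

The result is slightly under one half, so "halves" is a fair approximation.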

The slowdown was traced to a change in the scheduler's default preemption mode from PREEMPT_NONE to PREEMPT_LAZY on architectures that support it. With the new default, PostgreSQL spent 55% of CPU time in user space inside s_lock() calls. The proposed fix restores PREEMPT_NONE as the default and removes its dependency on the ARCH_NO_PREEMPT setting.
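To see which preemption model a given kernel was built with, one option is to scan its build configuration for the corresponding Kconfig symbols. The sketch below parses configuration text only; the helper name `preempt_model` is hypothetical, and where the configuration file lives (many distributions ship it under /boot) is an assumption that varies by system. On kernels built with dynamic preemption, the mode chosen at boot may differ from the build-time default.

```python
import re

# Kconfig symbols for the two models named in the article, plus the other
# preemption models found in mainline kernels.
PREEMPT_MODELS = ("PREEMPT_NONE", "PREEMPT_VOLUNTARY", "PREEMPT", "PREEMPT_LAZY")


def preempt_model(config_text):
    """Return the preemption model enabled in a kernel .config, or None."""
    for line in config_text.splitlines():
        # Only "CONFIG_...=y" lines count; "# CONFIG_... is not set" is skipped.
        match = re.fullmatch(r"CONFIG_(PREEMPT(?:_[A-Z]+)?)=y", line.strip())
        if match and match.group(1) in PREEMPT_MODELS:
            return match.group(1)
    return None


# Illustrative fragment, not taken from a real system.
sample = """\
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_LAZY=y
CONFIG_PREEMPT_COUNT=y
"""
print(preempt_model(sample))  # prints PREEMPT_LAZY
```

Helper symbols such as CONFIG_PREEMPT_COUNT are deliberately ignored, since they do not select a preemption model.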

Peter Zijlstra, who authored the change behind the regression and maintains the kernel's task scheduler and locking subsystems, stated that a fix must also be made in the PostgreSQL code. To mitigate the performance impact, he recommended using a recently added kernel extension that reduces the likelihood of lock-holder preemption.

The decision now rests with Linus Torvalds, who values both kernel performance and user-space compatibility. The 7.0 kernel is in its final testing phase before release, so reverting the scheduler settings risks introducing other regressions, while keeping the current settings could drastically reduce the performance of a widely used DBMS.
