Cloudflare Outage: 25-Minute Crash Blamed on Lua Code

Yesterday, the Cloudflare content delivery network, which handles about 20% of global web traffic, was partially unavailable for 25 minutes, two weeks after an earlier global outage. During the incident, roughly one-third of requests passing through Cloudflare returned a blank page with an HTTP 500 error. The root cause was a long-standing bug in the Lua code used by the WAF (Web Application Firewall) traffic filtering system.

In response to a critical vulnerability (CVE-2025-55182) in the server components of the React framework, Cloudflare engineers added protection at the WAF level to shield customers' systems once an exploit became publicly available. As part of that work, the buffer used to inspect traffic on the proxy servers was enlarged, and the larger buffer turned out to be incompatible with the tooling used to test the WAF. The engineers therefore decided to disable the testing tool through the “killswitch” subsystem.

The “killswitch” works by pushing a quick configuration change that deactivates specific Lua handlers on the proxy servers without replacing the rules themselves. This makes it possible to correct errors quickly, but it also means that some Lua code is skipped. The engineers overlooked the fact that the “killswitch” had never been applied to a rule containing an “execute” call, so this combination had never been tested.
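
The idea of switching off individual handlers while leaving the rule set itself untouched can be sketched as follows. This is only an illustration in Python, not Cloudflare's Lua code: the `Rule` and `Killswitch` classes, the `evaluate` function, and the rule IDs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    rule_id: str
    # Hypothetical handler: inspects a request and returns an action name.
    handler: Callable[[dict], str]


@dataclass
class Killswitch:
    # IDs of rules whose handlers must not run; the rules stay deployed.
    disabled_ids: set[str] = field(default_factory=set)

    def disable(self, rule_id: str) -> None:
        self.disabled_ids.add(rule_id)


def evaluate(rules: list[Rule], killswitch: Killswitch, request: dict) -> list[tuple[str, str]]:
    """Run every rule except the killswitched ones, whose handlers are skipped."""
    outcomes = []
    for rule in rules:
        if rule.rule_id in killswitch.disabled_ids:
            # The handler body never executes, so none of its side effects happen.
            outcomes.append((rule.rule_id, "killswitched"))
            continue
        outcomes.append((rule.rule_id, rule.handler(request)))
    return outcomes


rules = [
    Rule("waf-block-exploit", lambda req: "block" if "exploit" in req["body"] else "pass"),
    Rule("internal-test-tool", lambda req: "log"),
]

killswitch = Killswitch()
killswitch.disable("internal-test-tool")  # a config change only; `rules` is not modified

outcomes = evaluate(rules, killswitch, {"body": "exploit payload"})
```

The rule list is never edited, which is what makes this kind of switch fast to apply. The cost is that any code depending on a skipped handler having run must cope with its absence.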

When the “killswitch” was applied, the code that defines the additional test ruleset was skipped, but the later call to that ruleset through “execute” remained. The handler therefore tried to index the “execute” object, which had never been initialized, and crashed with the message “attempt to index field ‘execute’ (a nil value)”.
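
The failure mode can be reproduced in miniature. The sketch below is a Python analogue under stated assumptions, not the actual Lua handler: the dictionary layout, the function names, and the `results_index` key are hypothetical. Python raises a `TypeError` where Lua reports an attempt to index a nil value.

```python
def evaluate_rule(rule: dict, disabled_ids: set) -> dict:
    """Build a rule result; a killswitched rule never gets its 'execute' data."""
    result = {"rule_id": rule["id"], "action": rule["action"], "execute": None}
    if rule["id"] in disabled_ids:
        return result  # evaluation skipped, so "execute" stays None
    if rule["action"] == "execute":
        result["execute"] = {"results_index": rule["target_index"]}
    return result


def attach_results_unsafe(result: dict, ruleset_results: list) -> dict:
    """Assumes every 'execute' rule has a populated 'execute' object."""
    if result["action"] == "execute":
        index = result["execute"]["results_index"]  # fails when "execute" is None
        result["execute"]["results"] = ruleset_results[index]
    return result


def attach_results_guarded(result: dict, ruleset_results: list) -> dict:
    """One possible defensive variant: skip rules whose 'execute' object is missing."""
    execute = result.get("execute")
    if result["action"] == "execute" and execute is not None:
        execute["results"] = ruleset_results[execute["results_index"]]
    return result


rule = {"id": "test-ruleset", "action": "execute", "target_index": 0}
ruleset_results = [["sub-rule matched"]]

# Normal path: the rule was evaluated, so the lookup succeeds.
ok = attach_results_unsafe(evaluate_rule(rule, set()), ruleset_results)

# Killswitched path: the action is still "execute", but the object is missing.
try:
    attach_results_unsafe(evaluate_rule(rule, {"test-ruleset"}), ruleset_results)
    crashed = False
except TypeError:
    crashed = True

guarded = attach_results_guarded(evaluate_rule(rule, {"test-ruleset"}), ruleset_results)
```

The guarded variant is shown only to make the missing check visible. It is not a description of the fix Cloudflare deployed.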
