Facebook publishes Hermit, a toolkit for reproducible program execution

Facebook (banned in the Russian Federation) has published the code of the Hermit toolkit, which provides an environment for deterministic program execution: given the same input data, separate runs produce the same result and follow the same execution path. The project is written in Rust and distributed under the BSD license.

During normal execution, the result is affected by a variety of external factors, such as the current time, thread scheduling behavior, virtual memory addresses, data from the pseudo-random number generator, and various unique identifiers. Hermit runs the program in a container in which these factors stay constant across runs. Repeatable execution, in which the nondeterministic environment parameters are fully reproduced, can be used for diagnosing bugs, multi-step debugging with repeated runs, creating a fixed environment for regression tests, stress testing, identifying multithreading problems, and in reproducible build systems.

The reproducible environment is created by intercepting system calls. Some calls are replaced with Hermit's own handlers, which return a constant result; others are forwarded to the kernel, after which the result is scrubbed of nondeterministic data. System calls are intercepted using the Reverie framework, whose code has also been published by Facebook. To keep file system changes and network requests from affecting execution, the program runs against a fixed file system image with access to external networks disabled. When the program accesses the pseudo-random number generator, Hermit returns a predetermined sequence that repeats on every run.
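The two kinds of handlers described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hermit's or Reverie's API: the class `DeterministicEnv`, its method names, and the choice of which fields to scrub (`mtime`, `inode`) are all hypothetical.

```python
import random


class DeterministicEnv:
    """Toy stand-in for an interception layer; every name here is hypothetical."""

    def __init__(self, seed=0, epoch=1_000_000_000):
        self._rng = random.Random(seed)  # private, seeded generator
        self._clock = epoch              # logical time in seconds

    def time(self):
        # Replaced handler: advance a logical clock instead of reading the real one.
        self._clock += 1
        return self._clock

    def getrandom(self, n):
        # Replaced handler: bytes come from the seeded generator.
        return bytes(self._rng.randrange(256) for _ in range(n))

    def stat(self, real_result):
        # Forwarded handler: keep the real answer but scrub unstable fields.
        scrubbed = dict(real_result)
        scrubbed["mtime"] = 0
        scrubbed["inode"] = 0
        return scrubbed


def run_once(seed):
    # A "program" whose output depends only on time and random data.
    env = DeterministicEnv(seed=seed)
    return (env.time(), env.time(), env.getrandom(4).hex())
```

Calling `run_once` twice with the same seed yields identical tuples, which is the property the container is meant to guarantee for a real program.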

One of the hardest sources of nondeterminism to control is the thread scheduler, whose behavior depends on many external factors, such as the number of CPU cores and the presence of other running threads. To make scheduler behavior repeatable, all threads are executed serially, pinned to a single CPU core, and the order in which control passes between threads is preserved. Each thread is allowed to execute a fixed number of instructions, after which it is stopped and control is handed to another thread. The limit is enforced with the CPU's PMU (Performance Monitoring Unit), which stops execution after a given number of conditional branches.
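The idea of serialized execution with a fixed per-turn budget can be sketched with Python generators. This is an illustration only: the functions `worker`, `run_serialized`, and `demo` are hypothetical, and each `yield` merely stands in for one counted unit of work, whereas Hermit counts conditional branches in hardware.

```python
from collections import deque


def worker(name, steps, log):
    # Each `yield` marks the end of one unit of work.
    for i in range(steps):
        log.append((name, i))
        yield


def run_serialized(tasks, budget):
    """Run generator tasks one at a time, round-robin, `budget` steps per turn."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        for _ in range(budget):
            try:
                next(task)
            except StopIteration:
                break  # task finished: do not requeue it
        else:
            queue.append(task)  # budget used up: preempt and requeue at the back


def demo(budget):
    log = []
    run_serialized([worker("A", 3, log), worker("B", 2, log)], budget)
    return log
```

Because only one task runs at a time and the switch points depend solely on the budget, `demo(2)` produces the same interleaving on every call, regardless of how many cores the host has.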

To diagnose threading problems caused by race conditions, Hermit has a mode for identifying the operations whose ordering was violated and led to a crash. To find such problems, it compares the states recorded during a correct run with those recorded during a run that crashed.
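A much simplified picture of comparing a correct run with a failing one: if each run is recorded as a sequence of scheduling decisions, the first point where the two sequences differ is a candidate for the ordering that matters. The sketch below assumes this simplification; `run`, `first_divergence`, and the two schedules are hypothetical and do not reproduce Hermit's actual analysis.

```python
def run(schedule):
    """Replay a toy two-thread increment under an explicit schedule.

    Each thread performs two steps: read the shared counter, then write
    back the value it read plus one. Returns the final counter value.
    """
    counter = 0
    local = {}
    for thread in schedule:
        if thread not in local:
            local[thread] = counter      # step 1: read
        else:
            counter = local[thread] + 1  # step 2: write
    return counter


def first_divergence(good, bad):
    """Index of the first scheduling decision where two runs differ, or None."""
    for i, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            return i
    return None


good = ["A", "A", "B", "B"]  # A finishes before B starts: counter ends at 2
bad = ["A", "B", "A", "B"]   # B reads before A writes: one update is lost
```

Here `first_divergence(good, bad)` points at the second decision, where thread B was scheduled between A's read and A's write, which is exactly the lost-update race.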
