SUSE has announced the release, under the Apache 2.0 license, of the language model cavil-qwen3-4b, which is used at SUSE and openSUSE in the Cavil toolkit to analyze the licensing cleanliness of code. The published model has 4 billion parameters and is based on the qwen3-4b model, additionally optimized for text classification.
The main purpose of the model is to identify the licenses used in program source code and documentation. For this task, the model was additionally trained on a dataset of 150 thousand examples of file headers and comments that mention licenses in source code. In practice, the model makes it possible to automate license compliance checks of a code base in order to detect licensing incompatibilities and potential legal problems with the code.
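As a rough illustration of the kind of input such a classifier works on, the sketch below pulls license-related snippets out of a source file and passes them to a classifier callback. It is not Cavil's actual pipeline: the keyword pattern, the function names `candidate_snippets` and `classify_snippets`, and the stand-in classifier are all hypothetical and chosen only for this example.

```python
import re
from typing import Callable

# Hypothetical keyword pattern; the real pattern set used by Cavil is not
# described in the announcement.
LICENSE_HINT = re.compile(r"licen[sc]e|copyright", re.IGNORECASE)


def candidate_snippets(source: str, context: int = 1) -> list[str]:
    """Return text windows around lines that mention a license keyword.

    Windows that overlap or touch are merged, so a multi-line license
    header comes out as a single snippet.
    """
    lines = source.splitlines()
    hits = [i for i, line in enumerate(lines) if LICENSE_HINT.search(line)]
    windows: list[tuple[int, int]] = []
    for i in hits:
        start = max(i - context, 0)
        end = min(i + context, len(lines) - 1)
        if windows and start <= windows[-1][1] + 1:
            # Extend the previous window instead of opening a new one.
            windows[-1] = (windows[-1][0], end)
        else:
            windows.append((start, end))
    return ["\n".join(lines[s:e + 1]) for s, e in windows]


def classify_snippets(
    snippets: list[str], classifier: Callable[[str], bool]
) -> list[tuple[str, bool]]:
    """Pair each snippet with the classifier's verdict (True = license text)."""
    return [(snippet, classifier(snippet)) for snippet in snippets]


# Stand-in for the model: any callable mapping text to a boolean would fit here.
def toy_classifier(text: str) -> bool:
    return "SPDX-License-Identifier" in text


example = (
    "# SPDX-License-Identifier: Apache-2.0\n"
    "# Copyright 2025 Example Corp\n"
    "\n"
    "def f():\n"
    "    return 1\n"
)
results = classify_snippets(candidate_snippets(example), toy_classifier)
```

In a real setup the `toy_classifier` callable would be replaced by a call into the published model, which decides whether a snippet is genuine license text.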
The size of the model was chosen to balance a good understanding of language structures with the ability to run on systems with typical consumer GPUs. In addition to the model itself, the dataset used for training and a validation tool have been made freely available. A handler is also available for using the model in the Cavil toolkit, which is designed to check source code for compliance with legal norms and requirements (license checking, license identification, risk assessment).
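To make the point about consumer GPUs concrete, here is a back-of-the-envelope estimate of the memory needed just to hold the weights of a 4-billion-parameter model. The announcement does not say which numeric precision is used, so the formats listed are assumptions for illustration, and the figures exclude activations, caches and framework overhead.

```python
PARAMS = 4_000_000_000  # parameter count stated in the announcement

# Assumed storage formats (not taken from the announcement).
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gib(params: int, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GiB (2**30 bytes)."""
    return params * bytes_per_param / 2**30


for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>9}: {weight_memory_gib(PARAMS, size):.1f} GiB")
```

At 16-bit precision the weights come to roughly 7.5 GiB, which is why a model of this size can plausibly fit on a single consumer graphics card, while a much larger model would not.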