Diffstat (limited to 'docs/devel')
| -rw-r--r-- | docs/devel/code-provenance.rst | 55 |
1 files changed, 54 insertions, 1 deletions
diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst
index c25afed98d..b5aae2e253 100644
--- a/docs/devel/code-provenance.rst
+++ b/docs/devel/code-provenance.rst
@@ -282,4 +282,57 @@ boilerplate code template which is then filled in to produce the final patch.
 The output of such a tool would still be considered the "preferred format",
 since it is intended to be a foundation for further human authored changes.
 Such tools are acceptable to use, provided there is clearly defined copyright
-and licensing for their output.
+and licensing for their output. Note in particular the caveats applying to AI
+content generators below.
+
+Use of AI content generators
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+TL;DR:
+
+  **Current QEMU project policy is to DECLINE any contributions which are
+  believed to include or derive from AI generated content. This includes
+  ChatGPT, Claude, Copilot, Llama and similar tools.**
+
+The increasing prevalence of AI-assisted software development results in a
+number of difficult legal questions and risks for software projects, including
+QEMU. Of particular concern is content generated by `Large Language Models
+<https://en.wikipedia.org/wiki/Large_language_model>`__ (LLMs).
+
+The QEMU community requires that contributors certify their patch submissions
+are made in accordance with the rules of the `Developer's Certificate of
+Origin (DCO) <dco>`.
+
+To satisfy the DCO, the patch contributor has to fully understand the
+copyright and license status of content they are contributing to QEMU. With AI
+content generators, the copyright and license status of the output is
+ill-defined with no generally accepted, settled legal foundation.
+
+Where the training material is known, it is common for it to include large
+volumes of material under restrictive licensing/copyright terms. Even where
+the training material is all known to be under open source licenses, it is
+likely to be under a variety of terms, not all of which will be compatible
+with QEMU's licensing requirements.
+
+How contributors could comply with DCO terms (b) or (c) for the output of AI
+content generators commonly available today is unclear. The QEMU project is
+not willing or able to accept the legal risks of non-compliance.
+
+The QEMU project thus requires that contributors refrain from using AI content
+generators on patches intended to be submitted to the project, and will
+decline any contribution if use of AI is either known or suspected.
+
+This policy does not apply to other uses of AI, such as researching APIs or
+algorithms, static analysis, or debugging, provided their output is not to be
+included in contributions.
+
+Examples of tools impacted by this policy includes GitHub's CoPilot, OpenAI's
+ChatGPT, Anthropic's Claude, and Meta's Code Llama, and code/content
+generation agents which are built on top of such tools.
+
+This policy may evolve as AI tools mature and the legal situation is
+clarifed. In the meanwhile, requests for exceptions to this policy will be
+evaluated by the QEMU project on a case by case basis. To be granted an
+exception, a contributor will need to demonstrate clarity of the license and
+copyright status for the tool's output in relation to its training model and
+code, to the satisfaction of the project maintainers.