On 2025-10-01 at 18:59:31, Chuck Wolber wrote:
> 1. It repeats what is said just a few paragraphs earlier in the document. I
> understand _why_ it does this, but moving the essence of this topic up to the
> DCO section avoids the repetition and avoids diluting the project's legal
> guidance.
>
> 2. What am I supposed to do with "It's not yet clear"? This is worse than
> telling me nothing. It introduces a vague question with no clear guidance. It
> is _true_ that no clear guidance exists, but what are the consequences when it
> _does_ exist? The worst case scenario is that we have to go back and
> rework/remove AI generated patches. So why not just require something like a
> declaration of AI content like the one proposed at declare-ai.org?

I agree that this is unclear, which is why I suggested we be more
definitive.

Many of the companies that develop LLMs are headquartered in the United
States. Many of the people who contribute to Git or distribute Git are
not. For instance, I am located in Canada, which has different copyright
laws (we have the more limited fair dealing like the UK, instead of the
US's fair use) and has moral rights. It is entirely possible that the use
of an LLM could be legal in one country or jurisdiction but not another.

By accepting code that is written using LLMs into Git, we expose our
contributors (who implicitly distribute Git code by uploading it to
servers) and distributors (such as Linux distros or their distributors)
to potential liability if the use of a particular LLM, or of LLMs in
general, is found to be illegal in their jurisdiction.

Unlike most of the companies that develop LLMs, most contributors and
distributors of Git are individuals or non-profits with limited
resources. Even as someone who works in the tech industry and is paid
accordingly, defending a copyright claim would be extremely expensive and
probably financially devastating for me, and I really do not want to take
that risk.
That's why simply declaring LLM use is not acceptable: it exposes others
who have limited resources to legal risk.

Note that ripping it out afterwards would require rewriting the Git
history. It would not solve the problem for all of the people who are
distributing or using older versions (which would have been judged to
violate copyright law), nor would it relieve them of the legal liability
they would already have been exposed to for their distribution.

The avoidance of legal problems is why we require sign-off. If Developer
X signs off on a patch that is later judged to violate copyright law,
then they have made a legally binding statement that they had the right
to submit it, and they have effectively accepted the entire legal
liability for that[0]. If we don't believe people can legally make
certain types of contributions, then we should explicitly tell people
that they should not make that legal statement to avoid any ambiguity.

This is very different from situations where companies make a decision to
incorporate LLM-generated code into their own codebases. They can hire
lawyers to determine whether LLM-generated code is legal in their given
jurisdiction and obtain whatever is legally required to operate in
compliance with the law. They also usually have substantial resources to
address any problems that come up.

We, on the other hand, are effectively a global project, must engage in
behaviour that is legal in all or nearly all jurisdictions, and have very
limited resources.

> That reads like a full stop rejection of all AI generated patch content.
>
> What if AI were to generate a great patch whose technical quality is exemplary
> in every way? How is that any different from a great patch of exemplary
> technical quality submitted by a person who is unambiguously evil?
There are three problems here: one, some AI code (including documentation
or other text) is of poor quality; two, regardless of the quality, many
people submit AI-generated code they do not understand; and three,
AI-generated code is a legal minefield.

A technically great patch solves the first but not the other two. We
still need people who submit code to be able to explain their changes and
respond to questions about the code. What decisions were made? Why were
they made? What are the tradeoffs and downsides?

> Taking your words at face value, the prior paragraph reads as if the Git
> project is declaring an outright ban on _all_ AI generated content (and I am
> nearly certain that is _not_ what you intended to say). If so, why bother
> continuing on with a PSA (Public Safety Announcement)? It reads like a
> non-alcoholic drink that has the words, "Drink Responsibly" printed on the side
> of the can.

I think this is actually what they intended to say, but they said it
poorly. I agree clarification would be valuable.

> AI is not going away, and we need to find a way to use it productively
> _without_ losing our sense of self-reliance. If we fail to develop this ability
> when AI is hardly more skilled than an above average intern, full of hubris and
> zero real world experience, imagine how unqualified we will be when AI becomes
> competent enough to manipulate and mislead us?

I think you assume LLMs can have intelligence. They are glorified
prediction engines, effectively fancy Markov chains. In some cases, that
can be useful and valuable and we can do interesting things with them,
but they cannot actually have intelligence, creativity or reason.

And LLMs already manipulate and mislead people. They have been implicated
in goading teenagers to suicide or leading people into conspiracy
theories. Some LLMs espouse racist, anti-Semitic, or otherwise hateful
views.
That's a good reason to be wary of them and how they're incorporated into
our lives, at least until such time as they have appropriate safety
measures and regulation in place (if that ever happens).

[0] I refer you to the common-law doctrine of promissory estoppel.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA