On 2025-10-01 at 18:59:31, Chuck Wolber wrote:
> 1. It repeats what is said just a few paragraphs earlier in the document. I
> understand _why_ it does this, but moving the essence of this topic up to the
> DCO section avoids the repetition and avoids diluting the project's legal
> guidance.
> 
> 2. What am I supposed to do with "It's not yet clear"? This is worse than
> telling me nothing. It introduces a vague question with no clear guidance. It
> is _true_ that no clear guidance exists, but what are the consequences when it
> _does_ exist? The worst case scenario is that we have to go back and
> rework/remove AI generated patches. So why not just require something like a
> declaration of AI content like the one proposed at declare-ai.org?

I agree that this is unclear, which is why I suggested we be more
definitive.

Many of the companies that develop LLMs are headquartered in the United
States.  Many of the people that contribute to Git or distribute Git are
not.  For instance, I am located in Canada, which has different
copyright laws (we have the more limited fair dealing like the UK,
instead of the US's fair use) and has moral rights.  It is entirely
possible that the use of an LLM could be legal in one country or
jurisdiction but not another.

By accepting code that is written using LLMs into Git, we expose our
contributors (who implicitly distribute Git code by uploading it to
servers) and distributors (such as Linux distros or their distributors)
to potential liability if the use of a particular LLM or LLMs in general
is found to be illegal in their jurisdiction.  Unlike most of the
companies that develop LLMs, most contributors and distributors of Git
are individuals or non-profits with limited resources.  Even as someone
who works in the tech industry and is paid accordingly, defending a
copyright claim would be extremely expensive and probably financially
devastating for me and I really do not want to take that risk.

That's why simply declaring LLM use is not acceptable: because it
exposes others who have limited resources to legal risk.  Note that
ripping it out afterwards would require rewriting the Git history.  It
would also not solve the problem for all of the people who are
distributing or using older versions (which would have been judged to
violate copyright law), nor would it relieve them of the legal liability
they would already have been exposed to through that distribution.

The avoidance of legal problems is why we require sign-off.  If
Developer X signs off a patch that is later judged to violate copyright
law, then they have made a legally binding statement that they had the
right to submit it and they have effectively accepted the entire legal
liability for that[0].  If
we don't believe people can legally make certain types of contributions,
then we should explicitly tell people that they should not make that
legal statement to avoid any ambiguity.

This is very different from situations where companies make a decision
to incorporate LLM-generated code into their own codebases.  They can
hire lawyers to determine whether LLM-generated code is legal in their
given jurisdiction and obtain whatever legal necessities are required to
operate in compliance with the law.  They also usually have substantial
resources to address any problems that come up.  We, on the other hand,
are effectively a global project, must engage in behaviour that is legal
in all or nearly all jurisdictions, and have very limited resources.

> That reads like a full stop rejection of all AI generated patch content.
> 
> What if AI were to generate a great patch whose technical quality is exemplary
> in every way? How is that any different from a great patch of exemplary
> technical quality submitted by a person who is unambiguously evil?

There are a few problems here: one, some AI code (including
documentation or other text) is of poor quality; two, regardless of the
quality, many people submit AI-generated code they do not understand;
and three, AI-generated code is a legal minefield.

A technically great patch solves the first but not the other two.  We
still need people who submit code to be able to explain their changes
and respond to questions about the code.  What decisions were made?  Why
were they made?  What are the tradeoffs and downsides?

> Taking your words at face value, the prior paragraph reads as if the Git
> project is declaring an outright ban on _all_ AI generated content (and I am
> nearly certain that is _not_ what you intended to say). If so, why bother
> continuing on with a PSA (Public Safety Announcement)? It reads like a
> non-alcoholic drink that has the words, "Drink Responsibly" printed on the side
> of the can.

I think this is actually what they intended to say, but did so poorly.
I agree clarification would be valuable.

> AI is not going away, and we need to find a way to use it productively
> _without_ losing our sense of self-reliance. If we fail to develop this ability
> when AI is hardly more skilled than an above average intern, full of hubris and
> zero real world experience, imagine how unqualified we will be when AI becomes
> competent enough to manipulate and mislead us?

I think you assume LLMs can have intelligence.  They are glorified
prediction engines, effectively fancy Markov chains.  In some cases,
that can be useful and valuable and we can do interesting things with
them, but they cannot actually have intelligence, creativity or reason.

And LLMs already manipulate and mislead people.  They have been
implicated in goading teenagers to suicide or leading people into
conspiracy theories.  Some LLMs espouse racist, anti-Semitic, or
otherwise hateful views.  That's a good reason to be wary of them and
how they're incorporated into our lives, at least until such a time that
they have appropriate safety measures and regulation in place (if that
ever happens).

[0] I refer you to the common-law doctrine of promissory estoppel.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA