

Researchers: “Not my department”


I’m not talking about Linux.
The context of my reply is about LLM generated code and the downstream use of it in a product.
See:
LLMs themselves being products of copyrighted material isn’t the legal question at issue, it’s the downstream use of that product.
Assuming that the code is 100% LLM generated and uncopyrightable does not affect the ability to enforce license restrictions, created via an End User License Agreement, on downstream uses of that product.
A piece of software that is unable to be copyrighted due to being 100% generated can be licensed and can expect to have that license enforced via contract law.


Well, that isn’t the case now and isn’t likely to be the case anytime in the near future.
The rules are not written in stone. Future Linus will have a better idea of the capabilities of future AI and can change the rules accordingly, as he has done since the beginning, in order to steer the Linux project in the right direction.


It doesn’t have to be open source.
If the code that makes up a piece of software is 100% generated, then the software isn’t protected by copyright.
That software could be distributed and licensed under an EULA and the fact that it isn’t protected by copyright means absolutely nothing as far as the EULA is concerned.
The copyright status and the ability to license a piece of software under contract law do not depend on one another.


The short answer is that this is a slippery slope argument.
The long answer is:
In this hypothetical future where 95% of the Linux kernel is AI generated, it stands to reason that generating an OS kernel is possible (by definition of the hypothetical).
If generating a full OS kernel is possible then people could generate a fully closed source kernel without using any of the 5% of GPL protected code in the Linux kernel.
If you allow that it’s possible for AI to generate a kernel, then it will happen regardless of the status of the Linux kernel’s copyright protections.


Well, cynically, the Supreme Court will decide and Team AI has more money to buy RVs and luxury vacations.


You’re right, I misread the context (I was trying to carry on multiple simultaneous conversations).
My apologies.


because you don’t own the copyrights, so you can’t sue anyone for copyright infringement.
You can’t sue for copyright infringement.
You can, however, use content which is not able to be copyrighted and also still license (under contract law/EULAs) your product including terms prohibiting copying of the non-copyrightable information.
This was settled in: https://en.wikipedia.org/wiki/ProCD%2C_Inc._v._Zeidenberg
On Zeidenberg’s copyright argument, the circuit court noted the 1991 Supreme Court precedent Feist Publications v. Rural Telephone Service, in which it was found that the information within a telephone directory (individual phone numbers) consisted of facts that could not be copyrighted. For Zeidenberg’s argument, the circuit court assumed that a database collecting the contents of one or more telephone directories was equally a collection of facts that could not be copyrighted. Thus, Zeidenberg’s copyright argument was valid. However, this did not lead to a victory for Zeidenberg, because the circuit court held that copyright law does not preempt contract law. Since ProCD had made the investments in its business and its specific SelectPhone product, it could require customers to agree to its terms on how to use the product, including a prohibition on copying the information therein regardless of copyright protections.
You can’t copyright phone numbers, just like you can’t copyright generated code, but you can still create a license which protects your uncopyrightable content and it can be enforced via contract law.


I don’t see the problem. GPL protects all of the code that is copyrighted, i.e. 100% made by humans. Accepting a submission created with AI tools doesn’t change this. It’s not going to be a simple task for someone who has decided to violate the GPL license to only use the generated/uncopyrighted portions without using any other GPL code and thus being subject to GPL licensing terms.
These hypothetical GPL-violating people will have a hard time using lines 27-38 of ./kernel/events/ring_buffer.c to do anything, even if they technically can do so without releasing their code under the GPL. If they use any piece of GPL code, at all, anywhere, their entire project is required to follow the GPL. So while they could, technically, take lines 27-38 of ring_buffer.c and build an entire proprietary non-GPL Linux kernel… it is, in practice, not feasible even if it is technically possible.


That is the FSF’s position, but case law includes examples where the GPL was allowed to be treated as a contract.
In SFC v. Vizio, the Software Freedom Conservancy sued Vizio as a third-party beneficiary of the GPL as a contract, and the court allowed the case to proceed on that theory.


You’re confusing two separate legal issues.
Copyright is created and enforced by copyright law.
Licenses are created and enforced by contract law.
You can violate a contract without violating a copyright and you can violate a copyright without agreeing to a license. You can also license works that are not able to be protected by a copyright because they are two separate categories of law.


If I use a copyright-infringing work as a part of a new creative work, does that new work infringe copyright by default?
No, see reaction content, parody content, etc. They all undoubtedly use copyrighted work and they don’t automatically infringe on copyright by default.
And if it is judged as infringing, who is responsible for the damage done? Can I pass the damages back to the original infringing work? Or should I be held responsible for not performing due diligence?
The infringing party is the human that used the tool which generated the infringing work. Everything after that is exactly the same application of copyright law, just as if you were selling pictures of Mickey Mouse that you drew yourself. Disney can sue you; they can’t sue the pencil manufacturer.


The status of generated code is ‘uncopyrightable’, and uncopyrightable work can still be licensed.
Copyright law determines the copyright status and contract law enforces the terms of contracts. They are two separate issues.
If someone licenses you to use their AI generated code and you violate the license agreement, it doesn’t matter that they don’t have a claim under copyright law. They have a claim under contract law due to you violating the terms of the license (which is a contract).


Most GenAI users do not submit code to the Linux kernel project.


Plagiarism and copyright violation are two different things: one is an ethical matter and the other a legal one.
Copyright has a body of case law which helps determine when a work significantly infringes on the copyrighted work of another. Plagiarism has no body of law at all, it is an ethical construct and not a legal one.
You can plagiarize something that has no copyright protection and you can infringe on copyright protection without plagiarizing. They’re not interchangeable concepts.
In your example, some institutions would not allow such a device to operate on their property but it would not be illegal to operate and the liability would be on the person and not on the oven.
To further strain the metaphor, Linus is saying that you can use (possibly) exploding ovens, because he isn’t taking a moral stance on the topic, but you are responsible for the damages if they cause any because the legal systems require that this be the case.


Copyright and License terms are two different categories of law. Copyright is an idea created and enforced by the laws of the country which has jurisdiction. Licenses are contracts between two parties and are covered by contract law.
A thing can be unable to be protected by copyright and also protected by the terms of the license that it is provided under. If a project contains uncopyrightable code, that does not mean that you cannot be held to the terms of the license. Your use of licensed works is granted under the agreement that you follow the terms of the license. You cannot be held liable for copyright violations for using the code, but using the code in a manner that is not allowed by the license makes you liable for violation of the contract that is the license agreement.


It’s extremist to take the fact that you CAN get plagiaristic output and to conclude that all other output is somehow tainted.
You personally CAN quote copyrighted music and screenplays. If you’re an artist then you also CAN produce copyright-violating works. None of these facts taint any of the other things that you produce that are not copyright-infringing or plagiarized.
In this situation, and in the current legal environment, the responsibility to not produce illegal and unlicensed code is on the human. The fact that the tool that they use has the capability to break the law does not mean that everything generated by it is tainted.
Photoshop can be used to plagiarize and violate copyright too. It would be just as absurd to declare that all images created with Photoshop are somehow suspect or unusable because of the tool’s capability to violate copyright laws.
The fact that AI can, when specifically prompted, produce memorized segments of its training data has carried essentially no legal weight in any of the cases where it has been argued. It is a fact that is of interest to scientists who study how AI models represent knowledge internally, not any kind of foundation for a legal argument against the use of AI.


Given the research that you’ve done here I’m going to assume that you’re looking for an answer and not simply taking us on a gish gallop.
Your premise, and what appears to be the primary source of confusion, is built on the idea that this is ‘stolen’ work which, from a legal point of view, is untrue. If you want to dig into why that is, look into the precedent-setting case of Authors Guild, Inc. v. Google, Inc. (2015). The TL;DR is that training AI on copyrighted works falls under the Fair Use exemptions in copyright law. i.e. It is legal, not stealing.
The case you linked from Munich shows that other countries’ legal systems are interpreting AI training in the same way. Training AI isn’t about memorization and plagiarism of existing work; it’s using existing work to learn the underlying patterns.
That isn’t to say that memorization doesn’t happen, but it is more of a point of interest to AI scientists who are working on understanding how AI models represent knowledge internally than a point that lands in a courtroom.
We all memorize copyrighted data as part of our learning. You, too, can quote Disney movies or Stephen King novels if prompted in the right way. This doesn’t make any work you create automatically become plagiarism, it just means that you have viewed copyrighted work as part of your learning process. In the same way, artists have the capability to create works which violate the copyright of others, and they consumed copyrighted works as part of their learning process. These facts don’t taint all of their work, either morally or legally… only the output that literally violates copyright laws.
The pragmatism here is recognizing that these tools exist and that people use them. The current legal landscape is such that the output of these tools is treated as if it were the output of the user. If an image generator generates a copyrighted image then the rightsholder can sue the person, not the software. If a code generator generates licensed code then the tool user is responsible.
This is much like how we don’t restrict the usage of Photoshop despite the fact that it can be used to violate copyright. We, instead, put the burden on the person who operates the tool.
That’s what is happening here. Linus isn’t using his position to promote/enforce/encourage LLM use, nor is he using his position to prevent/restrict/disallow any AI use at all. He is recognizing that this is a tool that exists in the world in 2026 and that his project needs to have procedures that acknowledge this while also ensuring that a human is the one responsible for their submissions.
This is the definition of pragmatism (def: action or policy dictated by consideration of the immediate practical consequences rather than by theory or dogma).
e: precedent, not president (I’m blaming the AI/autocorrect on this one)


That may be what you were talking about, but you replied to me and I was not having a conversation about Linux.


I know, I asked myself.