
You have a point but

1) I don't think being able to query them and reconstruct input 1:1 are requirements. If I build a shitty db with a buggy query language that retrieves incomplete data and occasionally mixes in data I didn't ask for, then it's still a db, just a shitty one.

If I populate it with copyrighted material and put it online, whether I'm gonna get sued likely depends on how shitty it is. If it's good enough that people get enough value from it that they don't buy the original works, then the original authors are not gonna be pleased.

2) Yes, comparisons to humans are not always useful, though I'd say LLMs don't reason or understand at all.

Either way the discussion should be about justice and fairness. The fact is LLMs are trained on data which took human work and effort to create. LLMs would not be possible without this data (or ML companies would train on just the public domain and avoid the risk of a massive lawsuit). The people who created the original training data deserve a fair share of the value produced by using their work.

So the real question to me is: how much do they deserve?



It sounds like you’re sort of starting from the position that AI is inherently unjust and then reasoning backward to justify it. But shouldn’t the argument start with actual harm rather than assumed unfairness?


I wouldn't say that.

My point is that any situation where person A puts in a certain amount of work (normalized by skill, competence, etc.), and person B uses A's work, puts in some work of his own but less than A, and then gets more reward than A, is fundamentally unfair.

LLMs are this, just at a massive scale.

But to be honest, this is where the discussion went only after thinking about it for a while. My real starting point was that when I publish my code under AGPL, I do it because I want anyone who builds on top of it to also have to release their code so users have the freedom to modify it.

LLMs are used to launder my code, deprive me of credit and deprive users of their rights.

Can we agree this is harm?

I also believe that unfairness is fundamentally the same as harm, just framed a bit differently.


> LLMs are used to launder my code,

> deprive me of credit

> and deprive users of their rights.

> Can we agree this is harm?

I might consider it if any of those claims were true.

I think the opposite is true -- especially with Open-Weight models which expand user freedoms rather than restricting them. I wonder if we can get the FSF to come up with GPL compatible Open-Weight licenses.

At this point in time I'm not entirely convinced they even need to. But if future lawsuits turn out that way, it might solve issues with some models.


> I might consider it if any of those claims were true.

Please step back and consider what you are replying to.

> LLMs are used to launder my code

If an LLM was trained only on AGPL code, would it have to be licensed under AGPL? Would its output?

> deprive me of credit

They _obviously_ deprive me of credit. Even if an LLM was trained entirely on my code, nobody using its output would know. Compare to using a library where my name is right there in the license.

> and deprive users of their rights.

I appeal to you again, re-read my comment. I am not talking about users of the model but about users of the software that is in part based on my AGPL code. If my code got there traditionally, by being included in a library, the whole software would have to be AGPL and users would have the right to modify it. If my code is laundered through an LLM, users of my code lose that right.

> Can we agree this is harm?

So all of those things are true. And this is clearly harm.

Stealing a little from everyone is morally no different than stealing a lot from one person. Whenever you think about ML, consider extreme cases, such as a model trained entirely on data under a single license, and the arguments pretending it's not copyright infringement fall apart. (And if you don't think extreme cases set precedent for real cases, then please point out where exactly you draw the line. Give me a number.)

Spreading the harm around means everyone is harmed similarly but that is not the kind of fairness I had in mind.



