#Github #Copilot gives an idea why #Microsoft paid so much for Github. They were after data: Tons of food for their AI, millions of contributors that now 'work' for MS for free.
You publish your code under GPLv3, even AGPLv3? So what? The AI learns from your code and uses it to generate code that is possibly proprietary. Does #GPL forbid this practice? (I don't think so)
That's the M$ way to break copyright law.
It's time for alternatives like @codeberg .
@t0k Is there anything that could stop MS from cloning random repos from the web and feeding its AI with them? Nope. So that does not sound like a plausible reason for buying GitHub.
Also, that's a bit like claiming that all code that I will write in my life is GPL if I learned to write code from looking at GPL code.
@t0k While I strongly agree with your sentiment, I disagree with your conclusion: the fact is they could have done it just as easily (and more cheaply!) by pulling code from Codeberg, GitLab, 0xacab, or any other place where code is publicly available.
The food for their AIs is right there for the taking, whether they own GitHub or not.
@t0k We'd need a court case, but I don't see any sound argument that the generated output wouldn't fall under the GPL.
Just had a discussion about this around the water cooler at work (Teams chat), and the instrumentation that shares code back to the M-ship makes it a nonstarter here: security risk.
Nobody can stop anybody from learning from open source code and applying that knowledge to create closed code. Microsoft could have done this any time they wanted without buying GitHub, at least until GitHub found out, at which point GitHub could have ended the experiment.
Buying GitHub guaranteed continued access.
@t0k Code produced by an AI trained on code is a derivative work, and the copyright still holds.
Their lawyers may think it's worth a gamble, since this hasn't been tested in court yet, but I think there's an excellent chance the findings would be for the owner of the original code.
That said, how would anyone find out?
@varx makes me think of 'trap streets' :) https://en.wikipedia.org/wiki/Trap_street
@raboof Yes! But I haven't been able to think of a suitable way to make trap code, and it's quite likely the generated code would be closed-source anyhow, making discovery less likely.