

tok 🕊️

This gives an idea why Microsoft paid so much for GitHub. They were after data: tons of food for their AI, millions of contributors that now 'work' for MS for free.
You publish your code under GPLv3, even AGPLv3? So what? The AI learns from your code and uses it to generate code that is possibly proprietary. Does the license forbid this practice? (I don't think so)

That's the M$ way to break copyright law.

It's time for alternatives like @codeberg .

@t0k Is there anything that could stop MS from cloning random repos from the web and feeding its AI with them? Nope. That does not sound like a reasonable explanation for why they bought GitHub.
Also, that's a bit like claiming that all code that I will write in my life is GPL if I learned to write code from looking at GPL code.

@t0k while I strongly agree with your sentiment, I disagree with your conclusion: the fact is they could have done it just as easily (and cheaper!) by pulling code from CodeBerg, Gitlab, 0xacab, any other place where any code is publicly available.

The food for their AIs is right there for the taking, whether they own GitHub or not.

@t0k We'd need a court case, but I don't see any sound argument where this isn't creating output that'd be under the GPL.

@t0k

Just had a discussion about this around the water cooler at work (Teams chat), and the instrumentation that shares code back to the M-ship makes it a nonstarter here: security risk.

Nobody can stop anybody from learning from open source code and applying that knowledge to create closed code. Microsoft could have done this any time they wanted without buying Github until Github found out, at which time Github could have ended the experiment.

Buying Github guaranteed continued access.

@t0k Code produced by an AI trained on code is a derivative work, and the copyright still holds.

Their lawyers may think it's worth a gamble, since this hasn't been tested in court yet, but I think there's an excellent chance the findings would be for the owner of the original code.

That said, how would anyone find out?

@raboof Yes! But I haven't been able to think of a suitable way to make trap code—and it's quite likely the generated code would be closed-source anyhow, making discovery less likely.

@lanodan @t0k Do note that the license is to distribute, not to use, and I believe you would need a license to use the code for training an ML model.

@ignaloidas @t0k Uuuh.

> If you set your pages and repositories to be viewed publicly, you grant each User of GitHub a nonexclusive, worldwide license to use, display, and perform Your Content through the GitHub Service and to reproduce Your Content solely on GitHub as permitted through GitHub's functionality (for example, through forking).
> You may grant further rights if you adopt a license. If you are uploading Content you did not create or own, you are responsible for ensuring that the Content you upload is licensed under terms that grant these permissions to other GitHub Users.

@ignaloidas @t0k Actually now that I'm even more carefully reading it than usual… are they throwing out the restrictions on licenses?

@lanodan @t0k If that is the case (which I believe it is not), then they can use any AGPL'd code hosted there, from what I read. Which is very obviously wrong.

@t0k SourceHut is also really good.