2008, me: I love the idea of cryptocurrency
BITCOIN: The word "cryptocurrency" now means "financial scams based on inefficient write-only ledgers"
2018, me: I love the idea of the metaverse
FACEBOOK: The word "metaverse" now means "proprietary 3D chat pro...
-
Mark T. Tomczak replied to Graham Spookyland🎃/Polynomial
@gsuberland @mcc One almost wonders if the end-game is to stop pulling and try pushing.
Maybe instead of trying to claw back data we've made publicly crawlable because "I wanted it visible, but not like that" we ask why any of these companies get to keep their data proprietary when it's built on ours?
Would people be more okay with all of this if the rule were "You can build a trained model off of publicly-available data, but that model must itself be publicly-available?"
-
@mark @gsuberland In my opinion, a trapdoor like "okay, well if copyright doesn't apply to the training data you stole, your model isn't copyrightable either" is no good. The US Gov has already said GenAI images and text are not copyrightable. It doesn't help. The thing about generative AI is it inherently takes heavy computational resources (disk space, CPU time, often-unacknowledged low-wage tagging work). Therefore, as a tool, it is inherently biased toward capital and away from individuals.
-
@mark @gsuberland If we say "AI is a new class of thing that is outside the copyright regime entirely", that is not a level playing field. The tool is designed in a way it inherently serves the powerful. "Machine learning models are inherently open" is the exact model I am afraid of— a world where copyright is something that applies to actors who have less than some specific amount of money, and anyone with more than that specific amount of money is liberated from it.
-
@mcc @mark @gsuberland Exactly.
Even if, say, GPT-4 wasn't covered by copyright, so what? Even if you could get it out of OpenAI's data centres in the first place, you couldn't run it with reasonable performance. And you *certainly* couldn't retrain it.
-
@mcc you're right to flag that, for sure
-
Irenes (many) replied to Irenes (many)
@mcc we definitely think that copyright as a tool for building a better world has bent the structure of capitalism as far as it is going to. we can't afford to REMOVE that crowbar, and in fact we should probably be coming up with more radical copyleft + non-commercial + anti-war licenses, but enforcement is going to keep favoring large power structures, not individuals.
-
Irenes (many) replied to Irenes (many)
@mcc (the point of making ever-more-radical licenses is to stay ahead of capitalist attempts to subsume critique into itself)
-
@datarama @mcc @mark @gsuberland there is one upside to forcing these models to be open and it's that it removes one of the, if not the primary, incentives in developing them in the first place. Yes, they could still sell its execution as a service, but if they lose control of the model itself, it becomes a considerably less profitable endeavor.
-
@oblomov @mcc @mark @gsuberland How, though?
Let's say that tomorrow, a judge rules that GPT-4 is not covered by copyright. What has actually changed? OpenAI isn't compelled to share it with anyone, and it's too big for anyone except large and wealthy corporations to actually do anything with.
Sure, you couldn't get sued if you got a bittorrent of it somehow. But you're not getting a bittorrent of a 1.76 trillion parameter neural network anyway.
-
datarama replied to Graham Spookyland🎃/Polynomial
@gsuberland @mcc This isn't why the AI craze has made me anxious, but it *is* why I have become terribly depressed.
I like writing code and making various weird computer programs, and sharing them with people for mutual entertainment and occasional enlightenment. Now I can't do that without accepting that everything I do will be appropriated and commoditized by some of the most horrible people in tech, unless I do it in secret.
And then what's the point?
-
@datarama @oblomov @mcc @mark @gsuberland 1.76 trillion parameters is about a hard drive's worth of data, no?
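A rough back-of-envelope check of that estimate, as a minimal sketch. The parameter count is the figure quoted in the thread, not a confirmed number, and the storage precision is an assumption, so three common widths are shown:

```python
# Back-of-envelope size of a model's raw weights at different precisions.
# PARAMS is the figure quoted in the thread (unconfirmed). The byte widths are
# the standard sizes of each numeric format; which one applies is an assumption.
PARAMS = 1.76e12

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weights_size_tb(params: float, bytes_per_param: int) -> float:
    """Return raw weight storage in decimal terabytes (1 TB = 1e12 bytes)."""
    return params * bytes_per_param / 1e12


sizes = {name: weights_size_tb(PARAMS, width) for name, width in BYTES_PER_PARAM.items()}

for name, tb in sizes.items():
    print(f"{name}: {tb:.2f} TB")
```

Under those assumptions the weights come to somewhere between about 1.76 TB and 7.04 TB, so storage alone is on the scale of one or a few consumer drives. This says nothing about the memory or compute needed to actually run or retrain the model.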
-
@datarama @gsuberland @mcc > And then what's the point?
I'm feeling exactly the same way and I'm really struggling with it.
Not just code but blog posts/tutorials as well. I've "lost" my main creative outlets.
-
@mcc @csolisr @datarama I see what you're saying. It's really tough that it's all gone so bad. The potential here for good is insane and yet because of the immense success of a single company (and the failure to regulate their unlicensed use of this data) the business model for this technology may simply be as exploitative as possible going forward.
If I'm honest it leaves me wanting to jump ship. But then again, all of this could change on the basis of a single court case.
-
@spaduf @csolisr @datarama Not especially looking forward to it because (1) a court case with a "positive" outcome would probably strengthen copyright in general, which I would consider bad for me and (2) In a world where AI tech is already established but a copyright regime where models are unambiguously derivative works is introduced, the biggest name in AI suddenly becomes Disney
-