About as open source as a binary blob without the training data
-
[email protected] replied to [email protected] last edited by
While I completely agree with 90% of your comment, that first sentence is gross hyperbole. I have used a number of open source options that are clearly better. 7-Zip is a perfect example. For over a decade it was vastly superior to anything else, open or closed. Even now it may be showing its age a bit, but it is still one of the best options.
But for the rest of your statement, I completely agree. And yes, CAD is a perfect example of the problems faced by open source. I made the mistake of thinking I should start learning CAD with open source so I wouldn't have to worry about getting locked into any of the closed source solutions. But FreeCAD is such a mess. I admit it has gotten drastically better over the last few years, but it still has serious issues. Don't get me wrong, I still 100% recommend that people learn it, but I push them towards a number of closed source options to start with. FreeCAD is for advanced users only.
-
[email protected] replied to [email protected] last edited by
Do you think your comments here are implying an understanding of the tech?
-
[email protected] replied to [email protected] last edited by
Judging by OP’s salt in the comments, I’m guessing they might be an Nvidia investor. My condolences.
-
[email protected] replied to [email protected] last edited by
Especially after it was founded as a nonprofit with the mission of pushing open source AI as far and wide as possible, to ensure a multipolar AI ecosystem where AI would keep other AI in check, so that AI would stay respectful and prosocial.
-
[email protected] replied to [email protected] last edited by
A model can be represented only by its weights in the same way that a codebase can be represented only by its binary.
Training data is a closer analogue of source code than weights.
-
[email protected] replied to [email protected] last edited by
7-zip
VLC
OBS
Firefox did it, only to mostly lose ground to Chrome, though Chrome itself is largely built on Chromium, which is open source.
Linux (superseded all the Unixes, and very severely curtailed the Windows Server market)
Nearly all programming language tools (IDEs, Compilers, Interpreters)
Essentially the entire command line ecosystem (obviously on the *nix side, but MS was pretty much compelled to open source PowerShell and their new Terminal to try to compete)

In some contexts you aren't going to have a lively enough community to drive a compelling product, even when there's enough revenue for a company to make a go of it, but to say 'no open source software has achieved that' is a bit much.
-
[email protected] replied to [email protected] last edited by
A model is an artifact, not the source. We also don't call binaries "open-source", even though they are literally the code that's executed. Why should these phrases suddenly get turned upside down for AI models?
-
Fushuan [he/him] replied to magic_lobster_party last edited by
Hey, I have trained several models in PyTorch, Darknet, and TensorFlow.
With the same dataset and the same training parameters, the same final iteration of training actually does return the same weights. There's no randomness unless they specifically add random layers, and that's not really a good idea with RNNs (at least it wasn't when I was working with them). In any case, the weights should converge to a very similar point even if randomness is introduced, or else the RNN is pretty much worthless.
-
magic_lobster_party replied to Fushuan [he/him] last edited by
There’s usually randomness involved with the initial weights and the order the data is processed.
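Both sources of randomness mentioned here, the initial weights and the data order, are exactly what a fixed seed pins down. A minimal pure-Python sketch (the function and its names are hypothetical, just to illustrate the point; real frameworks like PyTorch expose their own seeding calls):

```python
import random

def init_and_order(seed):
    """Simulate the two stochastic parts of a training run:
    random initial weights and the order the data is visited."""
    rng = random.Random(seed)                        # fixed seed -> reproducible run
    weights = [rng.uniform(-0.1, 0.1) for _ in range(4)]
    order = list(range(8))
    rng.shuffle(order)                               # data-shuffling order
    return weights, order

# Same seed: identical init and identical data order.
assert init_and_order(42) == init_and_order(42)
# Different seeds: a different run.
assert init_and_order(42) != init_and_order(43)
```

With the seed fixed, a rerun sees the same initial weights and the same data order, so this particular randomness disappears from the comparison.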
-
Fushuan [he/him] replied to magic_lobster_party last edited by
Not enough for the results to diverge. Randomness is added to avoid falling into local optima during optimization; you should still end up at the same global optimum. Models usually run until their optimization converges.
As stated, if the randomness is big enough that multiple reruns end up with different weights, i.e. optimized for different optima, the randomization is trash. Anything worth its salt won't have randomization that big.
So, going back to my initial point, we need the training data to validate the weights. There are ways to check the performance of a model (quite literally, the same algorithm used to evaluate weights during training is then used to evaluate the trained weights after training), and the performance should be identical, up to a very small rounding error, if a rerun uses the same data and parameters.
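The convergence argument is easiest to see on a toy convex loss, where gradient descent reaches the same minimum from any random start. A deliberately simple sketch (real deep nets have non-convex losses and only approximately satisfy this, which is why exact reruns also need fixed seeds):

```python
import random

def descend(start, lr=0.1, steps=200):
    """Gradient descent on the convex loss f(w) = (w - 3)^2.
    Any starting point converges to the same minimum, w = 3."""
    w = start
    for _ in range(steps):
        grad = 2 * (w - 3)     # df/dw
        w -= lr * grad
    return w

rng = random.Random(0)
runs = [descend(rng.uniform(-10, 10)) for _ in range(5)]
# Despite different random initialisations, all runs agree
# to within a tiny rounding error.
assert all(abs(w - 3) < 1e-6 for w in runs)
```

Each step shrinks the distance to the optimum by a constant factor (here 0.8), so after 200 steps the runs are numerically indistinguishable regardless of where they started.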
-
[email protected] replied to [email protected] last edited by
Nah, just a 21st century Luddite.
-
[email protected] replied to [email protected] last edited by
It's not like you need specific knowledge of Transformer models and whatnot to counterargue LLM bandwagon simps. A basic knowledge of Machine Learning is fine.
-
[email protected] replied to [email protected] last edited by
It's even crazier that Sam Altman and other ML devs said years ago that they had reached the peak of what current Machine Learning models were capable of.
But that doesn't mean shit to the marketing departments.
-
[email protected] replied to [email protected] last edited by
Sorry, that was a PR move from the get-go. Sam Altman doesn't have an altruistic cell in his whole body.
-
[email protected] replied to [email protected] last edited by
And you believe you’re portraying that level of competence in these comments?
-
Meta's "open source AI" ad campaign is so frustrating.
-
[email protected] replied to [email protected] last edited by
I at least do.
-
[email protected] replied to [email protected] last edited by
I mean, if you both think this is overhyped nonsense, then by all means short some Nvidia stock. If you know something the hedge fund teams don’t, why not sell your insider knowledge and become rich?
Or maybe you guys don’t understand it as well as you think. Could be either, I guess.
-
I don't care what Facebook likes or doesn't like. The OSS community is us.
-
[email protected] replied to [email protected] last edited by
“Look at this shiny.”
Investment goes up.
“Same shiny, but look at it and we need to warn you that we’re developing a shinier one that could harm everyone. But think of how shiny.”
Investment goes up.
“Look at this shiny.”
Investment goes up.
“Same shiny, but look at it and we need to warn you that we’re developing a shinier one that could harm everyone. But think of how shiny.”