Recent comments in /f/MachineLearning

1998marcom t1_jcf3er8 wrote

Detail note: to the best of my knowledge, given what OpenAI is doing right now with their software, they could very well be using GPL code in their stack without violating any of the GPL's clauses. A stricter licence such as the AGPL would, I guess, be needed to cover not only shipping the software to the customer but also merely running it as a service.

1

suflaj t1_jcevhag wrote

I understand what the user is saying; I do not understand how it relates to anything said before that.

Sadly, while GPT-4 may be able to predict the next token based on studying the language's syntax thoroughly, it still fails to actually understand anything. Unless the original commenter is a bot, I would expect them to explain how what they said has anything to do with my comment, or with the claims made about NLP researchers' obsolescence due to its release.

−2

SvenAG t1_jceuwcf wrote

I don't think that we are generally in an era of closed research now - there are still many companies sharing their research, ideas and concepts. If you are interested, we are currently trying to build an open alternative: https://opengpt-x.de/en/

But we are still in the early stages.

1

AccountGotLocked69 t1_jceti2q wrote

I mean... If this holds true for other benchmarks, it would be a huge shock for the entire community. If someone published a paper showing that AlexNet beats ViT on ImageNet when you simply train it for ten million epochs, that would be insane. It would mean all the research into architectures we did in the last ten years could be replaced by a good hyperparameter search and training longer.
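
To be concrete, the brute-force recipe being imagined is just a sweep like the one below: a toy sketch where scikit-learn's digits dataset stands in for ImageNet and a tiny MLP stands in for AlexNet, with made-up search ranges.

```python
# Toy sketch of "a good hyperparameter search plus training longer".
# The digits dataset and a small MLP are stand-ins; nothing here is an
# actual AlexNet-vs-ViT comparison.
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_grid = {
    "learning_rate_init": [1e-1, 1e-2, 1e-3],
    "max_iter": [50, 500, 5000],  # "training longer" treated as just another hyperparameter
}
search = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(64,), random_state=0),
    param_grid,
    n_jobs=-1,
)
search.fit(X_train, y_train)
print("best params:", search.best_params_)
print("held-out accuracy:", search.best_estimator_.score(X_test, y_test))
```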

2

AccountGotLocked69 t1_jcesw8m wrote

I assume by hallucinate gaps you mean interpolate? In general it's the opposite: smaller, simpler models are better at generalizing. Of course there are a million exceptions to this rule, but in the simple picture of using stable combinations of batch sizes and learning rates, big models will be more prone to overfitting the data. Most of this rests on the assumption that the "ground truth" is always a simpler function than memorizing the entire dataset.
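
A toy illustration of that last point, with polynomial degree standing in for model size; the function, noise level and degrees are made up, it just shows the shape of the effect.

```python
# A high-degree polynomial (the "big" model) memorizes noisy samples of a
# simple underlying function, while a low-degree fit (the "small" model)
# generalizes better to held-out points.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)

for degree in (3, 14):
    coeffs = np.polyfit(x_train, y_train, degree)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}  test MSE {test_mse:.3f}")
```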

2

sam__izdat t1_jceowxm wrote

Ridiculously unfounded claim based on a just plain idiotic premise. Children don't learn language by cramming petabytes of text documents to statistically infer the most plausible next word in a sentence, nor do they accept input with arbitrary syntactic rules. Right or wrong, the minimalist program and Merge offer a plausible partial explanation for a recent explosion of material culture -- which did not happen gradually or across multiple species -- consistent with what we can observe in real human beings. GPT, on the other hand, is not a plausible explanation for anything in the natural world, and has basically nothing inherently to do with human language. He's not wrong that it's a bulldozer. It will just as happily accommodate a made-up grammar that has nothing in common with any that a person could ever use, as it would English or Japanese.

> Chomsky et al. 2023 tilt at an imagined version of these models, while ignoring the fact that the real ones so aptly capture syntax, a success Chomsky and others have persistently claimed was impossible.

Exactly the opposite is true. Transformers are general-purpose computers that will gobble up almost anything you can throw at them. His objection was to the "defect" that it will capture any arbitrary syntax, which means it isn't interesting or helpful to cognitive scientists -- just like a backhoe doesn't offer any insight into how people, in biological terms, are able to lift heavy objects. What he said was impossible, when approached about it decades ago, was to do these things without resorting to brute force in the absence of an actual theoretical framework and computational model for how language works in the brain. That statement is just as correct today as it was in the 1950s, and the rigorous theory of "let's keep cramming in data and stirring the big ol' pot of linear algebra until candy comes out" doesn't do anything to change that picture.
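
To make the "arbitrary syntax" point concrete, here is a toy made-up grammar: an invented vocabulary plus a well-formedness rule based on counting, which no human language uses but which any off-the-shelf sequence model would absorb just as readily as English. Everything in it is invented for illustration.

```python
# Toy "impossible grammar": the closing marker is determined by sentence
# length mod 3, a counting rule unlike anything in natural language syntax.
import random

VOCAB = ["blick", "dax", "wug", "tove", "gorp"]
MARKERS = ["ka", "po", "zu"]

def sample_sentence(rng):
    n = rng.randint(2, 8)
    words = [rng.choice(VOCAB) for _ in range(n)]
    words.append(MARKERS[n % 3])  # the whole "syntax" of this language
    return " ".join(words)

rng = random.Random(0)
corpus = [sample_sentence(rng) for _ in range(5)]
print("\n".join(corpus))
# Dump a few million of these into any standard LM training loop and it will
# model this grammar as happily as it models English or Japanese.
```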

> In particular, we may view each model or set of modeling assumptions as a possible hypothesis about how the mind may work. Testing how well a model matches humanlike behavior then provides a scientific test of that model’s assumptions.

I see. So, by that reasoning, Deep Blue affirmed our hypothesis that Kasparov got really good at chess by evaluating 200,000,000 positions per second. Nonsense on stilts.

> Hosseini et al. 2022 present evidence that models trained on developmentally-plausible amounts of data already capture human neural responses to language in the brain

If you define "developmentally plausible" as "100 million tokens" and "capture human neural responses" as "fasten together a linear regression model to find statistically significant correlations with some coarse fMRI activity."
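
For reference, the analysis behind that kind of claim is roughly this shape; everything below is synthetic stand-in data rather than the actual Hosseini et al. pipeline, and the point is only how modest the underlying machinery is.

```python
# Ridge regression from model activations to voxel responses, scored by
# correlation on held-out sentences. All arrays are random stand-ins.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_sentences, n_features, n_voxels = 200, 768, 50
activations = rng.normal(size=(n_sentences, n_features))   # stand-in for LM activations
true_map = 0.05 * rng.normal(size=(n_features, n_voxels))
voxels = activations @ true_map + rng.normal(size=(n_sentences, n_voxels))  # noisy "fMRI"

X_tr, X_te, Y_tr, Y_te = train_test_split(activations, voxels, random_state=0)
pred = Ridge(alpha=100.0).fit(X_tr, Y_tr).predict(X_te)
scores = [pearsonr(pred[:, v], Y_te[:, v])[0] for v in range(n_voxels)]
print(f"mean held-out voxel correlation: {np.mean(scores):.2f}")
```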

> After decades of privilege and prominence in linguistics, Noam Chomsky’s approach to the science of language is experiencing a remarkable downfall. The story is, in part, a cautionary tale about what happens when an academic field isolates itself from what should be complementary endeavours.

Sound advice for overconfident doe-eyed futurists guzzling the Silicon Valley Kool-Aid and assuming they can dispense with common sense and just wing it with anthropology, biology and cog sci.

edit - and, by the way, it's actually kind of impressive for a person to miss this point so completely:

> Generate ten other sentences like [colorless green ideas]

> Purple fluffy clouds dream wildly

> Blue glittery unicorns jump excitedly

> Orange sparkly penguins waddle energetically

> Red shimmering elephants trumpet melodiously

> Yellow dazzling lions roar majestically

> Green glittery monkeys swing energetically

> Pink sparkly giraffes walk gracefully

> Black shiny kangaroos hop playfully

> White glittery bears cuddle affectionately

> Brown shimmering rabbits hop merrily.

Literally none of these cut-and-paste color-texture-animal-verb-adverb responses satisfies the query: the whole point of the original sentence is that it is syntactically well-formed but semantically incoherent, and every one of these is perfectly sensible. To get even a little bit closer with GPT, you have to explicitly beg for examples of contradictions and category errors.

8