Recent comments in /f/MachineLearning

sam__izdat t1_jch4kn0 wrote

It is a "structured thing" because it has concrete definable grammatical rules, shared across essentially every language and dialect, and common features, like an infinite range of expression and recursion. If language didn't have syntactic structure we'd just be yelling signals at each other, instead of doing what we're doing now. There would be nothing for GPT to capture.

1

ilrazziatore t1_jch3vpu wrote

Uhm..... the BNN is built assuming distributions both on the parameters (i.e. the values taken by the network weights) and on the data (the last layer has 2 outputs: the predicted mean and the predicted variance). Those 2 values are then used to model the loss function, which is the likelihood, a product of Gaussians. I think it's both model and data uncertainty.

Let's say I compare the variances and the mean values predicted.

Do I have to set aside the same calibration and test datasets for both models, or use the entire dataset? The MCMC model can use the entire dataset without the risk of overfitting, but for the BNN that would be like cheating.
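Roughly, the last layer looks like this (a minimal sketch under my own assumptions, PyTorch with illustrative names and sizes):

```python
import torch
import torch.nn as nn

# Sketch: a head with two outputs, predicted mean and predicted
# log-variance, trained with a Gaussian negative log-likelihood.
class GaussianHead(nn.Module):
    def __init__(self, in_features: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        self.mean = nn.Linear(hidden, 1)
        self.log_var = nn.Linear(hidden, 1)  # log-variance for numerical stability

    def forward(self, x):
        h = self.body(x)
        return self.mean(h), self.log_var(h)

def gaussian_nll(y, mean, log_var):
    # Negative log of a product of Gaussians = sum (here: mean) of
    # per-sample NLLs, with the additive constant dropped.
    return 0.5 * (log_var + (y - mean) ** 2 / log_var.exp()).mean()

x = torch.randn(32, 8)  # dummy batch
y = torch.randn(32, 1)
model = GaussianHead(8)
loss = gaussian_nll(y, *model(x))
loss.backward()
```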

1

BrotherAmazing t1_jch3dkl wrote

I agree with your sentiment and have no problem with that.

There just seem to be more than one or two people here with the idea that Corporate entities have generally published a higher % of their R&D than they actually ever have. Some people (not saying you personally) seem to go further and believe it is companies' duty to publish important IP and research.

I like them publishing and think it's great, but I just believe they never have a "duty" to do so if they don't want to, and I have seen companies that "publish" still hold a lot back behind the scenes.

1

BrotherAmazing t1_jch23xt wrote

In the real-world cases I have been involved in, granted it was only four cases, things did not at all play out that way. Once it went to court but the defendant settled on terms favorable to the plaintiff; once the defendant complied with the cease and desist before the lawsuit was initiated; and the other two times it actually went to trial and wasn't settled (which they told me was rare), with the plaintiffs winning once and the defendants winning once.

What you say really is not true, because once you win or lose in court it cannot be tried again; it's a settled matter, and that process does legally settle whether there is infringement or not. No one sits around after the verdict is read and scratches their head, wondering whether they are infringing or not.

1

BrotherAmazing t1_jch1gll wrote

I never said they don't publish; re-read.

I can tell you firsthand that what they publish has to get approval, and a lot of things do not get approval to publish and are held as trade secrets. It boggles my mind that this sub clearly has so many people who have never worked on the Corporate side of this industry and yet hold these strong ideas that the Corporate side is, or has ever been, fully transparent and allows employees to publish anything and everything. That is so far from the truth it's not funny.

For every model and paper that is published, there exists another model and many other papers that are not approved for publication, and many exist in a different format as internal publications only. Other internal publications get watered down, and a lot of extra work is omitted, in order to get approval to publish. Or they publish "generation 3" to the world while they're working on "generation 5" internally.

1

sam__izdat t1_jch1c32 wrote

I'm familiar with the terms, but saying e.g. "imaginary numbers don't exist because they're called imaginary" is not making a meaningful statement. All you've said is that German is not C++, and we have a funny name for that. And that's definitely one of the fuzzier interactions you can have about this, but I'm not sure how it proves that natural languages (apparently? if I'm reading this right...) lack structure.

1

LeN3rd t1_jcgzk3c wrote

If it is model uncertainty, the BNN should assume distributions only for the model parameters, no? If you make the samples a distribution, you assume data uncertainty. Also, I do not know exactly what your other model gives you, but as long as you get variances, I would just compare those at first. If the models give vastly different means, you should take that into account; there is probably some nice way to add this ensemble uncertainty to the uncertainty of the models. Vastly different means would also strongly suggest that one model is biased and does not give you a correct estimate of the model uncertainty.
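As a sketch of what I mean (numpy, with hypothetical numbers): the law of total variance combines the spread between the models with each model's own predicted variance:

```python
import numpy as np

# Hypothetical predictions from two models at the same test points:
# per-model predicted means and predicted variances.
means = np.array([[1.0, 2.0, 0.5],
                  [1.2, 1.8, 0.7]])
variances = np.array([[0.10, 0.20, 0.05],
                      [0.15, 0.25, 0.08]])

aleatoric = variances.mean(axis=0)  # average predicted (data) uncertainty
epistemic = means.var(axis=0)       # disagreement between the models
total_std = np.sqrt(aleatoric + epistemic)  # law of total variance
```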

1

ilrazziatore t1_jcgy9ya wrote

Model uncertainty. One model is a calibrated BNN (I split the dataset into a training, a calibration and a test set); the other is a mathematical model derived from some physical relations. For computational reasons the BNN assumes i.i.d. samples normally distributed around their true values and maximizes the likelihood (modeled as a product of normal distributions); the mathematical model instead relies on 4 coefficients and is fitted using Monte Carlo with a multivariate likelihood using the full covariance matrix. I want to compare the quality of the model uncertainty estimates, but I don't know if I should do it on the test dataset for both. After all, models calibrated with MCMC methods do not overfit, so why split the dataset?
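For context, the split I did looks roughly like this (placeholder data, scikit-learn; the sizes are just illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))  # placeholder features
y = rng.normal(size=1000)       # placeholder targets

# 60% train, 20% calibration, 20% test; both models would be
# scored on the same held-out test set.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
```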

1

bartturner t1_jcgvl8h wrote

> I agree, but what I'm saying is that Deepmind is gonna stop publishing their good stuff. And it's not because of OpenAI.

I do not believe that will happen. But the behavior of OpenAI does not help.

But Google has been more of a leader than a follower, so hopefully the crappy behavior by OpenAI does not change anything.

I think the sharing of the research papers was done for a variety of reasons.

First, I fully agree it was to keep and retain talent, which Google understood before others would be critical. It's why they were able to get DeepMind for $500 million, and that would be easily 20x that today.

But the other reason is data. Nobody has more data than Google, or access to more data.

Google has the most popular web site in history, plus the second most popular, and they also have the most popular operating system in history.

So even if everyone had access to the same models, Google would still be in a better position.

Another reason is that Google touches more people than any other company, by a wide margin. Google now has 10 different services with over a billion daily active users.

Then the last reason is the hope that no one else gets something they cannot get. I believe Google's goal from day 1 has always been AGI; that is what search has been about since pretty much day 1.

They worry that someone will figure it out in some basement somewhere. Very unlikely, but possible. If they can help drive a culture of sharing, then it is far less likely to happen.

1

existential_one t1_jcgur9j wrote

I agree, but what I'm saying is that Deepmind is gonna stop publishing their good stuff. And it's not because of OpenAI.

IMO ML research papers weren't profitable before, and companies benefited from the collective effort, plus it helped retain talent. But now we're seeing ML models have a huge impact on companies, and single incremental papers can actually improve the bottom line, so all companies are gonna start closing their doors.

5

bartturner t1_jcgu47z wrote

Love how much DeepMind shares in their papers. Same with Google Brain.

To me the issue is OpenAI. What makes it worse is they use breakthroughs from DeepMind, Google Brain and others and then do not share.

We call them filches

7

LeN3rd t1_jcgu1z5 wrote

This is possible in multiple ways. Older methods would view this as an inverse problem and apply an optimization method to it, like ADMM or FISTA.

If lots of data is missing (in your case, the complete R and G channels), you should use a neural network for this. You are on the right track, though it could get hairy. If you have a prior (you have a dataset and you want it to work on similar images), a (cycle)GAN or a retrained Stable Diffusion model could work.

I am unsure about VAEs for your problem, since you usually train them with the same input and output. You shouldn't force the latent to be only the blue channel, since then the encoder is useless. Training only the decoder side is essentially what GANs and diffusion networks do, so I would start there.
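As a rough sketch of that decoder-only direction (PyTorch; the architecture and loss are just illustrative assumptions, not a recipe):

```python
import torch
import torch.nn as nn

# Tiny conv net mapping the blue channel to the missing R and G channels.
net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 2, 3, padding=1), nn.Sigmoid(),  # R and G in [0, 1]
)

blue = torch.rand(8, 1, 64, 64)       # dummy blue-channel batch
target_rg = torch.rand(8, 2, 64, 64)  # dummy ground-truth R and G channels
loss = nn.functional.l1_loss(net(blue), target_rg)
loss.backward()
```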

1

LeN3rd t1_jcgsjxq wrote

Define "probabilistic": is it model uncertainty or data uncertainty? Either way, you should get a standard deviation from your model (either as an output parameter, or implicitly via ensembles) that you can compare.

1

No_Complaint_1304 t1_jcgshk7 wrote

Well, I did expect this, but still, months! I'll look into everything you mentioned. And I'll drop the project for now; if I can't finish it by studying heavily, I might as well learn slowly but surely, absorb all the information, and then go back to making a project that involves predictions and analyzing data. ty4ur help

1