Recent comments in /f/MachineLearning

biggieshiba t1_jdojnn6 wrote

I don't understand why anyone would care; in a few years half the internet will be AI-generated. If someone uses GPT-4 to generate a sentence that gets posted on Wikipedia, how will you know before using it? Don't you think many models will end up training on that sentence?

Plus, how would they even know? Training data is not easy to extract from a model. Unless you are a direct OpenAI competitor, they won't care or even look at you (well, maybe their superAI will).

Lastly, the dataset is full of errors; better to regenerate it, or even pay people, which would be quite cheap for 50k examples. It's quite a bad dataset when you really look at it: empty inputs or outputs, unclear instructions, instructions that don't fit the model's capabilities... The fact that it's bad and small is very encouraging, BTW, since the resulting model still performs pretty well.
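
For context, a quick pass like this surfaces many of those issues; a minimal sketch, assuming an Alpaca-style JSON file with `instruction`/`input`/`output` fields (the filename is hypothetical):

```python
import json

# Load an Alpaca-style instruction dataset: a list of dicts with
# "instruction", "input", and "output" keys. Filename is hypothetical.
with open("alpaca_data.json") as f:
    data = json.load(f)

# Flag examples with empty instructions or outputs, or suspiciously short outputs.
bad = [
    ex for ex in data
    if not ex.get("instruction", "").strip()
    or not ex.get("output", "").strip()
    or len(ex["output"].strip()) < 5
]
print(f"{len(bad)} of {len(data)} examples look broken")
```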

2

blose1 t1_jdoj8kl wrote

GPT models struggle with out-of-distribution programming tasks, which means they can't create novel ideas. I tested this myself many times, and it's not a prompt-engineering issue. I think LLMs could act as great teachers but not researchers: teachers teach what we already know, while researchers create the novel knowledge that teachers then use.

7

LahmacunBear t1_jdo7k0w wrote

Here’s a thought: the original GPT-3, at 175B parameters with the best data thrown at it, performed as it did. Add ChatGPT's training tricks, and suddenly the same size performs orders of magnitude better. I doubt current LLMs are anywhere near fully efficient; just as with GPT-3 to 3.5, we can keep getting much better results at the same size, and therefore today's results from much smaller models.

1

londons_explorer t1_jdo4kj3 wrote

Think how many hard drives there are in the world...

All of that data is potential training material.

I think a lot of companies/individuals might give up 'private' data in bulk for ML training if they got a real benefit from it (for example, a version of ChatGPT with perfect knowledge of all my friends and neighbours, what they like and do, etc., would be handy).

2

farmingvillein t1_jdo16sz wrote

> This 17-page paper could be a few sentences.

> TL;DR: the authors wrote prompts telling GPT-4 to fix code given some unit tests and the output of the broken code. It performs better than GPT-4 without access to the code's execution output.

I agree with your overall sentiment--the paper IMO could, at the very least, be substantially reorganized for clarity--but your summary isn't actually accurate, since the paper itself has nothing to do with coding(!).

The coding work is all in their blog post...

...which also suffers from the same issue: a long preamble you have to scroll through before finding the core nugget.
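
For reference, the loop that TL;DR describes (from the blog post) is roughly the following; a minimal sketch, with `llm_fix` standing in for whatever GPT-4 call you'd make, and all names hypothetical:

```python
import subprocess

def llm_fix(code: str, test_output: str) -> str:
    """Placeholder for a GPT-4 call: prompt it with the broken code
    plus the failing test output, and return the revised code."""
    raise NotImplementedError

def repair_loop(code: str, path: str, test_cmd: list[str], max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        with open(path, "w") as f:   # write the current candidate to disk
            f.write(code)
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:   # unit tests pass, stop here
            return code
        # Feed the execution output back to the model and try again.
        code = llm_fix(code, result.stdout + result.stderr)
    return code
```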

10

_underlines_ t1_jdo0o85 wrote

3

KarlKani44 t1_jdnzo65 wrote

>Plotting the scores the critic assigns for real and fake samples separately? Or do you mean taking mean and standard deviation of the logits for real and fake data and comparing those?

Both of those work. I like to plot the critic's outputs for the real samples as a histogram and then do the same for the generated samples. This shows you how well your critic separates real from fake samples. You can do this every few epochs during training. At early epochs the two histograms should barely overlap, and as training goes on they will move closer together.

It might look like this: https://imgur.com/a/OknV5l0

The left plot is from early in training; the right is after some epochs, once the critic has partially converged. By the end, the two will overlap almost completely.
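
If it helps, here's a minimal sketch of that diagnostic, assuming a PyTorch critic; `critic`, `real_batch`, and `fake_batch` are whatever you already have in your training loop:

```python
import matplotlib.pyplot as plt
import torch

@torch.no_grad()
def plot_critic_hist(critic, real_batch, fake_batch, epoch):
    # Score both batches with the current critic.
    real_scores = critic(real_batch).flatten().cpu().numpy()
    fake_scores = critic(fake_batch).flatten().cpu().numpy()
    # Overlay the two histograms; the less they overlap,
    # the better the critic separates real from fake.
    plt.hist(real_scores, bins=50, alpha=0.5, label="real")
    plt.hist(fake_scores, bins=50, alpha=0.5, label="fake")
    plt.xlabel("critic score")
    plt.ylabel("count")
    plt.legend()
    plt.title(f"epoch {epoch}")
    plt.savefig(f"critic_hist_{epoch:04d}.png")
    plt.close()
```

Call it every few epochs and flip through the saved images to watch the two distributions drift toward each other.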

2

Blutorangensaft OP t1_jdnxm94 wrote

I see, I will improve my critic then (maybe give it more depth) and abstain from tricks like TTUR for now.

What do you mean by "easily separable distribution of output logits" btw? Plotting the scores the critic assigns for real and fake samples separately? Or do you mean taking mean and standard deviation of the logits for real and fake data and comparing those?

1

sampdoria_supporter t1_jdnwwdd wrote

Does anybody else feel overwhelmed and frozen in the face of all these concurrent developments and releases? I can't even jump on much of what's going on, because it seems like the next day will just flip the table again.

2

farmingvillein t1_jdnwda6 wrote

> But apply those same tricks to a big model, and it works even better.

In general, yes, although there are many techniques that help small models that do not help large ones.

That said, I agree with your overall point. I think the only reason we won't see model sizes continue to inflate is if 1) there are substantial underlying architecture discoveries (possible!) or 2) we really hit problems with data availability. But synthetic + multi-modal data probably gives us a ways to go there.

2