Recent comments in /f/MachineLearning

currentscurrents t1_je14pi5 wrote

Clearly, the accuracy is going to have to get better before it can replace Google. It's pretty accurate when it knows what it's talking about, but if you go "out of bounds" the accuracy drops off a cliff without warning.

But the upside is that it can integrate information from multiple sources and you can interactively ask it questions. Google can't do that.

3

TheEdes t1_je149kf wrote

Yeah, but if you came up with a problem in your head that didn't exist word for word anywhere and GPT-4 solved it, then it would be doing what they're advertising. However, if the problem appeared word for word anywhere in the training data, then the test data is contaminated. If the model can learn the design patterns for LeetCode-style questions by looking at examples of them, then it's doing something really good. If it can only solve problems that it has seen before, then it's nothing special; they just overfit a trillion parameters on a comparatively very small dataset.

8

currentscurrents t1_je12d3k wrote

Nobody knows exactly what it was trained on, but there exist several datasets of published books.

>I'm highly surprised I haven't seen any news of authors/publishers suing yet tbh.

They still might. But they don't have a strong motivation; it doesn't directly impact their revenue, because nobody's going to sit in the ChatGPT window and read a 300-page book one prompt at a time.

6

DaBobcat t1_je12b4q wrote

Here OpenAI and Microsoft were evaluating GPT-4 on medical problems. In section 6.2 they specifically said that they found strong evidence that it was trained on "popular datasets like SQuAD 2.0 and the Newsgroup Sentiment Analysis datasets". In appendix section B they explain how they measured whether it saw something in the training data. Point is, I think benchmarks are quite pointless if the training dataset is private and no one can verify that the model was not trained on the test set, which they specifically said it was in many cases.
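For anyone wondering what a basic contamination check looks like when you do have the training corpus, here is a minimal sketch. To be clear, this is not the method from the paper's appendix; it is a plain n-gram overlap check, and the function names, the whitespace tokenization, and the choice of `n=8` are all my own assumptions for illustration. It also only works if the corpus is available, which is exactly what is missing when the training data is private.

```python
def ngrams(text, n):
    """Return the set of word-level n-grams in text (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(test_items, corpus_docs, n=8):
    """Flag test items that share at least one n-gram with the corpus.

    Hypothetical helper: returns (fraction_flagged, flagged_items).
    """
    # Collect every n-gram that appears anywhere in the corpus.
    corpus_ngrams = set()
    for doc in corpus_docs:
        corpus_ngrams |= ngrams(doc, n)

    # A test item is flagged if any of its n-grams also appears in the corpus.
    flagged = [item for item in test_items if ngrams(item, n) & corpus_ngrams]
    rate = len(flagged) / len(test_items) if test_items else 0.0
    return rate, flagged
```

A high flagged fraction would suggest the benchmark score partly measures memorization. A low one does not prove the opposite, since paraphrased or translated copies slip past exact n-gram matching.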

5

gigglegenius t1_je10itr wrote

With ChatGPT:

- Generate variations of ideas. I type in an idea I have and prompt it to generate variations

- Creative brainstorming for professional work

- Make me laugh by prompting to get some really abstract and surreal story lines

With Bing:

- Search for information quickly and have it summarized. It can be hit or miss, but it is definitely getting better at it

9

utopiah t1_je0zqae wrote

Well, I just did, so please explain why not; I'm genuinely trying to learn. I'd also be curious if you have a list of trained models compared by cost. I've only seen some order-of-magnitude CO2eq figures, not rough price estimates, so that would help me build a better intuition, as you seem to know more about this.

That being said, the point was that you don't necessarily need to train anything from scratch or buy anything to get useful results. You can rent per hour on the cloud and refine existing work, no?

0

RecoilS14 t1_je0x3ud wrote

I’m a new hobbyist programmer and have spent the last month or so learning Python (CS50, Mosh, random Indian guys, etc.), and I'm currently also watching the Stanford ML/DL lectures on YouTube.

I have started to learn ML, PyTorch, and some TensorFlow, along with how tensors and vectors work in ML.

I am wondering if anyone can point me in the direction of other aspects of ML/DL/neural networks that I may be missing out on. Perhaps a good lecture series that covers these subjects at length, and not just the programming side of it, so I can further understand the concepts.

I’m sure there are lots of things I’m missing on my journey, and some perspective would be nice.

1