Recent comments in /f/MachineLearning

Narootomoe t1_jds6i3w wrote

Reply to comment by enryu42 in [D] GPT4 and coding problems by enryu42

That's a good way to put it that I don't think I've seen yet. May I steal it?

"If a human had instant recall to all the knowledge GPT4 has, it wouldn't stumble on any of these problems", something like that

3

BeautifulLazy5257 t1_jds5lt4 wrote

How does ReAct work? Is it just a type of prompt engineering that directs the model to choose between a few tool descriptions?

Is it a type of sentiment analysis that chooses?

How can I recreate ReAct-iveness from scratch? What does the workflow look like?
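
My rough guess is that the loop looks something like the sketch below. Everything here is made up for illustration (`call_llm` and the two toy tools are placeholders, not any real library), so correct me if the actual ReAct flow is different:

```python
import re

# Placeholder for whatever model you're using (OpenAI API, llama.cpp, etc.)
def call_llm(prompt: str) -> str:
    raise NotImplementedError("swap in your own model call here")

# Toy "tools" the model gets to pick from; their descriptions live in the prompt.
def calculator(expr: str) -> str:
    return str(eval(expr))  # demo only -- never eval untrusted input

def lookup(query: str) -> str:
    return f"(stub) pretend search result for: {query}"

TOOLS = {"calculator": calculator, "lookup": lookup}

PROMPT_HEADER = """Answer the question. You can use these tools:
calculator[expression] - evaluates a math expression
lookup[query] - returns a short search result

Use exactly this format:
Thought: your reasoning
Action: tool_name[input]
Observation: (filled in by the system)
...repeat Thought/Action/Observation as needed...
Final Answer: the answer
"""

def react(question: str, max_steps: int = 5) -> str:
    transcript = PROMPT_HEADER + f"\nQuestion: {question}\n"
    for _ in range(max_steps):
        output = call_llm(transcript)
        transcript += output + "\n"
        if "Final Answer:" in output:
            return output.split("Final Answer:")[-1].strip()
        # Parse the model's chosen action and run the matching tool
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", output)
        if not match:
            break
        tool_name, arg = match.groups()
        tool = TOOLS.get(tool_name)
        observation = tool(arg) if tool else f"unknown tool: {tool_name}"
        transcript += f"Observation: {observation}\n"
    return "(no final answer within step limit)"
```

So as far as I can tell there's no sentiment analysis involved: the model just emits an "Action:" line, you run the tool, paste the result back in as "Observation:", and keep looping until it says "Final Answer:". Is that basically it?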

2

uspmm2 t1_jds5ium wrote

It's interesting, but the title should have mentioned that it's about code contest problems, which are things nobody writes in the real world.

2

enryu42 OP t1_jds49mm wrote

Reply to comment by maskedpaki in [D] GPT4 and coding problems by enryu42

Well, they do, and quite successfully; this is what these sites are about...

Of course, if you ask some frontend engineer to solve a math-y problem, they'll be confused. But that is simply because they lack the knowledge, and GPT4 evidently doesn't have this issue. Moreover, I doubt any human programmer would have trouble with the "Beginner" problems, regardless of their specialization.

15

light24bulbs t1_jds3mdl wrote

What's the underlying approach here? Just prompt engineering, right?

I really, really want to apply the ToolFormer paper to LLaMA. They're both Facebook systems, so you can bet they've done it.

ToolFormer just seems like SUCH a good and thorough approach. There are quite a few gaps between the paper and building a working example, IMO, but it's clearly doable.

The way Facebook licensed the weights is frustrating me. We should all be passing around Alpaca-trained, GPTQ-quantized, SparseGPT-optimized, LLaMA-derived models by now. Is there some Telegram group I need to be in or something?

18

liqui_date_me t1_jds3b6q wrote

Reply to comment by ngildea in [D] GPT4 and coding problems by enryu42

I would say it's controversial among many folks who aren't directly involved in programming and who get impressed by cute demos on Twitter. People who actually know how to code see it as a superpower that makes them more efficient, while also lamenting how it makes silly mistakes.

https://www.reddit.com/r/cscareerquestions/comments/1226hcn/im_worried_about_ai_taking_our_jobs/

I highly doubt that software engineering jobs will become obsolete. There's going to be a lot of disruption and there might be some wage deflation too (imagine the price of writing the boilerplate components of an iOS app goes from 50,000 dollars to 50 dollars), but so much of software engineering is testing, QA and human collaboration. I think we're just going to have to re-orient our careers around correcting code from LLMs.

6

enryu42 OP t1_jds3452 wrote

Reply to comment by lambertb in [D] GPT4 and coding problems by enryu42

I absolutely agree that it is useful. Even CoPilot is amazing at autocompleting "dumb" boilerplate code, which is a nontrivial amount of the code overall. However, these problems are designed to be challenging (these are competitions after all), and require ideas/intelligence to be solved. Apparently GPT4 cannot do it at all, so IMO it would be a stretch to call whatever it is doing "intelligence".

18

lambertb t1_jds24dr wrote

It cannot solve all coding problems, but it can solve many. And if the user is reasonably experienced, even code with errors is useful, because the errors can be corrected quickly. Preliminary evaluations show a 40% increase in developer productivity from GitHub Copilot, and that seems totally plausible to me.

57

ghostfaceschiller t1_jds202e wrote

Reply to comment by enryu42 in [D] GPT4 and coding problems by enryu42

Yeah, it's essentially that at an automated level. Tbh it is powerful enough, based on results so far, that I would actually be really surprised if it did not yield very significant gains on these tests.

I'm sure there will be a paper out doing it in like the next few days, so we'll see

15

ThirdMover t1_jds1kid wrote

The lead may not always be obvious, and the trade-off for transparency may be worth it. LLMs (or rather "foundation models") will continue to capture more and more areas of competence. If I want one that, for example, serves as the front-end chatbot for a store I have, so that people can ask for product explanations, do I then need the 500 IQ GPT-7 that won two Nobel prizes last year?

I think it's most likely that there will always be huge black-box models that form the peak of what is possible with machine intelligence, but what people use and interact with in practice will simply be "good enough" smaller, open-source models.

20

enryu42 OP t1_jds18j4 wrote

I absolutely agree; however, these models have repeatedly exceeded expectations (e.g. 5 years ago I thought that "explaining jokes" would be a hard problem for them, with similar reasoning...).

I tried this because I've heard that there are people inside the competitive programming community claiming that GPT4 can solve these problems. But from what I gather, it is still not there.

12