Recent comments in /f/MachineLearning
MrFlamingQueen t1_je0w3ut wrote
Reply to comment by cegras in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Not sure about the training corpus, but like you mentioned, there are tons of other resources, like textbooks and solution manuals to textbook problems, on sites like GitHub, Stack Exchange, etc.
truchisoft t1_je0un8g wrote
Reply to comment by austacious in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Oh no no, that's not my argument here, but the way the title is worded looks like a sleazy attack. This isn't criticism, it reads like a hit piece, since, as other commenters mention, other independent tests have already been run on GPT-4 and people are already using GPT-4 for coding.
nomadiclizard t1_je0u5ex wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Haha, amateurs. I learned not to make that mistake when I split a visual pose estimation dataset into training and validation, but lots of the frames were almost-duplicates, so it got contaminated that way. >.<
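One common way to avoid that kind of leakage is to split at the clip/recording level instead of per frame, so near-duplicates from the same clip can never land on both sides. A minimal sketch using scikit-learn's GroupShuffleSplit (the `video_id` column and file layout here are made up for illustration, not from the original dataset):

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical frame index: one row per frame, tagged with the clip it came from.
frames = pd.DataFrame({
    "frame_path": [f"clip{c}/frame{i:04d}.png" for c in range(10) for i in range(100)],
    "video_id": [c for c in range(10) for _ in range(100)],
})

# Split at the clip level, so near-duplicate frames from the same clip
# cannot end up in both training and validation.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, val_idx = next(splitter.split(frames, groups=frames["video_id"]))

train_frames, val_frames = frames.iloc[train_idx], frames.iloc[val_idx]
# No clip appears in both splits.
assert set(train_frames["video_id"]).isdisjoint(val_frames["video_id"])
```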
fiftyfourseventeen t1_je0u1oj wrote
Reply to comment by utopiah in [D] FOMO on the rapid pace of LLMs by 00001746
You can't compare a LoRA to training a full model lol
keepthepace t1_je0u1de wrote
Reply to comment by lqstuart in [D] FOMO on the rapid pace of LLMs by 00001746
The US is not the only country in the world; maybe they won't be the first one on this.
corkbar t1_je0twj9 wrote
Reply to [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun
It's fun to show off when we have dinner guests over. "Hey, check out what my computer can do!"
meister2983 t1_je0s90f wrote
Reply to comment by ArnoF7 in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
GPT-4 is an extremely good pattern matcher, probably one of the best ever made. Most exam questions seem solvable with straightforward pattern matching (no backtracking). The same applies to basic coding questions: it performs roughly at the level of a human gluing Stack Overflow solutions together (with the obvious variable renaming, moving lines around, removing dead code, etc.).
It struggles at logical reasoning (when it can't "pattern match" the logical reasoning to something it's trained on).
Coding example:
- Had no problem writing a tax calculator for ordinary income with progressive tax brackets
- It struggles to write a program that calculates tax on long-term capital gains (US tax code), which is very similar to the above except it has an offset (you start bracket indexing at ordinary income). I'd think this is actually pretty easy for a CS student, especially one who had seen the solution above. GPT-4 struggled, though, because it doesn't really "reason" about code the way a human would, and it generated solutions that were obviously wrong to a human (rough sketch of the two cases below).
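A rough sketch of what I mean. The bracket numbers are placeholders rather than real tax figures, and real long-term capital gains brackets differ from the ordinary-income ones; the point is only that the second case is the same machinery with an offset:

```python
# Illustrative (bracket_floor, rate) pairs -- not real tax data.
BRACKETS = [(0, 0.10), (11_000, 0.12), (44_725, 0.22), (95_375, 0.24)]

def progressive_tax(amount, brackets, offset=0.0):
    """Tax `amount` of income that sits on top of `offset` already-counted income.

    With offset=0 this is the ordinary-income case. For long-term capital
    gains you pass ordinary income as the offset, so bracket indexing
    starts where ordinary income left off.
    """
    tax = 0.0
    top = offset + amount
    for i, (floor, rate) in enumerate(brackets):
        ceiling = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        lo = max(floor, offset)
        hi = min(ceiling, top)
        if hi > lo:
            tax += (hi - lo) * rate
    return tax

# Ordinary income: brackets indexed from zero (the case GPT-4 handled fine).
print(progressive_tax(60_000, BRACKETS))
# Long-term gains: same loop, but bracket indexing starts at ordinary income.
print(progressive_tax(20_000, BRACKETS, offset=60_000))
```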
ReasonablyBadass t1_je0rwr3 wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Is it possible that the older questions cover problems that are better known by now, so more training data existed for them, while the newer ones cover newer concepts that aren't really represented on the net yet?
RubenC35 t1_je0ra56 wrote
Reply to comment by rfxap in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Would they be a little biased? I mean, Microsoft has spent loads of money on the idea of being the best.
antonivs t1_je0pfza wrote
Reply to comment by happycube in [D] FOMO on the rapid pace of LLMs by 00001746
Thanks! I actually don't know exactly what this guy used, I'll have to check.
antonivs t1_je0pb85 wrote
Reply to comment by Qpylon in [D] FOMO on the rapid pace of LLMs by 00001746
Our product involves a domain-specific language, which customers typically interact with via a web UI, to control execution behavior. The first model this guy trained generates that DSL, so customers can enter a natural-language request and avoid having to go through a multi-step GUI flow.
They've tried using it for docs too; that worked well.
dancingnightly t1_je0o082 wrote
Reply to comment by antonivs in [D] FOMO on the rapid pace of LLMs by 00001746
The benefit of finetuning or training your own text model (in the olden days on BERT, nowadays through the OpenAI API) over just using contextual semantic search is shrinking day by day... especially with the extended context window of GPT-4.
If you want something in-house, finetuning GPT-J or the like could be the way to go, but it's definitely not the career direction I'd take.
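For what it's worth, a minimal sketch of the contextual-semantic-search route (pre-1.0 `openai` client style from around this time; the model names, chunking, and prompt are my own assumptions, not an official recipe):

```python
import numpy as np
import openai  # pre-1.0 client interface

openai.api_key = "YOUR_KEY"  # placeholder

def embed(texts, model="text-embedding-ada-002"):
    resp = openai.Embedding.create(input=texts, model=model)
    return np.array([d["embedding"] for d in resp["data"]])

# 1. Embed your document chunks once (chunking strategy is up to you).
chunks = ["...doc chunk 1...", "...doc chunk 2...", "...doc chunk 3..."]
chunk_vecs = embed(chunks)

def answer(question, top_k=2):
    # 2. Embed the query and take the most similar chunks by cosine similarity.
    q = embed([question])[0]
    sims = chunk_vecs @ q / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q))
    context = "\n\n".join(chunks[i] for i in np.argsort(sims)[::-1][:top_k])
    # 3. Stuff the retrieved context into the prompt instead of finetuning anything.
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp["choices"][0]["message"]["content"]
```

The whole "index once, retrieve per query, stuff into the prompt" loop is what keeps eating into the finetuning use case.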
Exodia141 t1_je0nc53 wrote
I believe it remembers the context of the conversation. Try the second translation in a different chat context. It should fail.
subhash165 t1_je0mvvw wrote
Reply to comment by OrionJr in [R] Build and personalize LLMs on your own data - Take back control with xTuring! by x_ml
I was able to run it with WSL (with a miniconda environment).
JealousAd8448 t1_je0mp1z wrote
Reply to comment by OrionJr in [R] Build and personalize LLMs on your own data - Take back control with xTuring! by x_ml
Unfortunately DeepSpeed is not easy to install on Windows. Just use WSL; it didn't give me any problems using conda with Python 3.8.
x_ml OP t1_je0mp14 wrote
Reply to comment by OrionJr in [R] Build and personalize LLMs on your own data - Take back control with xTuring! by x_ml
DeepSpeed doesn't work on Windows yet, but we were able to install it in WSL. My colleague installed DeepSpeed in conda and then installed our package, and it seemed to work.
ianitic t1_je0mjqx wrote
Reply to comment by cegras in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Oh, I haven't tested this on textbooks, but I have asked ChatGPT to give me pages of a novel and it did, word for word. I suspect it had to have been trained on PDFs? I'm honestly surprised I haven't seen any news of authors/publishers suing yet, tbh.
It is obvious whether or not a book is part of its training set, though, based on the above test.
AsliReddington t1_je0mg2b wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
It's a smarter talking parrot is all.
st8ic t1_je0lxgn wrote
Reply to comment by truchisoft in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
"bro it's great trust me" isn't exactly a scientific way to think about these issues.
lqstuart t1_je0jvt7 wrote
Reply to comment by keepthepace in [D] FOMO on the rapid pace of LLMs by 00001746
100%. I think the US really, REALLY needs to figure out a universal basic income soon, but they aren't going to do it, and life is going to suck.
cegras t1_je0jsud wrote
Reply to comment by MrFlamingQueen in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Do you know if ChatGPT was allowed to ingest PDFs found on the internet? Even if not, I'm sure there are many sections of famous textbooks reproduced in HTML or parsable form.
MrFlamingQueen t1_je0j29h wrote
Reply to comment by cegras in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
It feels like the majority of the people in this discussion have no idea what computer science is or what LeetCode tests.
As you mentioned, there are hundreds of websites devoted to teaching the LeetCode design patterns and entire books on learning and practicing these problems.
DrinkHumblyDumbly t1_je0ifon wrote
Reply to [D] Simple Questions Thread by AutoModerator
What type of data drift describes changes to future data that are partly caused by the deployment of the ML model itself?
polygon_primitive t1_je0x04y wrote
Reply to comment by cegras in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
For finding answers it's about the same as Google, sometimes better if you then verify the result with external sources, but that's mainly because Google has so badly corrupted their core search product while chasing profit. It's been pretty useful for me for the grunt work of writing boilerplate code and refactoring stuff, though.