Recent comments in /f/MachineLearning
sb1729 t1_jdzgfff wrote
Reply to comment by Simcurious in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
They mention that in the article.
mrpickleby t1_jdzg5e8 wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
This implies that AI will speed up the dissemination of information but won't necessarily help create new thinking.
rfxap t1_jdzfxd1 wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
There are other benchmarks to look at though. Microsoft Research tried an early version of GPT-4 on LeetCode problems that were published after the training data cutoff date, and they got results similar to human performance in all difficulty categories: https://arxiv.org/abs/2303.12712 (page 21)
What should we make of that?
Janderhungrige t1_jdzetzd wrote
Reply to comment by VectorSpaceModel in [D] Is French the most widely used language in ML circles after English? If not, what are some useful (natural) languages in the field of machine learning? by Subject_Ad_9680
Yes, but an interesting fact is that ChatGPT-4's score is higher for French than for Mandarin (83.6% vs. 80.1%), with English at the top (85.5%).
Janderhungrige t1_jdzel3h wrote
Reply to [D] Is French the most widely used language in ML circles after English? If not, what are some useful (natural) languages in the field of machine learning? by Subject_Ad_9680
Looking at ChatGPT-4 scores ... laughing in German while looking at French (83.7% vs. 83.6%)
SnooMarzipans3021 t1_jdzehws wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hello, does anyone have suggestions on how to do guided image upscaling?
Basically, I have a 6000x6000 image that I'm unable to load into the network because of GPU memory. I had the idea of resizing the image to something like 1500x1500 and then upscaling it back to 6000x6000. But I have to do it without losing details, and I don't want to use super-resolution models (I'm afraid they will hallucinate and inpaint). If I already have the ground-truth resolution, how can I use it to guide the upscaling?
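A minimal sketch of what one form of "guided" upscaling could look like here, assuming the OpenCV contrib build (`cv2.ximgproc`) is installed; the file paths and filter parameters are placeholders, not a tested recipe:

```python
import cv2
import numpy as np

# Full-resolution original (6000x6000) and the network output at 1500x1500.
# File names are placeholders.
full_res = cv2.imread("original_6000.png").astype(np.float32) / 255.0
low_res_out = cv2.imread("network_output_1500.png").astype(np.float32) / 255.0

# Plain bicubic upsample back to the original resolution.
h, w = full_res.shape[:2]
upsampled = cv2.resize(low_res_out, (w, h), interpolation=cv2.INTER_CUBIC)

# Guided filter: coarse content comes from the upsampled network output,
# while edges and fine texture are driven by the original high-res pixels,
# so no detail has to be hallucinated.
guided = cv2.ximgproc.guidedFilter(full_res, upsampled, 8, 1e-3)

cv2.imwrite("guided_upscaled.png",
            np.clip(guided * 255.0, 0, 255).astype(np.uint8))
```

The radius and eps values control how strongly the guide's detail is transferred and would need tuning for the specific images.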
VertexMachine t1_jdzehvy wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Interesting. Potentially something that might be also used in the ongoing lawsuit against copilot?
petrastales OP t1_jdzefzu wrote
Reply to comment by AuspiciousApple in [D] Can DeepL learn from edits to the translations it produces immediately? by petrastales
I believe I understand what you said - so it learnt in the context of my input, but that wouldn’t translate to learning and applying that knowledge to the translations of other users?
petrastales OP t1_jdzed2e wrote
Reply to comment by r_linux_mod_isahoe in [D] Can DeepL learn from edits to the translations it produces immediately? by petrastales
What does that mean?
master3243 t1_jdzec5r wrote
Reply to comment by hadaev in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Seeing how they made sure the bar exam and the math olympiad tests were recent ones explicitly stated not to be in the training dataset, I trusted that all the other reported tests had been picked just as carefully to avoid contamination.
ghostfaceschiller t1_jdzduen wrote
Reply to comment by Riboflavius in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Looolllll
FinancialElephant t1_jdzdm7x wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
I was way more impressed by MuZero when it came out. I feel crazy for not being that impressed by these LLMs. I do think they are changing the world, but I don't see this as some huge advancement in ML so much as an advanced ingestion and regurgitation machine. All the "intelligence" is downstream from the humans who generated the data.
Honestly, I think the reason it made such a huge splash is that the RLHF fine-tuning made the models especially good at fooling humans. It feels more like a hack than a big step in AI. My biggest worry is that people will expect too much out of these things, too soon. There seems to be a lot of fear and exuberance going around.
mrdevlar t1_jdzdi2t wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Proof that no matter where you go, it is always going to be possible to make simple mistakes.
sebzim4500 t1_jdzczty wrote
Reply to comment by gunbladezero in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy
I think you're forcing the model to waste the lower layers on each step decoding that base64 string. Let it output the word normally and you would probably see much better performance. Just don't look at the first output if you still want to play it like a game.
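A rough sketch of that suggestion (the model name, prompts, and the older 0.x-style `openai` client call are assumptions, not the original setup): the oracle states its word in plain text once, and whoever runs the game simply never shows that first message to the guesser.

```python
import openai  # assumes the older 0.x client; newer versions use a different interface

def chat(messages):
    # Thin wrapper around the chat completions endpoint.
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return resp["choices"][0]["message"]["content"]

# The oracle commits to its word in plain text -- no base64 -- so later turns
# don't have to re-decode it at every step.
oracle = [{"role": "system",
           "content": "Pick a common noun and state it plainly in one word. "
                      "Afterwards, answer only yes/no questions about it."}]
hidden_word_message = chat(oracle)  # the referee never reads or shows this
oracle.append({"role": "assistant", "content": hidden_word_message})

# The guesser only ever sees the oracle's yes/no answers.
oracle.append({"role": "user", "content": "Is it a living thing?"})
print(chat(oracle))
```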
hadaev t1_jdzcowi wrote
Reply to comment by Seankala in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Well, we usually expect such a trivial mistake from people who aren't really data scientists, like biologists using DS methods.
It doesn't seem hard to search for matches in text, unlike other data types.
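For what it's worth, the kind of text-match check being described can be as simple as an n-gram overlap test; a toy sketch (the corpus files, n-gram size, and threshold are arbitrary placeholders):

```python
def ngrams(text, n=13):
    # Lowercased whitespace tokens -> set of n-gram strings.
    toks = text.lower().split()
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def looks_contaminated(problem_text, corpus_ngrams, n=13, threshold=0.2):
    # Flag a benchmark problem if a large share of its n-grams appear
    # verbatim in the training corpus.
    grams = ngrams(problem_text, n)
    if not grams:
        return False
    hits = sum(g in corpus_ngrams for g in grams)
    return hits / len(grams) >= threshold

corpus_ngrams = ngrams(open("training_dump.txt").read())        # placeholder file
print(looks_contaminated(open("benchmark_problem.txt").read(),  # placeholder file
                         corpus_ngrams))
```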
utopiah t1_jdzcevv wrote
Reply to comment by fiftyfourseventeen in [D] FOMO on the rapid pace of LLMs by 00001746
Disastrous_Elk_6375 t1_jdzccfq wrote
Is zero-shot really the strength of GPT, and especially ChatGPT? From my (limited) experience interacting with ChatGPT, the value seems to come from prompt understanding and adaptation to my follow-up prompts / corrections. In the context of an assistant, I'm OK with priming the conversation first if it can handle the subsequent requests better.
andreichiffa t1_jdzcbln wrote
Depends on which hardware you have. A rule of thumb is that if you want to be efficient, you need about 3x the model size in VRAM to store the optimizer state, plus some headroom for data.
You also need to use float for training, due to stability issues. So unless your GPU supports float8, double the VRAM.
Realistically, if you have an RTX 4090, you can go up to 6-7B models (Bloom-6B, GPT-J, …). Anything below that, and I would aim at 2.7B models (GPT-Neo).
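A back-of-the-envelope sketch of that rule of thumb (the 3x multiplier is the one above; bytes per parameter and headroom are assumptions, and real usage depends on precision, optimizer, and any offloading):

```python
def training_vram_gb(n_params_billion, bytes_per_param=2, multiplier=3.0, headroom_gb=2.0):
    # weights + gradients + optimizer state ~= multiplier * model size, plus data headroom
    model_gb = n_params_billion * 1e9 * bytes_per_param / 1024**3
    return multiplier * model_gb + headroom_gb

for name, size_b in [("GPT-Neo 2.7B", 2.7), ("GPT-J 6B", 6.0)]:
    # bytes_per_param=2 assumes fp16/bf16 weights; use 4 for fp32
    print(f"{name}: ~{training_vram_gb(size_b):.0f} GB")
```

Whether a given model then actually fits on a 24 GB card comes down to the precision and optimizer choices.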
I would avoid the LLaMA family, for liability reasons, because of how you get access to the pretrained model weights, and stick with FOSS models. With the latter you can also contribute back and gain some visibility that way, assuming you want it.
shiuidu t1_jdzc9o6 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I have a project I want to build a natural language interface for. Is there a simple way to do this? It's a .NET project, but I also have a Python project I want to do the same thing for.
asdfzzz2 t1_jdzbuav wrote
https://arxiv.org/abs/2212.14034 might be a good starting point.
killver t1_jdzbsaz wrote
I think the tricky thing about actually validating zero-shot capabilities is again a question of in-sample vs. out-of-sample. Which of these samples has ChatGPT actually already seen?
Craksy t1_jdzbgzj wrote
Reply to comment by kalakau in [D] FOMO on the rapid pace of LLMs by 00001746
Not at all.
While it doesn't mean the world for the point I was trying to make, it does change the meaning quite a bit.
Thank you for the correction
Riboflavius t1_jdzb56p wrote
Reply to comment by ghostfaceschiller in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I was reading your reply and couldn't help thinking that the italics and then the missing period make it look like the end of it is already red-shifted because we're accelerating so fast.
kalakau t1_jdzb1jx wrote
Reply to comment by Craksy in [D] FOMO on the rapid pace of LLMs by 00001746
> Generalized Pretrained Transformer
this is pedantic but it's actually Generative PT