Recent comments in /f/MachineLearning
truchisoft t1_je1huuz wrote
Reply to comment by visarga in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Point taken, though this article is also filled with holes.
fiftyfourseventeen t1_je1gprd wrote
Reply to comment by utopiah in [D] FOMO on the rapid pace of LLMs by 00001746
If you just want to change the output of a model to look more like something else in its training data, sure. LoRA trains the attention layers (technically it trains a separate set of low-rank matrices, but they can be merged back into the attention weights), so it doesn't necessarily add anything NEW per se, but rather focuses on things the model has already learned. For example, if you were to try to make a model work well with a language not in its training data, LoRA is not going to work very well. However, if you want the model to respond in a dialogue-like setting (as is the case with Alpaca), it can work because the model has already seen dialogue, so the LoRA makes it "focus" on producing dialogue.
You can get useful results with just LoRA, which is nice. If you want to experiment with architecture improvements or large-scale finetunes / training from scratch, you are out of luck unless you have millions of dollars.
I'd say the biggest limitation of LoRA is that your model for the most part already has to "know" how to do everything you are asking of it. It's not a good solution for adding new information to the model (e.g. training it on information after 2021 to make it more up to date); that requires a full finetune, which is a lot more expensive.
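For intuition, here is a rough PyTorch sketch of the idea (a simplification, not the reference implementation from the LoRA paper or the peft library; the class and parameter names are made up for illustration): the pretrained weight stays frozen, only a small low-rank update is trained, and that update can later be merged back into the weight.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative low-rank adapter around a frozen linear layer."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # frozen pretrained weight
        self.scale = alpha / rank
        # Only these two small matrices are trained.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        """Fold the low-rank update into the original weight: W + scale * (B @ A)."""
        self.base.weight += self.scale * (self.B @ self.A)
        return self.base
```

Because only `A` and `B` are trained, the adapter can steer behavior the base model already supports, but it has far too few parameters to teach it genuinely new knowledge.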
As for the cost, I honestly don't know because these companies don't like to make data like that public. We don't even know for sure what hardware GPT-3 was trained on, although it was likely V100s, and then A100s for GPT-3.5 and GPT-4. I think people calculated the least they could have spent on training was around $4.5 million for GPT-3, and $1.6 million for LLaMA. That doesn't even include all the work that went into building an absolutely massive dataset and paying employees to figure out how to do distributed training across tens of thousands of nodes with multiple GPUs each.
All-DayErrDay t1_je1g2d8 wrote
Reply to comment by bjj_starter in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Exactly!
regalalgorithm t1_je1eu1e wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
FYI, the GPT-4 paper has a whole section on contamination in the appendix - I found it pretty convincing. Removing contaminating data did make it worse on some benchmarks, but also better on others, and overall the effect wasn't huge.
antonivs t1_je1d8o0 wrote
Reply to comment by dancingnightly in [D] FOMO on the rapid pace of LLMs by 00001746
The training corpus size here is in the multi-TB range, so probably isn't going to work with the OpenAI API currently, from what I understand.
antonivs t1_je1cuw1 wrote
Reply to comment by Craksy in [D] FOMO on the rapid pace of LLMs by 00001746
My description may have been misleading. They did the pretraining in this case. The training corpus wasn't natural language, it was a large set of executable definitions written in a company DSL, created by customers via a web UI.
currentscurrents t1_je1cjaq wrote
Reply to "[D]" Is wandb.ai worth using? by frodo_mavinchotil
Wandb can be run locally. There is also TensorBoard.
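If the concern is sending data to wandb.ai, one option (a minimal sketch, assuming the `wandb` Python package is installed) is offline mode, which keeps runs on local disk instead of syncing them:

```python
import wandb

# Log locally without syncing to wandb.ai; runs are written to ./wandb/ on disk.
run = wandb.init(project="demo", mode="offline", config={"lr": 1e-3})
for step in range(100):
    wandb.log({"loss": 1.0 / (step + 1)}, step=step)
run.finish()
```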
Quintium t1_je1c17v wrote
Reply to comment by i_am__not_a_robot in My ChatGPT Chrome Extension that saves conversations in .md files is finally approved by the Chrome Web Store. It's still and so will continue to be Open Source. [P] by ThePogromist
He did answer with the same thing I did, although in a somewhat confusing way.
CommunismDoesntWork t1_je1bklp wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
Maybe figure out how to train an LLM with far less data and much faster?
currentscurrents t1_je1ai1i wrote
Reply to comment by nixed9 in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I asked it for a parody and got something similar to, but different from Weird Al's song: https://pastebin.com/FKrZiEi9
When I asked it to be original I got quite different lyrics: https://pastebin.com/uwpqAnyz
Here are the actual lyrics for reference. This reminds me of how you can get LLMs to be less toxic/biased just by telling them to treat people fairly.
mcilrain t1_je1a7cl wrote
Reply to comment by currentscurrents in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Current tech could be used to allow you to ask an AI assistant to read you a book.
mcilrain t1_je19vif wrote
Reply to comment by cegras in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Even if it didn't ingest PDFs it probably ingested websites that scraped PDFs to spam search engine results.
Beautiful-Gur-9456 OP t1_je18sc5 wrote
Reply to comment by noraizon in [P] Consistency: Diffusion in a Single Forward Pass 🚀 by Beautiful-Gur-9456
You're totally right 😅 I think the true novelty here is dropping distillation and introducing a BYOL-like simple formulation. Bootstrapping always feels like magic to me.
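For anyone curious what the bootstrapping looks like, here is a heavily simplified PyTorch sketch of the distillation-free training step (the `model(x, sigma)` signature, the noise schedule, and the squared-error distance are placeholder assumptions; the actual method also uses a skip-connection parameterization and a schedule for the number of noise levels): the online network is trained to match an EMA copy of itself evaluated at a less-noisy point on the same trajectory, so no pretrained teacher is needed.

```python
import torch

def consistency_training_step(model, ema_model, x0, sigmas, optimizer, ema_decay=0.999):
    """One illustrative step; x0 is clean data of shape (B, D), sigmas an
    increasing 1-D tensor of noise levels, model(x, sigma) a network that
    maps a noisy input back toward the clean data."""
    n = torch.randint(0, len(sigmas) - 1, (x0.shape[0],))
    z = torch.randn_like(x0)
    x_more_noisy = x0 + sigmas[n + 1].view(-1, 1) * z
    x_less_noisy = x0 + sigmas[n].view(-1, 1) * z
    with torch.no_grad():
        target = ema_model(x_less_noisy, sigmas[n])      # bootstrap target, no teacher model
    pred = model(x_more_noisy, sigmas[n + 1])
    loss = ((pred - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # BYOL-like EMA update of the target network.
    with torch.no_grad():
        for p_ema, p in zip(ema_model.parameters(), model.parameters()):
            p_ema.mul_(ema_decay).add_(p, alpha=1 - ema_decay)
    return loss.item()
```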
currentscurrents t1_je17y5v wrote
Reply to comment by hardmaru in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
>Why are deep learning technologists so overconfident
>A Narayanan, S Kapoor
>Substack newsletter. AI Snake Oil
You can get your blogposts listed on Google Scholar?
i_am__not_a_robot t1_je17nwu wrote
Reply to comment by Quintium in My ChatGPT Chrome Extension that saves conversations in .md files is finally approved by the Chrome Web Store. It's still and so will continue to be Open Source. [P] by ThePogromist
I'd like to hear OP's take before I try his extension.
Historical-Tree9132 t1_je17bln wrote
Reply to comment by wazis in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Going 0/24 and 0/12 on code problems it had never seen before really surprised me.
RossoMarra t1_je16mod wrote
Reply to comment by hadaev in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I really think you are underestimating biologists.
sebzim4500 t1_je16h58 wrote
Reply to comment by gunbladezero in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy
I'm going to simplify a bit here; if you want a more complete answer I can write something up. I was planning on writing a blog post about this, because it is relevant to why ChatGPT does so much better when asked to show its working.
Basically, LLMs do not have any memory except what you see in the output. You may think that the network just needs to decode the base64 once and then use it to answer all the questions, but in actuality it needs to do it for every single token.
This is compounded by the fact that decoding base64 like this is a per-character operation, which GPT-n is especially bad at due to its choice of tokenizer. Since it can only use a finite amount of computation per token, wasting computation in this way decreases its effectiveness.
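For illustration, here's a small sketch with the tiktoken library (assuming it's installed; exact counts depend on the string) showing how base64 text shatters into tokens whose boundaries have nothing to do with the underlying characters, which is part of why per-character work is so costly for the model:

```python
import base64
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by GPT-3.5/GPT-4

plain = "The quick brown fox jumps over the lazy dog"
encoded = base64.b64encode(plain.encode()).decode()

plain_tokens = enc.encode(plain)
b64_tokens = enc.encode(encoded)

print(len(plain_tokens), len(b64_tokens))           # the base64 form typically needs far more tokens
print([enc.decode([t]) for t in b64_tokens[:10]])   # token boundaries ignore character structure
```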
Here's an example where simply making GPT-4 reverse the string makes it completely unable to do a straightforward calculation, unless you let it show its working.
Quintium t1_je16gpd wrote
Reply to comment by i_am__not_a_robot in My ChatGPT Chrome Extension that saves conversations in .md files is finally approved by the Chrome Web Store. It's still and so will continue to be Open Source. [P] by ThePogromist
I'm assuming it's because "pogromist" is similar to "programmist", which is Russian for programmer. Also, "pogrom" isn't only used in the context of massacring ethnic groups; it can also refer to riots and chaos in general.
CatalyzeX_code_bot t1_je161ww wrote
Reply to comment by SFDeltas in Variance in reported results on ImageNet between papers [D] by kaphed
fixing :) sorry about that
currentscurrents t1_je15i85 wrote
Reply to comment by krali_ in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
That's still on a waitlist unfortunately.
GPT-4 is good but slow; at least for now I mostly still use the GPT-3.5 model.
[deleted] t1_je15c4c wrote
Reply to comment by bjj_starter in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
[deleted]
was_der_Fall_ist t1_je15397 wrote
Reply to comment by ntaylor- in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
You don’t think the neural network, going through hundreds of billions of parameters each time it calculates the next token, is doing anything complicated?
marr75 t1_je14tki wrote
Reply to comment by wazis in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
me irl
notforrob t1_je1lowh wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
This inspired me to ask GPT-4:
"Can you generate a leetcode easy problem that has never been seen?"
And then ask it to solve the problem it creates. In the few cases I tried, it failed miserably.
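For anyone who wants to try the same experiment, here is a rough sketch using the 2023-era `openai` Python client (assumes the `OPENAI_API_KEY` environment variable is set and the account has GPT-4 access; note the second call does not carry over the first conversation):

```python
import openai  # 0.x-style client, current as of early 2023

def ask(prompt: str) -> str:
    # Single-turn chat completion; the API key is read from OPENAI_API_KEY.
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

problem = ask("Can you generate a leetcode easy problem that has never been seen?")
solution = ask(f"Solve this problem in Python:\n\n{problem}")
print(problem, solution, sep="\n\n")
```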