Recent comments in /f/MachineLearning

dudaspl t1_je3zq9d wrote

Exactly. LLMs mimic intelligence by just generating text, and since they are trained on civilization-level knowledge/data they do it very well and can seem as intelligent as humans.

The real test is to put them in novel scenarios and see whether their intelligence can produce solutions to those, e.g. put them in some sort of escape room and see if they can escape.

1

SnooMarzipans3021 t1_je3x9ah wrote

I'm unable to load the full-resolution image into the model and train it, even with batch size 1 and all sorts of optimizations. My idea is to add two small modules to my network: one at the front which downscales the image, and one at the back which upscales it again.

The problem here is the upscaling; it will need to be some sort of super-resolution model.
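
Roughly what I have in mind, as a minimal sketch (the module designs are placeholders, and the upscaling head is just a PixelShuffle block rather than a proper super-resolution model):

```python
import torch
import torch.nn as nn

class DownscaleHead(nn.Module):
    """Small learned downscaler in front of the backbone (placeholder design, 4x)."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=3, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class UpscaleHead(nn.Module):
    """Sub-pixel (PixelShuffle) upscaler behind the backbone, SR-style (4x)."""
    def __init__(self, channels=3, factor=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels * factor ** 2, kernel_size=3, padding=1),
            nn.PixelShuffle(factor),  # trades channels for spatial resolution
        )

    def forward(self, x):
        return self.net(x)

# The backbone only ever sees the 4x-downscaled image; the output is full-res again.
backbone = nn.Identity()  # stand-in for the existing model
model = nn.Sequential(DownscaleHead(), backbone, UpscaleHead())
out = model(torch.randn(1, 3, 2048, 2048))
print(out.shape)  # torch.Size([1, 3, 2048, 2048])
```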

1

OkWrongdoer4091 t1_je3vscq wrote

My only hope is that my meta-reviewers will be diligent enough to reach out to the reviewers who didn't respond and ask whether they changed their minds after reading the rebuttals. Not responding to the rebuttals defeats the purpose of the rebuttal (and of the review itself). If you volunteer to serve the community as a reviewer, why not commit to doing a good job of it? I wish I could contact the ACs, but the buttons are no longer available on OpenReview.

1

Beautiful-Gur-9456 OP t1_je3uesm wrote

I haven't done it yet, but I'm working on it! Their suggested sampling procedure requires multiple FID calculations, so I'm thinking about how to incorporate it efficiently.

Their scale is indeed large; it would cost me a few hundred bucks to train on CIFAR-10. My checkpoint was trained at a much smaller scale 😆

1

Craksy t1_je3tzt3 wrote

Aah, got you. My bad. Well, I suppose most people mainly think of NLP in these kinds of contexts. That's where my mind went, anyway.

Training from scratch on a DSL is indeed an entirely different scale of problem (assuming it's not some enormous, complex DSL that relies heavily on context and thousands of years of culture to make sense of).

Sounds very interesting though. If you're allowed to share more information, I'd love to hear about it.

3

geekfolk t1_je3qyfr wrote

I don't know about this model, but GANs are typically smaller than diffusion models in terms of number of parameters. The image structure thing probably has something to do with the network architecture, since GANs rarely use attention blocks, whereas the architecture of diffusion models is more hybrid (typically CNN + attention).
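
As a rough illustration of what I mean by "hybrid" (a convolutional residual block followed by self-attention, in the style of typical diffusion U-Nets; the layer sizes here are made up):

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Conv residual block + self-attention, the mix typical of diffusion U-Nets."""
    def __init__(self, channels=64, heads=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.GroupNorm(8, channels),
            nn.SiLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.norm = nn.GroupNorm(8, channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        x = x + self.conv(x)  # local structure from the CNN part
        b, c, h, w = x.shape
        tokens = self.norm(x).flatten(2).transpose(1, 2)  # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)   # global interactions
        return x + attn_out.transpose(1, 2).reshape(b, c, h, w)

x = torch.randn(2, 64, 16, 16)
print(HybridBlock()(x).shape)  # torch.Size([2, 64, 16, 16])
```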

1

Beautiful-Gur-9456 OP t1_je3qsdu wrote

Nope. I mean the LPIPS loss, which kinda acts like a discriminator in GANs. We can replace it with MSE without much degradation.

Distilling a SOTA diffusion model is obviously cheating 😂, so I didn't even think of it. In my view, they're just apples and oranges. We can augment diffusion models with GANs and vice versa to get the most out of both, but what's the point? That would make things way more complex. It's clear that diffusion models cannot beat SOTA GANs at one-step generation; GANs have been tailored for that particular task for years. But we're just exploring possibilities, right?

Aside from the complexity, I think it's worth a shot to drop the LPIPS loss and adversarially train a discriminator in its place. Using a pretrained VGG is cheating anyway. That would be an interesting direction to see!
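
For context, a minimal sketch of the LPIPS-vs-MSE swap I'm describing, using the `lpips` package (the tensors are placeholders, not my actual training code; the adversarial variant would swap the perceptual term for a discriminator loss):

```python
import torch
import lpips  # pip install lpips; learned perceptual image patch similarity

# Pretrained-VGG perceptual loss vs. a plain pixel-wise MSE baseline.
lpips_loss = lpips.LPIPS(net='vgg')
mse_loss = torch.nn.MSELoss()

# Placeholder tensors standing in for the one-step generation and its target,
# scaled to [-1, 1] as lpips expects.
generated = torch.rand(4, 3, 32, 32) * 2 - 1
target = torch.rand(4, 3, 32, 32) * 2 - 1

perceptual = lpips_loss(generated, target).mean()  # uses pretrained VGG features
pixelwise = mse_loss(generated, target)            # no pretrained network involved
print(perceptual.item(), pixelwise.item())
```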

2

LeEpicCheeseman t1_je3p1no wrote

Really depends on how "general" you define AGI to be.

To me, AGI means developing agents that can operate autonomously in the real world and make sensible decisions across a wide range of situations and domains. I don't think we're currently very close to developing these sorts of agents, although it probably isn't more than a couple decades away.

2