Recent comments in /f/MachineLearning
was_der_Fall_ist t1_je3ng6m wrote
Reply to comment by bartvanh in [D] GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
Maybe that’s part of the benefit of using looped internal monologue/action systems. By having them iteratively store thoughts and other intermediate outputs in their context window, they no longer have to use the weights of the neural network to “re-think” every thought each time they predict a token. They could think more effectively by spending that computation on other operations that build on the stored thoughts and actions.
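A minimal sketch of that kind of loop, purely for illustration (`llm` stands in for any text-completion call, and the prompt format and stop condition are assumptions, not any particular system's API):

```python
# Sketch of a looped internal-monologue agent: each step's "thought" is
# appended to the context window, so later steps can build on it instead
# of re-deriving it inside the network's weights at every token.
def run_agent(llm, task, max_steps=5):
    context = [f"Task: {task}"]
    for _ in range(max_steps):
        prompt = "\n".join(context) + "\nThought:"
        thought = llm(prompt)              # one pass over the accumulated context
        context.append(f"Thought: {thought}")
        if "FINAL ANSWER" in thought:      # naive stop condition
            break
    return context
```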
[deleted] t1_je3nb9g wrote
ILOVETOCONBANDITS t1_je3m9ts wrote
Reply to comment by OkWrongdoer4091 in [D] ICML 2023 Reviewer-Author Discussion by zy415
I had a total of 4 reviewers. Went from (6, 6, 4, 3) to (6, 6, 5, 6). The first two reviewers didn't respond at all.
RandomScriptingQs t1_je3lv1g wrote
Reply to [D] Simple Questions Thread by AutoModerator
Is anyone able to contrast MIT's 6.034 "Artificial Intelligence, Fall 2010" versus 18.065 "Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018"?
I want to use the one that lies slightly closer to the theoretical/foundational side as supplementary study, and I have really enjoyed listening to both instructors in the past.
martianunlimited t1_je3lmsp wrote
Reply to [D] Prediction time! Lets update those Bayesian priors! How long until human-level AGI? by LanchestersLaw
Relevant publication: https://cdn.openai.com/papers/gpt-4.pdf
I can take comfort in knowing that while GPT-4 scores 10 percentile points better than me on GRE Verbal, I still score (slightly) better than GPT-4 on GRE Quantitative and very similarly on GRE Writing. (English is not my first language.)
Side note: I am surprised how poorly GPT-4 does in AP English Language and AP English Lit; I thought that as a large language model, it would have an advantage on that sort of question. (Sorry, not an American, I could be misunderstanding what exactly is being tested in those subjects.)
MrFlamingQueen t1_je3kywp wrote
Reply to comment by TheEdes in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Agreed. It's very likely contamination. Even "new" LeetCode problems existed before they were published on the website.
[deleted] t1_je3krtb wrote
petrastales OP t1_je3iv6k wrote
Reply to comment by Exodia141 in [D] Can DeepL learn from edits to the translations it produces immediately? by petrastales
It did fail 😅
shiuidu t1_je3iozf wrote
Reply to comment by GirlScoutCookieGrow in [D] Simple Questions Thread by AutoModerator
I'm not too sure either; I don't know enough about how APIs are connected to LLMs. Do you know what I should search for to implement the API so it can control the program?
geekfolk t1_je3io3b wrote
Reply to comment by Beautiful-Gur-9456 in [P] Consistency: Diffusion in a Single Forward Pass 🚀 by Beautiful-Gur-9456
Using pretrained models is kind of cheating; some GANs use this trick too (projected GANs). But as a standalone model, it does not seem to work as well as SOTA GANs (judging by the numbers in the paper).
>Still, it's a lot easier than trying to solve any kind of minimax problem.
This was true for GANs in the early days; however, modern GANs have been proven not to suffer mode collapse, and their training has been proven to converge.
>It's actually reminiscent of GANs since it uses pre-trained networks
I assume you mean distilling a diffusion model, as in the paper. There have been some attempts to combine diffusion and GANs to get the best of both worlds, but AFAIK none involved distillation. I'm curious whether anyone has tried distilling diffusion models into GANs; a rough sketch of what that might look like follows.
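A purely hypothetical sketch of such a distillation, assuming a pretrained multi-step diffusion sampler as the teacher and a one-step generator as the student. Every name here is a placeholder, and a real GAN-style distillation would add an adversarial term on top of this regression loss:

```python
# Hypothetical sketch: regress a one-step student generator onto samples
# from a pretrained diffusion teacher, sharing the same input latent.
import torch
import torch.nn.functional as F

def distill_step(student, teacher_sample, optimizer, batch_size=16, z_dim=128):
    z = torch.randn(batch_size, z_dim)
    with torch.no_grad():
        target = teacher_sample(z)         # expensive multi-step diffusion sampling
    loss = F.mse_loss(student(z), target)  # student matches teacher in one pass
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```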
Beautiful-Gur-9456 OP t1_je3hxbn wrote
Reply to comment by geekfolk in [P] Consistency: Diffusion in a Single Forward Pass 🚀 by Beautiful-Gur-9456
The training pipeline, honestly, is significantly simpler without adversarial training, so the design space is much smaller.
It's actually reminiscent of GANs since it uses pre-trained networks as a loss function to improve the quality, though it's completely optional. Still, it's a lot easier than trying to solve any kind of minimax problem.
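For reference, the minimax problem in question is the standard GAN objective, where the generator G and discriminator D are trained against each other:

```latex
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]
```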
uiucecethrowaway999 t1_je3hpyz wrote
Reply to [D] Prediction time! Lets update those Bayesian priors! How long until human-level AGI? by LanchestersLaw
>Making the bold and unscientific assumption that this sub is at least decently representative of people “in the know” on ML.
The increasing number of posts like this indicates that this may no longer be the case.
I’m not trying to be snarky or mean when I say this, but these sorts of posts offer pretty much zero insight or discussion value. There are a lot of very knowledgeable minds on this subreddit, but you won’t be able to get much out of it by asking such vague and sweeping questions.
jer_pint t1_je3ffzz wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
On a somewhat related note, I tested GPT-4's ability to play Wordle, and it was pretty bad. I think it has to do with the fact that Wordle only existed after GPT's training cutoff: https://www.jerpint.io/blog/gpt-wordle/
salgat t1_je3eqx5 wrote
Reply to comment by rfxap in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
GPT-4 is the world's best googler. As long as a similar solution existed on the internet in the past, there's a good chance GPT-4 can pick it up, even if it's not on LeetCode yet.
Eaklony t1_je3dqa5 wrote
I am doing the same thing as you. I am currently playing with GPT-2 since it's extremely small. When I'm comfortable, I plan to move on to GPT-J or other ~7B models. Finally, I'd like to try a 20B model as a big final project, since I saw you can fine-tune one on a 4090.
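A minimal GPT-2 fine-tuning sketch with Hugging Face transformers/datasets; the corpus path and hyperparameters are placeholders, not recommendations:

```python
# Minimal GPT-2 fine-tuning sketch; "my_corpus.txt" is a placeholder path.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-finetuned",
                           per_device_train_batch_size=4,
                           num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```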
DreamWithinAMatrix t1_je3c6kl wrote
Reply to comment by currentscurrents in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
There was that time Google was taken to court for scanning and indexing books for Google Books or whatever and Google won:
https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,_Inc.
Eaklony t1_je3btl6 wrote
Reply to [D] Prediction time! Lets update those Bayesian priors! How long until human-level AGI? by LanchestersLaw
I vote for 2030-2040 for an AI that, if you put it into my body, nobody who knows me would notice any difference, i.e., it could fake me perfectly. But by 2030, I am sure many people will start to at least believe there is already AGI.
[deleted] t1_je3ah6y wrote
Reply to comment by geekfolk in [P] Consistency: Diffusion in a Single Forward Pass 🚀 by Beautiful-Gur-9456
[deleted]
ghostfaceschiller t1_je3abdo wrote
Reply to comment by -xXpurplypunkXx- in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Really? I definitely had that some with 3.5, but 4 has been very good. Not perfect, obviously.
Beli_Mawrr t1_je394op wrote
Reply to comment by HatsusenoRin in [P] SimpleAI : A self-hosted alternative to OpenAI API by lhenault
Try jQuery AJAX; that works on the frontend: $.post, $.get, etc.
Coffee_Crisis t1_je392lv wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
If you search GitHub for unusual variable names or keywords, you will often find code that looks very similar to the stuff GPT spits out. In some domains it's much more copy-paste than people think.
[deleted] OP t1_je36gqo wrote
Reply to [D] I've got a Job offer but I'm scared by [deleted]
[deleted]
_sbmaruf t1_je369s5 wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Sorry for self-posting my work here, but you can take a look at our recent work: https://arxiv.org/abs/2303.03004
aidenr t1_je357dx wrote
Reply to [D] I've got a Job offer but I'm scared by [deleted]
Take the Coursera ML class. You're going to need applied linear algebra eventually, and that's easy enough to learn from that course.
hebweb t1_je3ofr1 wrote
Reply to [P] Consistency: Diffusion in a Single Forward Pass 🚀 by Beautiful-Gur-9456
Cool! This is amazing. You already created a pip package out of it. Have you measured the FID of your model? Does it match the numbers in the paper? I think their batch size and model size were pretty large, even for the CIFAR-10 training. Not sure if we can match that.
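If it helps, a minimal sketch of measuring FID with torchmetrics; the random uint8 tensors are placeholders standing in for real CIFAR-10 images and generated samples:

```python
# Minimal FID measurement sketch using torchmetrics.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)
real = torch.randint(0, 256, (64, 3, 32, 32), dtype=torch.uint8)  # real batch
fake = torch.randint(0, 256, (64, 3, 32, 32), dtype=torch.uint8)  # generated batch
fid.update(real, real=True)
fid.update(fake, real=False)
print(fid.compute())  # scalar FID estimate (unreliable with this few samples)
```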