Recent comments in /f/MachineLearning
visarga t1_jdtypz6 wrote
Reply to comment by Haycart in [D] GPT4 and coding problems by enryu42
Doesn't autoregressive decoding cache the states for the previous tokens when decoding a new token?
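(Roughly what I mean, as a toy sketch: the keys/values for earlier tokens get computed once and reused at every later step, so only the new token costs anything. All names and shapes here are made up for illustration, not any particular library's API.)

```python
import numpy as np

d = 16  # toy hidden size
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

k_cache, v_cache = [], []  # one cached entry per already-decoded token

def decode_step(x_new):
    # Only the NEW token's query/key/value are computed here;
    # keys/values for previous tokens come straight from the cache.
    q = x_new @ Wq
    k_cache.append(x_new @ Wk)
    v_cache.append(x_new @ Wv)
    K, V = np.stack(k_cache), np.stack(v_cache)  # (t, d)
    scores = K @ q / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V  # attention output for the new token only

for _ in range(5):  # stand-in "token embeddings"
    out = decode_step(rng.normal(size=d))
```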
AllAmericanBreakfast t1_jdtynpv wrote
Reply to comment by nixed9 in [D] GPT4 and coding problems by enryu42
I tried this out, and it only had partial success.
First, just dumping in this prompt, then asking a question, resulted in the AI coming up with a laughably simple failed first response, followed by a critique and improvement. It is as if it recognized that the easiest way to "demonstrate improvement" would be to set the bar low by failing utterly on the first attempt.
Then, I tried breaking it up into stages, asking for a response, getting a response, asking for a critique, getting a critique, asking for an improvement, and getting an improvement.
This worked better.
However, when I then asked for a second critique and improvement (again in separate stages), it started inventing fake problems to solve. I was asking it to implement a case-insensitive longest common substring function that returns the version of the LCS found in the longer of the two strings.
The second-pass critique was that the original (working) code didn't deal with the possibility that "the longer string may not contain the LCS", which is impossible given the way it was originally implemented. Then it added some extra code to deal with this "problem."
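For reference, my own sketch of the kind of function I was asking for (not the model's output) looks something like this:

```python
def longest_common_substring(a: str, b: str) -> str:
    """Case-insensitive longest common substring; returns the match
    as it appears in the LONGER of the two input strings."""
    longer, shorter = (a, b) if len(a) >= len(b) else (b, a)
    lo, sh = longer.lower(), shorter.lower()
    best_len, best_end = 0, 0
    # prev[j] = length of the common suffix of lo[:i] and sh[:j+1]
    prev = [0] * len(sh)
    for i, ch in enumerate(lo):
        curr = [0] * len(sh)
        for j, ch2 in enumerate(sh):
            if ch == ch2:
                curr[j] = (prev[j - 1] if j > 0 else 0) + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i + 1
        prev = curr
    # Slice from the longer string, preserving its original casing,
    # so "the longer string may not contain the LCS" can't happen.
    return longer[best_end - best_len:best_end]
```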
argusromblei t1_jdtyi2y wrote
Reply to comment by Ok-Wrangler-1075 in [D] GPT4 and coding problems by enryu42
The center of the maze. A journey inward not a journey upward ;)
visarga t1_jdtyd0c wrote
Reply to comment by trajo123 in [D] GPT4 and coding problems by enryu42
> Perhaps get augmented with some sort of LSTM architecture where state can be built up from a theoretically infinite amount of input
That would be sweet, infinite input. Does RWKV do it?
Anonymous_Penguin1 t1_jdty4ed wrote
Reply to comment by [deleted] in [D] ICML 2023 Reviewer-Author Discussion by zy415
It seems that you can still make an official comment on OpenReview now :) But I'm not sure whether that would violate some rule.
HatsusenoRin t1_jdty1by wrote
Reply to comment by ryanjkelly2 in [P] SimpleAI : A self-hosted alternative to OpenAI API by lhenault
Thanks. However, I don't think it was made specifically for this API, and it forces me to open a new account to use it (usually the app goes to my trash bin as soon as I realize that).
I'm surprised that there's no open-source frontend for testing this popular API.
visarga t1_jdtxxfd wrote
Reply to comment by blose1 in [D] GPT4 and coding problems by enryu42
You're mistaken: Olympiad problems require bespoke tricks that don't generalise from problem to problem. It's not a matter of breadth of knowledge; they don't test memorisation.
modeless t1_jdtx2eu wrote
Reply to comment by LanchestersLaw in [D] GPT4 and coding problems by enryu42
I like the idea of predicting the user's response. How's this as an architecture for a helpful agent:
Given a user question, before generating an answer you predict the user's ideal response to the model's answer (e.g. "thanks, that was helpful", or more likely a distribution over such responses), then generate an answer and iteratively optimize it to make that ideal user response more likely.
This way you're explicitly modeling the user's intent, and you can adapt the amount of computation appropriately for the complexity of the question by controlling the number of iterations on the answer.
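A rough sketch of the loop I'm imagining (`llm` and `score` are hypothetical stand-ins, not any real API):

```python
from typing import Callable

def answer(question: str,
           llm: Callable[[str], str],
           score: Callable[[str, str, str], float],
           n_iters: int = 3) -> str:
    """Hypothetical loop: model the user's intent as their ideal
    response, then optimize the answer to make that response likely."""
    # Predict the user's ideal reaction to a perfect answer
    # (in practice this could be a distribution, not one string).
    ideal = llm("Predict the user's ideal reaction to a perfect answer "
                "to this question:\n" + question)
    draft = llm("Answer this question:\n" + question)
    # More iterations = more compute for harder questions.
    for _ in range(n_iters):
        revised = llm(f"Question: {question}\nDraft answer: {draft}\n"
                      f"Revise the draft so the user's reaction "
                      f"would be: {ideal}")
        # score(...) = e.g. log-probability of the ideal reaction
        # conditioned on the question plus the candidate answer.
        if score(question, revised, ideal) > score(question, draft, ideal):
            draft = revised
    return draft
```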
visarga t1_jdtwr3g wrote
Reply to comment by yaosio in [D] GPT4 and coding problems by enryu42
> I am saying we don't know what consciousness is because we're missing information and we don't know what information we're missing
I take a practical definition: without consciousness we couldn't even find our mouth with our hand in order to eat.
he_who_floats_amogus t1_jdtwp8t wrote
Reply to [D] Build a ChatGPT from zero by manuelfraile
>open source dataset ... feasible to train with a commercial computer ... decent results
Choose two. Therefore, you can approach this in one of three ways:
- Use closed-source data (e.g. where your starting point is a pre-trained model and you're doing additional fine-tuning; see the sketch after this list)
- Use millions of dollars of compute resources (a "very good GPU - nvidia etc" does not meet this standard)
- Accept poor results
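For the first option, a minimal fine-tuning sketch might look like this (Hugging Face transformers; the "gpt2" checkpoint, corpus, and hyperparameters are toy placeholders, not a recipe for decent results):

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments,
                          DataCollatorForLanguageModeling)

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

texts = ["your domain text here ..."]  # replace with your own corpus
train_data = [tok(t, truncation=True, max_length=512) for t in texts]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out",
                           num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=train_data,
    # mlm=False -> causal LM objective; labels copied from input_ids
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```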
TehDing t1_jdtwa6k wrote
Reply to [D] GPT4 and coding problems by enryu42
I have not been impressed with LLMs' reasoning on novel puzzles/challenges. Ask any model to play Wordle with you: they are not good at it.
yaosio t1_jdtvycq wrote
Reply to comment by sdmat in [D] GPT4 and coding problems by enryu42
It's really neat how fast this stuff has been going. I remember when OpenAI claimed GPT-2 was too dangerous to release, which is amusing now because the output of GPT-2 is so bad. But when I used a demo that would write news articles from a headline I thought it was absolutely amazing. Then I, and most of the public, forgot about it.
Then GPT-3 comes out, and AI Dungeon used it until OpenAI censored it so hard that AI Dungeon stopped using it. The output was so much better than GPT-2's that I couldn't believe I had liked anything GPT-2 made. I told people this was the real deal, it's perfect and amazing! But it goes off the rails very often, and it doesn't understand how a story should be told, so it just does whatever.
Then ChatGPT comes out, which we now know is something like a finetune of GPT-3.5. You can chat and code with it, and it writes stories. The stories are not well written, but they follow the rules of storytelling and don't go off the rails. It wasn't fine-tuned for story writing the way AI Dungeon's GPT-3 was.
Then Bing Chat comes out, which turned out to be based on GPT-4. Its story-writing ability is so much better than ChatGPT's. None of that "once upon a time" stuff. The stories still aren't compelling, but they're way better than before.
I'm interested in knowing what GPT-5 is going to bring. What deficiencies will it fix, and what deficiencies will it have? I'd love to see a model that doesn't try to do everything in a single pass. Take coding: even if you use chain of thought and self-reflection, GPT-4 will try to write the entire program in one go. Once something is written, it can't go back and change it if it turns out to be a bad idea; it's forced to incorporate it. It would be amazing if a model could predict how difficult a task will be and then break it up into manageable pieces rather than trying to do everything at once.
FermiAnyon t1_jdtvv5w wrote
Reply to comment by kduyehj in Have deepfakes become so realistic that they can fool people into thinking they are genuine? [D] by [deleted]
Yeah, I'm not gonna hang my hat on a year. The most interesting and significant part about all this is that nobody seems to disagree with the claim that it's going to happen eventually, and I find it kind of amazing that we're messing with AI and having this conversation at all. I couldn't have imagined anything like this, well, like you said... 15 years ago.
Who knows what'll happen in the next 15
machineko t1_jdtv8jv wrote
Reply to comment by ephemeralentity in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
If you need help, come find us on our Discord channel.
Snoo_22479 t1_jdtv31f wrote
Anybody ever think that maybe someone was screwing with this guy? Like, when he got on his terminal, some of his coworkers were answering instead. I could see it starting out as a joke and spiraling out of control, like everybody wanted in on it.
Then once corporate found out they decided to keep it a secret. Because this guy was doing some serious free advertising for Google.
[deleted] t1_jdtutpe wrote
Reply to comment by artsybashev in [D] GPT4 and coding problems by enryu42
[removed]
COMPEWTER_adminisp t1_jdtuqar wrote
Reply to comment by addition in [D] GPT4 and coding problems by enryu42
You don't think people at OpenAI already have this and are just putting out the simple version?
COMPEWTER_adminisp t1_jdtugix wrote
Reply to comment by E_Snap in [D] GPT4 and coding problems by enryu42
> Once these things can continuously take input and output, we’ll probably see quite the rush of advancement.
Interesting!
[deleted] t1_jdtubol wrote
Reply to comment by Hamoodzstyle in [D] GPT4 and coding problems by enryu42
[removed]
Flag_Red t1_jdtskoy wrote
Reply to comment by LanchestersLaw in [D] GPT4 and coding problems by enryu42
It's not really accurate to say it's "only considering one token at a time". Foresight and (implicit) planning are taking place. You can see this clearly during programming tasks, where imports come hundreds of tokens before they are eventually used.
jasondads1 t1_jdtrp9l wrote
Reply to comment by kduyehj in Have deepfakes become so realistic that they can fool people into thinking they are genuine? [D] by [deleted]
With the speed at which AI is improving, it would probably be less than a year.
Nowado t1_jdtr40r wrote
Reply to comment by CobaltAlchemist in [D] GPT4 and coding problems by enryu42
I do the same thing I'd do with a human: ask it to repeat and rephrase the instructions. After that I'm sure, and it has multiple forms of the instruction available, so it gets less hung up on some exact wording.
red75prime t1_jdtqsmj wrote
Reply to comment by Narootomoe in [D] GPT4 and coding problems by enryu42
Does GPT-4 have instant recall of all of its training data? I doubt it. It probably has some emergent structures akin to episodic memory, but it seems to have trouble distinguishing its memories from its hallucinations, so it's not a fully functional episodic memory (it lacks metamemory or something like that).
AngusDHelloWorld t1_jdtq232 wrote
Reply to comment by rya794 in [P] Using ChatGPT plugins with LLaMA by balthierwings
And not everyone cares about open source. At least for non-technical people, as long as they can get things done, it's good enough for them.
rokuyou t1_jdtyqe7 wrote
Reply to [D] GPT4 and coding problems by enryu42
"GPT4 and competitive programming problems" would be a better title, since not everyone is going to read that.