Recent comments in /f/MachineLearning
stimulatedecho t1_jdzxtb6 wrote
"complex reasoning is perhaps the most interesting feature of these models right now and it is unfortunately mostly absent from this survey"
Bingo. It is also the hardest to quantify; it's one of those "I know it when I see it" behaviors. It is easy to imagine how one might harness that ability to reason to solve all sorts of problems, including (but certainly not limited to) improving benchmark performance. I think that is what has a lot of people excited.
crazyvaclav3 t1_jdzx86v wrote
Reply to comment by rshah4 in [D] FOMO on the rapid pace of LLMs by 00001746
Is the video available? I'd love to see it
jrkirby t1_jdzx1ef wrote
Reply to comment by hadaev in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I'm guessing the hard part is that you can't "untrain" a model. They hadn't thought "I want to benchmark on these problems later" when they started. Then they spent $20K+ of compute on training. Then they wanted to test it. You can easily find the stuff you want to test on in your training dataset, sure. But you can't so easily remove it and retrain everything from scratch.
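For illustration, here's a rough sketch of that first, easy half: scanning training documents for verbatim n-gram overlap with a benchmark problem. The corpus format and the 13-gram window are illustrative choices, not anyone's actual pipeline:

```python
# Sketch: detecting benchmark contamination by verbatim n-gram overlap.
# Finding contamination is the easy part; removing it means retraining.
def ngrams(text: str, n: int = 13) -> set:
    words = text.split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def contaminated(benchmark_problem: str, training_docs: list) -> bool:
    """True if any 13-gram of the problem appears verbatim in the training data."""
    shingles = ngrams(benchmark_problem)
    return any(s in doc for doc in training_docs for s in shingles)
```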
londons_explorer t1_jdzwcfo wrote
Reply to comment by keepthepace in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Problems like this are never 100% novel.
There are always elements and concepts of the problem and solution that have been copied from other problems.
The easiest way to see this is to ask a non-programmer to come up with a 'programming puzzle'. They'll probably come up with something like "Make an app to let me know when any of my instagram friends are passing nearby and are up for hanging out".
Compare that to a typical LeetCode problem, and you'll soon see how LeetCode problems are really only a tiny, tiny corner of what is possible to do with computers.
keepthepace t1_jdzvxl2 wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
Maybe I am stubborn, but I haven't totally digested the "bitter lesson" and I am not sure I agree with its inevitability. Transformers did not appear magically out of nowhere; they were a solution to RNNs' vanishing gradient problem. AlphaGo had to be wrapped in a Monte Carlo tree search to do anything good, and it is hard not to feel that LLMs' grounding issues may be a problem to solve with architecture changes rather than scale.
thecodethinker t1_jdzvin6 wrote
Reply to comment by bjj_starter in [D] GPT4 and coding problems by enryu42
LLMs are not social, not alive, and can’t act on their own.
“Social meaning” need not be applied to LLMs unless you’re trying to be pedantic.
GirlScoutCookieGrow t1_jdzvdoc wrote
Reply to comment by MammothJust4541 in [D] Simple Questions Thread by AutoModerator
Google "style transfer", there are a ton of models which do this.
keepthepace t1_jdzvbww wrote
Reply to comment by ObiWanCanShowMe in [D] FOMO on the rapid pace of LLMs by 00001746
> You won't get replaced by AI, you will get replaced by someone who knows how to use the AI.
I wonder why this is supposed to be any comfort. This is just a rephrasing of "your skillset is obsolete; your profession, which used to pay you a salary, is now worth a 15 USD/month subscription service."

The person "who knows how to use AI" is not necessarily a skilled AI specialist. It could simply be your typical client.
The current AI wave should be the trigger to reconsider the place we give to work in our lives. Many kinds of work are being automated, and no, this is not like the previous industrialization waves.

Workers used to be replaced by expensive machines. It took time to install them and prepare the infrastructure for the transition, and it required other workers to do maintenance.
This wave replaces people instantly with an online service that requires zero infrastructure (for the user), costs a fraction of a wage and gives almost instant results.
Yes, progress that suppresses jobs tends to create new jobs as well, but there is no mechanism that guarantees any symmetry between these two quantities. When you think about the AI wave, it is clear that jobs will be removed faster than they are created, and that the skillsets from the removed jobs do not translate well to the hypothetical jobs created.
GirlScoutCookieGrow t1_jdzvasi wrote
Reply to comment by shiuidu in [D] Simple Questions Thread by AutoModerator
OpenAI API? It's not clear exactly what you need.
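If it's the OpenAI API you're after, a minimal sketch (the model name and prompt are placeholders; assumes the `openai` Python package and an API key in your environment):

```python
# Sketch: a single chat-completion call against the OpenAI API.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize this text: ..."}],
)
print(response["choices"][0]["message"]["content"])
```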
GirlScoutCookieGrow t1_jdzv85v wrote
Reply to comment by SnooMarzipans3021 in [D] Simple Questions Thread by AutoModerator
I'm not sure I understand what you hope to accomplish. If you have the full-size image, why do you want to downscale and then upscale? That won't help you fit the full image on the GPU.
thelastpizzaslice t1_jdzv7pu wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I once asked it for a parody of Miss American Pie about Star Wars Episode 1 and it gave me Weird Al's song verbatim.
[deleted] t1_jdzuwoq wrote
Reply to comment by utopiah in [D] FOMO on the rapid pace of LLMs by 00001746
[deleted]
milktoasttraitor t1_jdzuw0z wrote
Reply to comment by rfxap in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
If you look at the prompt they show, they clearly gave it hints which tell it the exact approach to use in order to solve the problem. The problem is also a very slight derivative of another existing, very popular problem on the platform (“Unique Paths”).
This is impressive in another way, but not in the way they were trying to show. They didn't show the other questions it got right, so there's no way of telling how good or bad the methodology was overall, or what hints they gave it. For that question at least, it's not good, and it makes me skeptical of the results.
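For reference, the existing problem in question ("Unique Paths": count the paths from the top-left to the bottom-right of an m x n grid, moving only right or down) is a textbook one-row DP, which is part of why a slight derivative of it proves so little:

```python
# The classic "Unique Paths" dynamic program referenced above.
def unique_paths(m: int, n: int) -> int:
    row = [1] * n                 # one path to each cell in the first row
    for _ in range(1, m):
        for j in range(1, n):
            row[j] += row[j - 1]  # paths from above + paths from the left
    return row[-1]

assert unique_paths(3, 7) == 28   # the example from the original problem
```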
MotionTwelveBeeSix t1_jdzurlg wrote
Reply to comment by master3243 in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
The bar exams recycle the same questions every year; there's very little original about them. It's a test of pure memorization.
CyberDainz t1_jdzu9e4 wrote
looks similar to "Cold Diffusion"
visarga t1_jdzu6az wrote
Reply to comment by spiritus_dei in [D] FOMO on the rapid pace of LLMs by 00001746
Let the critics critique; it's better to have an adversarial take on everything. When you take a survey, you get better calibration that way.

He's angry about the forced Galactica retraction, followed by ChatGPT's success. Both models had hallucination issues, but his model was not tolerated well by the public.
[deleted] t1_jdztsmd wrote
Reply to comment by Seankala in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
[removed]
visarga t1_jdztq3o wrote
Reply to comment by CriticalTemperature1 in [D] FOMO on the rapid pace of LLMs by 00001746
In short, build around LLMs and with LLMs, but don't compete directly with them.
visarga t1_jdzt9gd wrote
Reply to comment by Craksy in [D] FOMO on the rapid pace of LLMs by 00001746
> Generalized 😓
dimem16 t1_jdzsrhh wrote
Reply to comment by rshah4 in [D] FOMO on the rapid pace of LLMs by 00001746
Thanks:)
is_it_fun t1_jdzs7dw wrote
Reply to comment by hadaev in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Biologists are such trash nowadays when it comes to any kind of computational/math methods. Back in our grandfathers' day they were really hardcore.
abc220022 t1_jdzrsbu wrote
Reply to comment by rfxap in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Part of the sales pitch behind LeetCode is that you are working on problems that are used in real coding interviews at tech companies. I believe that most LeetCode problems were invented well before they were published on the LeetCode website, so they still could appear in some form in their training data.
visarga t1_jdzr4tp wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
This paper scared me more than any other ML paper. I had hoped we had 2-3 more years until what they show in there.
happycube t1_jdzq0v4 wrote
Reply to comment by antonivs in [D] FOMO on the rapid pace of LLMs by 00001746
nanoGPT is good for this sort of from-scratch training; there's an updated version of the classic char-RNN Shakespeare model in the repo.
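If anyone wants the flavor of it without cloning the repo, here's a toy char-level setup in PyTorch. This is a bigram baseline, not nanoGPT's transformer, and the hyperparameters and `input.txt` (the tiny Shakespeare file) are placeholders:

```python
# Sketch: train a minimal char-level language model on a text file.
import torch
import torch.nn as nn
import torch.nn.functional as F

text = open("input.txt").read()
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

block_size, batch_size = 64, 32

def get_batch():
    # sample random (input, next-char target) windows from the corpus
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y

class BigramLM(nn.Module):
    """Predict the next character from the current one alone."""
    def __init__(self, vocab_size):
        super().__init__()
        self.table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets):
        logits = self.table(idx)  # [batch, time, vocab]
        return F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))

model = BigramLM(len(chars))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
for step in range(1000):
    x, y = get_batch()
    loss = model(x, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 200 == 0:
        print(step, loss.item())
```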
bjj_starter t1_jdzymdg wrote
Reply to comment by thecodethinker in [D] GPT4 and coding problems by enryu42
>not social
"needing companionship and therefore best suited to living in communities" is a fine descriptor of some of their peculiarities. More importantly, I was referring to how consciousness is socially defined, and it is absolutely the case that it is up to us to determine whether any given AI should be considered conscious. We do not have an even moderately objective test. We as a society should build one and agree to abide by what we find.
>not alive
That's the entire point under discussion. I didn't lead with "they're alive" because I recognise that is the central question we should be trying to address, as a society. I am arguing my point, not just stating it and expecting people to take it on faith, because I respect the people I'm talking to.
>can’t act on their own.
A limitation that can be convincingly solved in approximately an hour using commonly available tools isn't a fundamental limitation. A good LLM with a good LangChain set-up can act on its own, continuously if it's set up to do so. I require a mechanical aid to walk - requiring the aid doesn't make me any lesser. I don't know if an LLM with a good LangChain set-up should be considered conscious or a person - I suspect not, because it's not stable and decays rapidly (by human lifespan standards), as well as still failing several important tests we do have, such as novel Winograd schemas. But our intuition shouldn't be what we're relying on to make these determinations - we need a standardised test for new applicants to personhood. Make it as challenging as you like, as long as at least a significant number of humans can pass it (obviously all humans will be grandfathered in). What's important is that we make it, agree that anything which passes is a person, and then stick to that when something new passes it.