Recent comments in /f/MachineLearning
michaelthwan_ai OP t1_jdlzwyi wrote
Reply to comment by wywywywy in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Sure, I think it's clearer to show only the parents of the recent models (instead of their great-great-grandparents).
If people want, I may consider making a full one (including older models).
michaelthwan_ai OP t1_jdlztvv wrote
Reply to comment by addandsubtract in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
It is a good model, but it's from about a year ago and isn't related to the recently released LLMs, so I didn't add it (otherwise there'd be tons of good models to include).
As for Dolly, it was only released yesterday; I don't have full info on it yet.
michaelthwan_ai OP t1_jdlzq6i wrote
Reply to comment by Historical-Tree9132 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
I considered it, haha, but I have no evidence.
[deleted] t1_jdlzp51 wrote
Reply to comment by Puzzleheaded_Acadia1 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
[deleted]
maizeq t1_jdlzhql wrote
Reply to comment by michaelthwan_ai in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
It would be useful to distinguish between SFT and RLHF-tuned models.
wywywywy t1_jdlz40b wrote
Reply to comment by addandsubtract in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
GPT-J & GPT-Neo are predecessors of GPT-NeoX 20b
MINECRAFT_BIOLOGIST t1_jdlz2nr wrote
Reply to comment by MarmonRzohr in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Hmm, perhaps I was being a bit hyperbolic, but check this out (from 2021):
https://www.science.org/content/article/mouse-embryos-grown-bottles-form-organs-and-limbs
hadaev t1_jdlym7s wrote
Reply to comment by Crystal-Ammunition in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Idk, the internet is big.
MarmonRzohr t1_jdlyfub wrote
Reply to comment by MINECRAFT_BIOLOGIST in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
>artificial wombs are basically done or very close
Bruh... put down the hopium pipe. There's a bit more work to be done there - especially if you mean "artificial womb" as in from conception to term, rather than a device intended for prematurely born babies.
The latter is what was demonstrated with the lamb.
DarkTarantino t1_jdly8y7 wrote
If I wanted to create graphs like these for work, what would that role be called?
tdgros t1_jdlxy8a wrote
Reply to comment by shanereid1 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
There are versions for NLP (and a special one for vision transformers). Here is the BERT one, from some of the same authors (Frankle & Carbin): https://proceedings.neurips.cc/paper/2020/file/b6af2c9703f203a2794be03d443af2e3-Paper.pdf
It is still costly, as it involves rewinding and finding masks; we'd probably need to switch to dedicated sparse computation to fully benefit from it.
SmLnine t1_jdlxego wrote
Reply to comment by sweatierorc in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
There are complex mammals that effectively don't get cancer, and there are less complex animals and organisms that effectively don't age. So I'm curious what your opinion is based on.
Puzzleheaded_Acadia1 t1_jdlx6ea wrote
Reply to comment by addandsubtract in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Is GPT-J 6B really better than Alpaca 7B, and which runs faster?
Puzzleheaded_Acadia1 t1_jdlx1g3 wrote
Reply to comment by gopher9 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
What is RWKV?
SeymourBits t1_jdlwrgi wrote
Reply to comment by itsnotlupus in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
This is the most accurate comment I've come across. The entire system is only as good and granular as the CLIP text description that's passed into GPT-4, which then has to "imagine" the described image, often with varying degrees of hallucination. I've used it and can confirm it's not possible to operate anything close to a GUI with the current approach.
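To make the bottleneck concrete, the approach being described is roughly the sketch below; `describe_image` and `ask_llm` are hypothetical stand-ins for the captioning model and the GPT-4 call, not real APIs:

```python
# Rough shape of the caption-then-reason pipeline described above.
# describe_image() and ask_llm() are hypothetical placeholders, not real APIs.

def describe_image(screenshot_path: str) -> str:
    """Stand-in for a CLIP-style captioner that turns a screenshot into text."""
    raise NotImplementedError

def ask_llm(prompt: str) -> str:
    """Stand-in for a call to GPT-4 (or any text-only LLM)."""
    raise NotImplementedError

def answer_about_screen(screenshot_path: str, question: str) -> str:
    caption = describe_image(screenshot_path)  # e.g. "a settings window with three tabs"
    # Everything the LLM "sees" is this caption: any button, coordinate, or label
    # not mentioned in it is invisible to the model, hence the GUI limitation.
    prompt = f"Screen description: {caption}\nQuestion: {question}\nAnswer:"
    return ask_llm(prompt)
```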
SpiritualTwo5256 t1_jdlwq53 wrote
How many specialized neurons are there in a human brain?
[deleted] t1_jdlwka7 wrote
Reply to comment by Crystal-Ammunition in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
[removed]
SmLnine t1_jdlwhtu wrote
Reply to comment by nonotan in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
>but unless you take the philosophical stance that "if we just made AGI they'd be able to solve every problem we have, so everything is effectively an ML problem", it doesn't seem like it'd be fair to say the bottlenecks to solving either of those are even related to ML in the first place. It's essentially all a matter of bioengineering coming up with the tools required.
We're currently using our brains (a general problem solver) to build bioengineering tools that can cheaply and easily edit the DNA of a living organism. 30 years ago this would have sounded like magic. But there's no magic here. This potential tool has always existed; we just didn't understand it.
It's possible that there are other tools on the table that we simply don't understand yet. Maybe what we've been doing for the last 60 years is the bioengineering equivalent of bashing rocks together. Or maybe it's close to optimal. We don't know, and we can't know until we aim an intellectual superpower at it.
Puzzleheaded_Acadia1 t1_jdlw72w wrote
Reply to [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Can someone explain to me what this paper is about?
addandsubtract t1_jdlvmm6 wrote
Where do GPT-J and Dolly fit into this?
maskedpaki t1_jdlu3k1 wrote
Reply to comment by nekize in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Well, at least you can use GPT-4 for padding now.
[deleted] t1_jdltev5 wrote
Reply to comment by Vegetable-Skill-9700 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
[removed]
shanereid1 t1_jdlt38a wrote
Have you read about the lottery ticket hypothesis? It was a paper from a few years ago which showed that within a fully connected neural network there exists a smaller subnetwork that can perform equally well, even when the subnetwork is as small as a few percent of the size of the original network. AFAIK they only proved this for MLPs and CNNs. It's almost certain that the power of these LLMs can be distilled in some fashion without significantly degrading performance.
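For anyone curious, here's a minimal sketch of how a "winning ticket" is typically found (one-shot magnitude pruning with weight rewinding); the toy MLP, random data, and 90% sparsity are just illustrative assumptions, not the paper's exact setup:

```python
# Sketch of lottery-ticket-style pruning: train, prune by weight magnitude,
# rewind the surviving weights to their initial values, then retrain.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(100, 300), nn.ReLU(), nn.Linear(300, 10))
init_state = copy.deepcopy(model.state_dict())   # save the initialization for rewinding

x, y = torch.randn(512, 100), torch.randint(0, 10, (512,))
loss_fn = nn.CrossEntropyLoss()

def train(net, masks=None, steps=200):
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(net(x), y).backward()
        opt.step()
        if masks is not None:                    # keep pruned weights pinned at zero
            with torch.no_grad():
                for name, p in net.named_parameters():
                    if name in masks:
                        p.mul_(masks[name])

# 1) Train the dense network.
train(model)

# 2) Build masks that keep only the largest 10% of weights in each weight matrix.
masks = {}
for name, p in model.named_parameters():
    if p.dim() > 1:                              # prune weight matrices, not biases
        k = int(0.9 * p.numel())
        threshold = p.abs().flatten().kthvalue(k).values
        masks[name] = (p.abs() > threshold).float()

# 3) Rewind the surviving weights to their original initialization and retrain.
model.load_state_dict(init_state)
with torch.no_grad():
    for name, p in model.named_parameters():
        if name in masks:
            p.mul_(masks[name])
train(model, masks)                              # the sparse "winning ticket"
```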
light24bulbs t1_jdm04sx wrote
Reply to [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Are those all of them? Surely there are a bunch more notable open-source ones?