Recent comments in /f/MachineLearning
michaelthwan_ai OP t1_jdlzwyi wrote
Reply to comment by wywywywy in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Sure, I think it's clearer to show only the parents of the recent models (instead of their great-great-grandparents).
If people want, I may consider making a full one (including older models).
michaelthwan_ai OP t1_jdlztvv wrote
Reply to comment by addandsubtract in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
It is a good model, but it's from about a year ago and isn't related to the recently released LLMs, so I didn't add it (otherwise there'd be tons of good models to include).
As for Dolly, it was only released yesterday; I don't have full info on it yet.
michaelthwan_ai OP t1_jdlzq6i wrote
Reply to comment by Historical-Tree9132 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
I considered it, haha, but I have no evidence.
[deleted] t1_jdlzp51 wrote
Reply to comment by Puzzleheaded_Acadia1 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
[deleted]
maizeq t1_jdlzhql wrote
Reply to comment by michaelthwan_ai in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
It would be useful to distinguish between SFT and RLHF-tuned models.
wywywywy t1_jdlz40b wrote
Reply to comment by addandsubtract in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
GPT-J & GPT-Neo are predecessors of GPT-NeoX 20b
MINECRAFT_BIOLOGIST t1_jdlz2nr wrote
Reply to comment by MarmonRzohr in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Hmm, perhaps I was being a bit hyperbolic, but check this out (from 2021):
https://www.science.org/content/article/mouse-embryos-grown-bottles-form-organs-and-limbs
hadaev t1_jdlym7s wrote
Reply to comment by Crystal-Ammunition in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Idk, the internet is big.
MarmonRzohr t1_jdlyfub wrote
Reply to comment by MINECRAFT_BIOLOGIST in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
>artificial wombs are basically done or very close
Bruh... put down the hopium pipe. There's a bit more work to be done there - especially if you mean "artificial womb" as in from conception to term, rather than a device intended for prematurely born babies.
The latter is what was demonstrated with the lamb.
DarkTarantino t1_jdly8y7 wrote
If I wanted to create graphs like these for work, what would that role be called?
tdgros t1_jdlxy8a wrote
Reply to comment by shanereid1 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
There are versions for NLP (and a special one for vision transformers). Here is the BERT one, from some of the same authors (Frankle & Carbin): https://proceedings.neurips.cc/paper/2020/file/b6af2c9703f203a2794be03d443af2e3-Paper.pdf
It is still costly, as it involves rewinding and finding masks; we'd probably need to switch to dedicated sparse computation to fully benefit from it.
SmLnine t1_jdlxego wrote
Reply to comment by sweatierorc in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
There are complex mammals that effectively don't get cancer, and there are less complex animals and organisms that effectively don't age. So I'm curious what your opinion is based on.
Puzzleheaded_Acadia1 t1_jdlx6ea wrote
Reply to comment by addandsubtract in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Is GPT-J 6B really better than Alpaca 7B, and which runs faster?
Puzzleheaded_Acadia1 t1_jdlx1g3 wrote
Reply to comment by gopher9 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
What is RWKV?
SeymourBits t1_jdlwrgi wrote
Reply to comment by itsnotlupus in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
This is the most accurate comment I've come across. The entire system is only as good and granular as the CLIP text description that's passed into GPT-4, which then has to "imagine" the described image, often with varying degrees of hallucination. I've used it and can confirm it's not possible to operate anything close to a GUI with the current approach.
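To make the bottleneck concrete, the approach being described is roughly the sketch below; `describe_image` and `ask_llm` are hypothetical stand-ins for the captioning model and the GPT-4 call, not real APIs:

```python
# Rough shape of the caption-then-reason pipeline described above.
# describe_image() and ask_llm() are hypothetical placeholders, not real APIs.

def describe_image(screenshot_path: str) -> str:
    """Stand-in for a CLIP-style captioner that turns a screenshot into text."""
    raise NotImplementedError

def ask_llm(prompt: str) -> str:
    """Stand-in for a call to GPT-4 (or any text-only LLM)."""
    raise NotImplementedError

def answer_about_screen(screenshot_path: str, question: str) -> str:
    caption = describe_image(screenshot_path)  # e.g. "a settings window with three tabs"
    # Everything the LLM "sees" is this caption: any button, coordinate, or label
    # not mentioned in it is invisible to the model, hence the GUI limitation.
    prompt = f"Screen description: {caption}\nQuestion: {question}\nAnswer:"
    return ask_llm(prompt)
```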
SpiritualTwo5256 t1_jdlwq53 wrote
How many specialized neurons are there in a human brain?
[deleted] t1_jdlwka7 wrote
Reply to comment by Crystal-Ammunition in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
[removed]
SmLnine t1_jdlwhtu wrote
Reply to comment by nonotan in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
>but unless you take the philosophical stance that "if we just made AGI they'd be able to solve every problem we have, so everything is effectively an ML problem", it doesn't seem like it'd be fair to say the bottlenecks to solving either of those are even related to ML in the first place. It's essentially all a matter of bioengineering coming up with the tools required.
We're currently using our brains (a general problem solver) to build bioengineering tools that can cheaply and easily edit the DNA of a living organism. 30 years ago this would have sounded like magic. But there's no magic here. This potential tool has always existed; we just didn't understand it.
It's possible that there are other tools on the table that we simply don't understand yet. Maybe what we've been doing for the last 60 years is the bioengineering equivalent of bashing rocks together. Or maybe it's close to optimal. We don't know, and we can't know until we aim an intellectual superpower at it.
Puzzleheaded_Acadia1 t1_jdlw72w wrote
Reply to [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Can someone explain to me what this paper is about?
addandsubtract t1_jdlvmm6 wrote
Where do GPT-J and Dolly fit into this?
maskedpaki t1_jdlu3k1 wrote
Reply to comment by nekize in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Well, at least you can use GPT-4 for padding now.
[deleted] t1_jdltev5 wrote
Reply to comment by Vegetable-Skill-9700 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
[removed]
shanereid1 t1_jdlt38a wrote
Have you read about the lottery ticket hypothesis? It was a paper from a few years ago which showed that within a fully connected neural network there exists a smaller subnetwork that can perform equally well, even when the subnetwork is as small as a few percent of the size of the original network. AFAIK they only proved this for MLPs and CNNs. It's almost certain that the power of these LLMs can be distilled in some fashion without significantly degrading performance.
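For anyone curious, here's a minimal sketch of how a "winning ticket" is typically found (one-shot magnitude pruning with weight rewinding); the toy MLP, random data, and 90% sparsity are just illustrative assumptions, not the paper's exact setup:

```python
# Sketch of lottery-ticket-style pruning: train, prune by weight magnitude,
# rewind the surviving weights to their initial values, then retrain.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(100, 300), nn.ReLU(), nn.Linear(300, 10))
init_state = copy.deepcopy(model.state_dict())   # save the initialization for rewinding

x, y = torch.randn(512, 100), torch.randint(0, 10, (512,))
loss_fn = nn.CrossEntropyLoss()

def train(net, masks=None, steps=200):
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(net(x), y).backward()
        opt.step()
        if masks is not None:                    # keep pruned weights pinned at zero
            with torch.no_grad():
                for name, p in net.named_parameters():
                    if name in masks:
                        p.mul_(masks[name])

# 1) Train the dense network.
train(model)

# 2) Build masks that keep only the largest 10% of weights in each weight matrix.
masks = {}
for name, p in model.named_parameters():
    if p.dim() > 1:                              # prune weight matrices, not biases
        k = int(0.9 * p.numel())
        threshold = p.abs().flatten().kthvalue(k).values
        masks[name] = (p.abs() > threshold).float()

# 3) Rewind the surviving weights to their original initialization and retrain.
model.load_state_dict(init_state)
with torch.no_grad():
    for name, p in model.named_parameters():
        if name in masks:
            p.mul_(masks[name])
train(model, masks)                              # the sparse "winning ticket"
```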
light24bulbs t1_jdm04sx wrote
Reply to [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Are those all of them? Surely there are a bunch more notable open-source ones?