Recent comments in /f/MachineLearning
DiscussionGrouchy322 t1_jdmrq88 wrote
Reply to [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Wow, so many words to say you're applying test-driven design to prompt engineering. I will keep this as an example of how not to write technical content. (I was reading the "blog post".)
Maybe this is a joke post that was also written by ChatGPT.
When you make those charts with the weights and things... are they meant to convey information, or are you just following a template where you saw information presented that way and trying to match the shape?
gamerx88 t1_jdmrlhh wrote
Reply to comment by wojapa in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
No, check their git repo. They used HF transformers' AutoModelForCausalLM in their training script. It's supervised fine-tuning.
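For anyone curious, supervised fine-tuning along those lines typically looks something like the sketch below. This is not their actual script; the base model name and data handling are placeholders:

```python
# Minimal SFT sketch with Hugging Face transformers. Not the repo's
# actual training script; model name and dataset are placeholders.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "EleutherAI/pythia-1b"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize instruction/response text; for causal LM fine-tuning the
# labels are just a copy of the input ids (the model shifts internally).
def tokenize(example):
    tokens = tokenizer(example["text"], truncation=True, max_length=512)
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

# train_dataset is assumed to be a datasets.Dataset of {"text": ...} rows:
# train_dataset = train_dataset.map(tokenize)

args = TrainingArguments(
    output_dir="sft-out",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-5,
)
# Trainer(model=model, args=args, train_dataset=train_dataset).train()
```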
Anis_Mekacher OP t1_jdmrjpc wrote
Reply to comment by These-Assignment-936 in [D] Keeping track of ML advancements by Anis_Mekacher
That's a great idea. Is it something like a weekly or biweekly meeting where you get to explain the main concepts and ideas behind a paper in a short amount of time?
It won't work in my case, because my current job is more or less in the cybersecurity field and not a lot of people in my company are interested in AI or its developments.
gamerx88 t1_jdmr4n2 wrote
Reply to comment by wojtek15 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
My observations are similar to yours, but I think Stanford's claim was that it rivalled text-davinci-003's dialogue or chat capabilities, and only in a single turn setting.
gamerx88 t1_jdmql8y wrote
The answer is probably no. DeepMind's Chinchilla paper shows that many of those 100B+ LLMs are oversized for the amount of data used to pre-train them.
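The Chinchilla finding is easy to sanity-check with a back-of-the-envelope sketch; the ~20 tokens-per-parameter ratio below is the paper's rough rule of thumb, not an exact law:

```python
# Chinchilla rule of thumb: compute-optimal training uses roughly
# 20 tokens per parameter (Hoffmann et al., 2022). Rough check only.
def chinchilla_optimal_tokens(n_params: float) -> float:
    return 20 * n_params

for n_params in [1.3e9, 7e9, 70e9, 175e9]:
    tokens = chinchilla_optimal_tokens(n_params)
    print(f"{n_params / 1e9:>6.1f}B params -> ~{tokens / 1e9:,.0f}B tokens")

# GPT-3 (175B) was trained on ~300B tokens, far short of the ~3.5T
# this heuristic suggests, which is the "oversized" point above.
```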
artsybashev t1_jdmpwwd wrote
Reply to comment by danielbln in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
The fluffy, overly complex writing around the main message has worked as a barrier or prefilter, screening out bad job candidates and unqualified contributions to scientific discussion. LLMs are destroying that part. It will be interesting to see where this leads.
These-Assignment-936 t1_jdmpulh wrote
Reply to [D] Keeping track of ML advancements by Anis_Mekacher
We have a book/paper club going at work where the engineers present a recent publication
gamerx88 t1_jdmpdtf wrote
Reply to comment by ginger_beer_m in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Not if they adopt the technology
phb07jm t1_jdmp7kc wrote
Reply to comment by Short_Change in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
I think this will prove prophetic
Veggies-are-okay t1_jdmopy0 wrote
Does anyone have a good resource/video giving an overview of these implementations? I don’t work much with language models but figure it would be good to understand where things stand; I’m just running into BuzzFeed-esque surface-level nonsense on YouTube.
alexmin93 t1_jdmocbw wrote
Reply to [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
The problem is that LLMs aren't capable of making decisions. While GPT-4 can chat almost like a sentient being, it's not sentient at all. It's not able to comprehend the limitations of its own knowledge and capabilities. It's extremely hard to make it call an API to ask for more context, so there's no way it will be good at using a computer like a user. It can predict what happens if you do something, but it won't take the action itself. It's mostly a dataset limitation: language models are relatively easy to train because there's an almost infinite amount of text on the Internet, but are there any condition-action datasets? You'd need to observe human behavior for millennia (or install tracker software on thousands of workstations and observe users' behavior for years).
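That said, the usual workaround is to wrap the model in an explicit condition-action loop rather than expecting it to act on its own. A toy sketch of the idea; query_llm and the tool set here are hypothetical stand-ins, not any real API:

```python
# Toy condition-action loop around an LLM. query_llm() is a hypothetical
# stand-in for whatever completion API you use; the "tools" are stubs.
import json

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your completion API here")

TOOLS = {
    "read_screen": lambda args: "window title: Inbox (3 unread)",  # stub
    "click": lambda args: f"clicked at {args['x']},{args['y']}",   # stub
}

SYSTEM = (
    "You control a computer. Reply with JSON: "
    '{"tool": <name>, "args": {...}} or {"answer": <text>}.'
)

def run(task: str, max_steps: int = 5) -> str:
    transcript = f"{SYSTEM}\nTask: {task}\n"
    for _ in range(max_steps):
        reply = json.loads(query_llm(transcript))
        if "answer" in reply:
            return reply["answer"]
        # Execute the requested tool and feed the observation back in,
        # so the model "acts" one validated step at a time.
        observation = TOOLS[reply["tool"]](reply["args"])
        transcript += f"{json.dumps(reply)}\nObservation: {observation}\n"
    return "gave up"
```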
gamerx88 t1_jdmndip wrote
Food for thought: is this really surprising, considering that the InstructGPT paper in early 2022 already showed how even a 1.3B model after RLHF could beat a much larger 175B model?
I guess what this shows is that it's the data that matters rather than SFT vs RLHF. I'm wondering if any ablation studies have been done here.
Freedirt1337 t1_jdmn2se wrote
Reply to [D] Do you use a website or program to organise and annotate your papers? by who_here_condemns_me
Obsidian with the new canvas feature is awesome. ResearchRabbit and Elicit have been useful too
machineko t1_jdmm43b wrote
Reply to comment by SWESWESWEh in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Thanks for the comment. Are you looking to run on M2 or smaller edge devices?
alrunan t1_jdmm3lw wrote
Reply to comment by harharveryfunny in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
The 7B model is trained on 1T tokens, far beyond the ~140B the Chinchilla rule of thumb would call compute-optimal for 7B parameters, and it performs really well for its size.
sweatierorc t1_jdmkacg wrote
Reply to comment by SmLnine in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Sure, humans under 40 are also very resistant to cancer. My point was that cancer comes with old age, and aging seems to be a way for us to die before cancer or dementia kills us. There is "weak" evidence that people who have dementia are less likely to get cancer. I understand that some mammals like whales or elephants seem to be very resistant to cancer, but if we were to double or triple their average life expectancy, other diseases might become more prevalent, maybe even cancer.
[deleted] t1_jdmjvww wrote
Reply to comment by machineko in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
[removed]
MarmonRzohr t1_jdmj8th wrote
Reply to comment by SmLnine in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
>There are complex mammals that effectively don't get cancer
You got a source for that?
That's not true at all according to everything I know, but maybe what I know is outdated.
AFAIK there are only mammals that seem to develop cancer much less than they should, namely large mammals like whales. Other than that, every animal from Cnidaria upward develops tumors; e.g. even the famously immortal Hydras develop tumors over time.
That's what makes cancer so tricky. There is a good chance that far, far back in evolution there was a selection between longevity and rate of change, or something else. There may therefore be nothing we can do to prevent cancer; we can only hope for suppression or cures when/if it happens.
Again, this may be outdated.
sweatierorc t1_jdmilbm wrote
Reply to comment by Art10001 in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Do we know that? E.g. with quantum computing, we know it won't really revolutionize our lives despite the fact that it can solve a new class of problems.
lego3410 t1_jdmi0hv wrote
Reply to comment by learn-deeply in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Yes! But GPT-4 could summarize it for me.
RiotSia t1_jdmhn6h wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hey,
I got the 7B LLaMA model running on my machine. Now I want it to analyze a large text (a PDF file) for me, like hamata.ai does. How can I do that? Does anyone have a site with resources where I can learn how, or can you just tell me?
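One common approach is to extract the PDF text, chunk it, and feed each chunk to the model. A minimal sketch assuming llama-cpp-python and pypdf; the model path and file name are placeholders:

```python
# Minimal sketch: summarize a PDF chunk-by-chunk with a local 7B model.
# Assumes `pip install llama-cpp-python pypdf`; paths are placeholders.
from llama_cpp import Llama
from pypdf import PdfReader

llm = Llama(model_path="models/llama-7b.gguf", n_ctx=2048)

reader = PdfReader("document.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

# Chunk crudely by characters so each prompt fits in the context window.
chunk_size = 4000
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

summaries = []
for chunk in chunks:
    out = llm(f"Summarize the following text:\n\n{chunk}\n\nSummary:",
              max_tokens=256)
    summaries.append(out["choices"][0]["text"].strip())

print("\n\n".join(summaries))
```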
DarkTarantino t1_jdmgpqq wrote
Reply to comment by ZestyData in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
🤣🤣 shiiit at this point you never know
These-Assignment-936 t1_jdmrqcp wrote
Reply to comment by Anis_Mekacher in [D] Keeping track of ML advancements by Anis_Mekacher
Yes. Weekly. But we have a huge team.
There are always people interested. Maybe not always AI engineers.
You could do a series where people summarize accessible papers. A lot of literature review papers are readable even by a layperson