Recent comments in /f/MachineLearning
MassiveIndependence8 t1_jdl9s3u wrote
Reply to comment by Single_Blueberry in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
The problem is that it can’t do math or spatial reasoning that well.
MassiveIndependence8 t1_jdl9oq9 wrote
Reply to comment by ThirdMover in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
You’re actually suggesting putting every single frame into GPT-4? It’ll cost you a fortune after 5 seconds of running it. Plus the latency is super high; it might take you an hour to process 5 seconds’ worth of images.
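A rough back-of-envelope (every per-frame number here is an assumption, since per-image pricing and latency for GPT-4 image input haven't been published):

```python
# Back-of-envelope: cost and latency of pushing raw screen frames
# through a GPT-4-class multimodal model. All constants are assumptions,
# not published figures.
FPS = 30                     # typical screen-capture frame rate
COST_PER_FRAME_USD = 0.01    # assumed per-image API cost
LATENCY_PER_FRAME_S = 5.0    # assumed per-image round trip, processed serially

seconds_of_video = 5
frames = FPS * seconds_of_video

print(f"frames sent:     {frames}")                                    # 150
print(f"estimated cost:  ${frames * COST_PER_FRAME_USD:.2f}")          # $1.50
print(f"processing time: {frames * LATENCY_PER_FRAME_S / 60:.1f} min") # 12.5 min
# Even with generous assumptions, 5 seconds of video costs dollars and
# takes minutes, and it scales linearly from there.
```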
Vegetable-Skill-9700 OP t1_jdl8onp wrote
Reply to comment by Sorry-Balance2049 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Agreed! I don't expect it to be as good as GPT-4 on all tasks, but maybe fine-tuning on a specific task can help it achieve similar performance on test samples related to that task. wdyt?
ttkciar t1_jdl8i7w wrote
LLaMA-7B output is abysmal. We might need less than 100B parameters, but not too much less.
Vegetable-Skill-9700 OP t1_jdl8hh5 wrote
Reply to comment by Blacky372 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Agreed, it won't generalize as well as GPT-4, but it could achieve similar performance on a specialized task (say, answering technical questions about a certain topic, or writing social media posts for a certain entity).
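For a concrete sense of what that specialization could look like, here's a minimal LoRA fine-tuning sketch with HuggingFace transformers + peft (the dataset file and all hyperparameters are made up for illustration, not a tested recipe):

```python
# Sketch: LoRA fine-tuning of GPT-J-6B on a narrow instruction dataset.
# "domain_instructions.json" and the hyperparameters are illustrative.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer.pad_token = tokenizer.eos_token  # GPT-J has no pad token

# Train small low-rank adapters instead of all ~6B weights.
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # GPT-J attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hypothetical file of {"text": "..."} examples for the narrow task.
dataset = load_dataset("json", data_files="domain_instructions.json")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gptj-domain-lora", fp16=True,
                           per_device_train_batch_size=1, num_train_epochs=3),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```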
Sorry-Balance2049 t1_jdl7yn7 wrote
The Databricks blog post doesn't really show much eval of the model, only choice examples. It's more of a "hey, we did this!" blog post.
blueSGL t1_jdl756z wrote
Reply to comment by Blacky372 in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
> with specialized expert data from literally 50 experts in various fields that worked on the response quality in their domain.
Sounds like a future goal for Open Assistant.
If one were being unethical... create a bot to post Open Assistant's current answers to technical questions in small specialist subreddits and wait for Cunningham's Law to come into effect. (I'm only half joking)
Vegetable-Skill-9700 OP t1_jdl680d wrote
Reply to comment by soggy_mattress in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
That's an interesting analogy!
Blacky372 t1_jdl62vl wrote
GPT-J-6B with instruction finetuning will surely never be better than GPT-4. With RLHF you may reach similar response quality in some contexts for some types of instruction, but you will never match the vast amounts of proprietary data that ClosedAI fed into a probably 250B+ parameter model, with specialized expert data from literally 50 experts in various fields who worked on response quality in their domains. This cannot be surpassed easily, unfortunately. But maybe future open-source models will reach similar capabilities with advanced training techniques. I would definitely hope so.
WarAndGeese t1_jdl5t0z wrote
Reply to comment by mxby7e in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Boo hoo to OpenAI; people should do it anyway. Are the terms of service the only reason not to do it, or are there actual material barriers? If it's a problem of money, then as long as people know how much, it can be crowdfunded. If it's a matter of people power, then there are already large volunteer networks. Or is it just something that isn't practical or feasible?
WarAndGeese t1_jdl5aq6 wrote
Reply to comment by kromem in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
That would be pretty nuts and pretty cool. It's still a weird concept, but if it becomes like an operating system that you update, that would be a thing.
soggy_mattress t1_jdl4zkg wrote
I think of the 100B-parameter models as analogous to the first room-sized computers built in the 1940s. Seems the pattern is to first prove a concept, no matter how inefficiently, and then optimize it as much as possible.
cyborgsnowflake t1_jdl47n8 wrote
Reply to comment by Necessary-Meringue-1 in [D] "Sparks of Artificial General Intelligence: Early experiments with GPT-4" contained unredacted comments by QQII
I think the simpler answer is that it's easier than some people believed to reproduce certain knowledge tasks statistically, rather than the alternative theory, which everyone else in this thread seems to be jumping on board with, that shuffling tensors creates living, thinking beings.
Educational_Ice151 t1_jdl47lq wrote
Hello Dolly. This looks pretty interesting. I have been playing with creating cross-model feedback loops that iterate for several cycles, using few-shot prompts and chain-of-thought models. This would work really well for my concept; a rough sketch is below. I'll likely publish my code in a day or two.
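Something in this spirit (a sketch with placeholder prompts and model choices, using the pre-1.0 openai client; not the final code):

```python
# Sketch of a cross-model feedback loop: one model drafts with
# chain-of-thought, a second model critiques, and the draft is revised
# over a few cycles. Prompts and model names are placeholders.
import openai  # pre-1.0 client; assumes OPENAI_API_KEY is set

def chat(model, prompt):
    resp = openai.ChatCompletion.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def feedback_loop(task, cycles=3):
    draft = chat("gpt-4", f"Think step by step, then answer:\n{task}")
    for _ in range(cycles):
        critique = chat("gpt-3.5-turbo",
                        f"Critique this answer to the task '{task}':\n{draft}")
        draft = chat("gpt-4",
                     f"Task: {task}\nDraft:\n{draft}\nCritique:\n{critique}\n"
                     "Rewrite the draft to address the critique.")
    return draft

print(feedback_loop("Explain overfitting in two sentences."))
```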
Shared to r/aipromptprogramming
dreamingleo12 t1_jdl3qgp wrote
It’s just a shameless copy of Stanford’s work. The innovative thing about Stanford Alpaca is that it makes a ChatGPT-style assistant out of a language model, Meta’s LLaMA, at low cost. Databricks just followed Stanford’s approach with a different base model and claims it’s a big innovation. Alpaca can actually be fine-tuned on the same dataset in 3 hours and performs better than Databricks’ model.
A1-Delta t1_jdl325g wrote
Reply to comment by wojapa in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
GPT-J-6B fine-tuned on Alpaca’s instruction dataset.
Vegetable-Skill-9700 OP t1_jdl2fbp wrote
Reply to comment by wojapa in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
I think it's just supervised training. Similar to Alpaca, I guess.
throwaway957280 t1_jdl2cq3 wrote
Reply to comment by RealSonZoo in [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
If you pay for ChatGPT plus and manually select the new model, yes. By default, no.
wojapa t1_jdl23pj wrote
Did they use RLHF?
learn-deeply t1_jdl1bmp wrote
Reply to [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
Anyone else tired of papers that obscure a simple concept with endless paragraphs of verbose gibberish? These 17 pages could be a few sentences.
TL;DR: the authors wrote prompts telling GPT-4 to fix code given some unit tests and the output of the broken code. It performs better than GPT-4 without access to the output of the code execution.
https://github.com/noahshinn024/reflexion-human-eval/blob/main/reflexion.py#L7-L12
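For anyone who wants the idea without the 17 pages, a minimal sketch of that loop (the prompt wording and the `llm` callable are stand-ins; the linked reflexion.py has the authors' actual prompts):

```python
# Sketch of the core loop the comment describes: run candidate code
# against unit tests, and on failure feed the error output back to the
# model and ask for a fix.
import subprocess, tempfile

def run_tests(code, tests):
    """Execute code + tests in a subprocess; return (passed, output)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + tests)
        path = f.name
    proc = subprocess.run(["python", path], capture_output=True,
                          text=True, timeout=30)
    return proc.returncode == 0, proc.stdout + proc.stderr

def reflexion_loop(code, tests, llm, max_iters=5):
    """llm is any callable mapping a prompt string to generated code."""
    for _ in range(max_iters):
        passed, output = run_tests(code, tests)
        if passed:
            return code
        # The key trick: the model sees its own failing output.
        code = llm("This code fails its unit tests.\n"
                   f"Code:\n{code}\nTests:\n{tests}\nOutput:\n{output}\n"
                   "Reflect on the failure, then return only fixed code.")
    return code
```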
mxby7e t1_jdl18t6 wrote
Reply to comment by throwaway2676 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Maybe. Open Assistant (the LAION-led project) is doing this type of manual dataset collection. The training data and the model weights are supposed to be released once training is complete.
throwaway2676 t1_jdl0y80 wrote
Reply to comment by mxby7e in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Alpaca was only trained on ~52k instructions, right? A large group of grad students or a forum like Reddit could construct that many manually in a couple of weeks (quick check below). I'm surprised they even had to resort to using ClosedAI.
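A quick feasibility check (the headcount and daily pace are assumptions):

```python
# Feasibility check for hand-writing an Alpaca-scale instruction set.
instructions = 52_000       # size of Alpaca's self-instruct dataset
contributors = 500          # assumed pool of grad students / redditors
per_person_per_day = 15     # assumed handwritten pairs per person per day
days = instructions / (contributors * per_person_per_day)
print(f"{days:.1f} days")   # ~6.9 days, so "a couple weeks" is plausible
```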
blueSGL t1_jdl02u6 wrote
Reply to comment by liyanjia92 in [P] ChatGPT with GPT-2: A minimum example of aligning language models with RLHF similar to ChatGPT by liyanjia92
>So with GPT-2 medium, what we really do here is to parent a dumb kid, instead of a "supernaturally precocious child" like GPT-3. What interested me is that RLHF does actually help to parent this dumb kid to be more socially acceptable.
> In other words, if we discover the power of alignment and RLHF earlier, we might foresee the ChatGPT moment much earlier when GPT-2 is out in 2019.
That just reads to me as capability overhang. If there is "one simple trick" to make the model "behave", what's to say this is the only one? (Or that the capabilities derived from the current behavior modification are the best they can be.) Scary thought.
MassiveIndependence8 t1_jdla8px wrote
Reply to comment by H0lzm1ch3l in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Not all APIs are public, and LLMs aren’t fine-tuned to process APIs.