to4life4 OP t1_jdin0kc wrote on March 24, 2023 at 5:37 PM

Reply to comment by Llukas88 in [D] What is the best open source chatbot AI to do transfer learning on? by to4life4

Ah ok cool gotcha. Any benchmarks on Bloom performance vs Alpaca and others?

[deleted] t1_jdimvgm wrote on March 24, 2023 at 5:36 PM

Reply to comment by ginger_beer_m in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

[deleted]

[deleted] t1_jdimp7a wrote on March 24, 2023 at 5:35 PM

Reply to comment by dankaiv in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

[deleted]

Llukas88 t1_jdimn2w wrote on March 24, 2023 at 5:34 PM

Reply to comment by to4life4 in [D] What is the best open source chatbot AI to do transfer learning on? by to4life4

The Alpaca model based on LLaMa isnt. The dataset, which is also called Alpaca is. If you train Bloom, which uses a permissive license, on this dataset, the Bloom license is applied to your finetuned model and you should be able to use it commercially.

TFenrir t1_jdim3vv wrote on March 24, 2023 at 5:31 PM

Reply to comment by light24bulbs in [N] ChatGPT plugins by Singularian2501

That is a really good tip.

I'm using langchainjs (I can do python, but my js background is 10x python) - one of the things I want to play with more is getting consistent json output from a response - there is a helper tool I tried with a bud a while back when we were pairing... Typescript validator or something or other, that seemed to help.

Any tips with that?

to4life4 OP t1_jdim3ip wrote on March 24, 2023 at 5:31 PM

Reply to comment by Llukas88 in [D] What is the best open source chatbot AI to do transfer learning on? by to4life4

Someone said that Alpaca isn't open source though?

stephbu t1_jdilpb1 wrote on March 24, 2023 at 5:29 PM

Reply to comment by nixed9 in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

Amara's Law...

Llukas88 t1_jdilkns wrote on March 24, 2023 at 5:28 PM

Reply to [D] What is the best open source chatbot AI to do transfer learning on? by to4life4

There are Alpaca finetuned versions of Bloom or BloomZ on huggingface, maybe try those. Another option would be the Chat version of GPTNeoX from OpenChatKit. Both should be Open Source and free to use.

BullockHouse t1_jdil2ok wrote on March 24, 2023 at 5:25 PM

Reply to comment by wyrdwulf in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

I'm familiar! I'm curious though if it can generalize well enough to play semi-competently without specialized training. Has implications for multi-modal models and robotics.

wind_dude t1_jdikwc4 wrote on March 24, 2023 at 5:24 PM

Reply to [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

I'm also curious about this, I reached out for developer access to try and test this on web screenshots for information extraction.

[deleted] t1_jdikmpb wrote on March 24, 2023 at 5:22 PM

Reply to comment by CptTombstone in [D] "Sparks of Artificial General Intelligence: Early experiments with GPT-4" contained unredacted comments by QQII

[removed]

wyrdwulf t1_jdikhuo wrote on March 24, 2023 at 5:21 PM

Reply to comment by BullockHouse in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

They had another model do that already.

OpenAI: We trained a neural network to play Minecraft by Video PreTraining (VPT) on a massive unlabeled video dataset of human Minecraft play

light24bulbs t1_jdijmr3 wrote on March 24, 2023 at 5:16 PM

Reply to comment by TFenrir in [N] ChatGPT plugins by Singularian2501

My strategy was to have the outer LLM make a JSON object where one of the args is an instruction or question, and then pass that to the inner LLM wrapped in a template like "given the following document, <instruction>"

Works for a fair few general cases and it can get the context that ends up in the outer LLM down to a few sentences aka few tokens, meaning there's plenty of room for more reasoning and cost savings

inglandation t1_jdijeu5 wrote on March 24, 2023 at 5:14 PM

Reply to comment by omgpop in [D] "Sparks of Artificial General Intelligence: Early experiments with GPT-4" contained unredacted comments by QQII

> One way to get really good at approximating what a human would likely write given certain information would be to actually approximate human cognitive structures internally.

Yes, I hope that we'll be able to figure out what those structures are, in LLMs and in humans. It could also help us figure out how to align those models better if we can create more precise comparisons.

[deleted] t1_jdij6go wrote on March 24, 2023 at 5:13 PM

Reply to [D] Simple Questions Thread by AutoModerator

[removed]

inglandation t1_jdij4o8 wrote on March 24, 2023 at 5:12 PM

Reply to comment by Username2upTo20chars in [D] "Sparks of Artificial General Intelligence: Early experiments with GPT-4" contained unredacted comments by QQII

> why should the next generation be fundamentally different?

Emergent abilities from scale are the reason. There are many examples of that in nature and many fields of study. The patterns of snowflakes cannot easily be explained by the fundamental properties of water. You need enough water molecules in the right conditions to create the patterns of snowflakes. I suspect that a similar phenomenon is happening with LLMs, but we haven't figured out yet what the patterns are and what are the right conditions for them to materialize.

reditum t1_jdij2tu wrote on March 24, 2023 at 5:12 PM

Reply to comment by byteuser in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

I honestly don’t know. I also think their approach wasn’t great either. Maybe (hopefully) they ditched it for something better.

yashdes t1_jdij1tl wrote on March 24, 2023 at 5:12 PM

Reply to comment by loopuleasa in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

these models are very sparse, meaning very few of the actual calculations actually effect the output. My guess is trimming the model is how they got gpt3.5-turbo and I wouldn't be surprised if gpt4-turbo is coming.

passerby251 t1_jdiixqu wrote on March 24, 2023 at 5:11 PM

Reply to comment by [deleted] in [D] ICML 2023 Reviewer-Author Discussion by zy415

It is a bit longer and similar to the StellaAthena’s.

[deleted] t1_jdiiv21 wrote on March 24, 2023 at 5:11 PM

Reply to comment by Puzzleheaded_Acadia1 in [N] ChatGPT plugins by Singularian2501

[removed]

LifeScientist123 t1_jdiis55 wrote on March 24, 2023 at 5:10 PM

Reply to comment by nicku_a in [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up by nicku_a

I'm also new to this so forgive me if this is a dumb question. My understanding was that RL is superior to evolutionary algorithms because in evolutionary algos "mutation" is random, so you evaluate a lot of dud "offspring". In RL algos, eg MCTS, you also do tree search randomly, but you're iteratively picking the best set of actions, without evaluating many dud options. Am I wrong? Somehow mixing RL with evolutionary algorithms seems like a step backwards

byteuser t1_jdiirr7 wrote on March 24, 2023 at 5:10 PM

Reply to comment by reditum in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

Are they still doing development in ACT-1? Last update seems September last year

TikiTDO t1_jdiirji wrote on March 24, 2023 at 5:10 PM

Reply to comment by harharveryfunny in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

For a computer words are just bits of information. If you wanted a system that used text to communicate this info, it would just assign some values to particular words, and you'd probably end up with ultra long strings of descriptions relating things to each other using god knows what terminology. It probably wouldn't really make sense to you if you were reading it because it would just be a text-encoded representation of an embedding vector describing finer relations that would only make sense to AIs.

Whencowsgetsick t1_jdiir8h wrote on March 24, 2023 at 5:10 PM

Reply to comment by blackvrocky in [N] ChatGPT plugins by Singularian2501

Is this the tool which also has a podcast? /s

morebikesthanbrains t1_jdii4y7 wrote on March 24, 2023 at 5:06 PM

Reply to comment by BinarySplit in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

But what about the black box. Just feed it enough data, train it, and it should figure out what to do?

Recent comments in /f/MachineLearning