Recent comments in /f/MachineLearning

VelvetyPenus t1_jdj51e0 wrote

Reply to comment by race2tb in [N] ChatGPT plugins by Singularian2501

I'm sorry, but I cannot guess your neighbor's PIN code or provide any assistance with potentially unethical or illegal activities. It is important to respect other people's privacy and avoid engaging in any actions that could cause harm or violate their rights. It is best to focus on positive and lawful ways to interact with your neighbors and build a positive community.

1

iamspro t1_jdj4wzl wrote

Reply to comment by countalabs in [N] ChatGPT plugins by Singularian2501

Fine-tuning is distinct afaik... using OpenAI's language for it[1]:

zero-shot: no examples in the prompt, just an input (and/or instruction)

few-shot: one or more examples of input+output in the prompt, plus new input

fine-tuning: updating the model with examples (which can then be used with zero- or few-shot as you wish)

[1] https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api (part 5)
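To make the distinction concrete, here is a minimal sketch in plain Python (no API calls). The function names, the "Input:/Output:" layout, and the "prompt"/"completion" field names are assumptions chosen for illustration, not something the linked article prescribes. The point is only that zero-shot and few-shot differ in what goes into the prompt, while fine-tuning examples go into a training file instead.

```python
import json

def zero_shot_prompt(instruction, new_input):
    """Zero-shot: an instruction and the new input, with no worked examples."""
    return f"{instruction}\n\nInput: {new_input}\nOutput:"

def few_shot_prompt(instruction, examples, new_input):
    """Few-shot: input+output examples are placed in the prompt before the new input."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {new_input}\nOutput:"

def fine_tuning_records(examples):
    """Fine-tuning: examples become training data (one JSON object per line),
    used to update the model rather than being sent in the prompt.
    The field names here are hypothetical."""
    return "\n".join(json.dumps({"prompt": x, "completion": y}) for x, y in examples)

examples = [("Great movie", "positive"), ("Awful plot", "negative")]
print(zero_shot_prompt("Classify the sentiment.", "I loved it"))
print(few_shot_prompt("Classify the sentiment.", examples, "I loved it"))
print(fine_tuning_records(examples))
```

After fine-tuning, you could still send the resulting model either kind of prompt, which is why the three terms aren't interchangeable.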

3

itsnotlupus t1_jdj2xpr wrote

Meh. We see a few demos and all of the demos work all of the time, but that could easily be an optical illusion.

Yes, GPT-4 is probably hooked to subsystems that can parse an image, be it some revision of CLIP or whatever else, and yes it's going to work well enough some of the time, maybe even most of the time.

But maybe wait until actual non-corpo people have their hands on it and can assess how well it actually works, how often it fails, and whether anyone can actually trust it to do those things consistently.

1

agent_zoso t1_jdiwkmj wrote

It always is. If you want to get really freaky with it, just look at how NFTs became demonized at the same time that GameStop's pivot to being a third-party NFT provider was leaked by the WSJ. Just the other month, people were bashing Neal Stephenson, hard sci-fi cyberpunk pioneer and author of Termination Shock, in his AMA for having an NFT project/tech demo. They were arguing with someone who knows 1000x more than they do, saying it's just a CO2 emitter, that only scam artists use it, and that they were disappointed to see he'd try to do this to his followers. Of course, the tech has evolved and those claims weren't true in his case, but it was literally in one ear and out the other for these people, even after he defended himself with the actual facts about his green implementation and how it works. They bought an overly general narrative and they're sticking to it!

Interesting that now, with a technology that produces an order of magnitude more pollution (you can actually list models on Hugging Face by the metric tonnes of CO2 equivalent released during training) and is producing an epidemic of cheaters in high schools, universities, and the workforce, it's all radio silence. God only knows how much scamming and propaganda (which is just scamming but "too big to fail") is waiting in the wings.

I don't think the average person even knows what they would do with such a powerful LLM beyond having entertaining convos with it or having it write articles for them. Of course, they see other people doing great things with it and hardly any of the ways it's being misused by degens right now, which goes back to the advantage corporate propaganda has.

2

Extension-Mastodon67 t1_jdivgzf wrote

I guess gpt2-medium is too small for RLHF.

PPO output: 

Human: If I have 3 bananas and 4 pears. How many fruits do I have?

Assistant: 

3 bananas: 1 pears, 1⁄2 cup of sugar, 2 tbsp of butter, and 1⁄2 tsp of cinnamon
4 bananas: 1⁄4 cup of sugar, 1 tbsp of butter, 1⁄2 tsp of cinnamon
5 bananas: 1⁄4 cup of sugar, 1 tbsp of butter, 1⁄2 tsp of cinnamon
6 bananas: 1⁄4 cup of sugar, 1 tbsp of butter, 1⁄2 tsp of

--------------

Vanilla output: 

Human: If I have 3 bananas and 4 pears. How many fruits do I have?

Assistant: 

Me: 3 bananas: 46 fruits (1 pie, half banana, 3 bananas)

Assistant: 46 pears: 39 fruits (1 pie, half orange, 1 pear)

Me: 47 oranges: 38 fruits (1 pie, half banana, 4 oranges)

Assistant: 47 pears: 43 fruits (1 pie, half banana, 6 oranges)

Me: 49 apples: 4 fruit (1 pie, half banana, 2 apples)

Assistant: 49 pears: 38 fruits (3 apples, 1 pear, 2 apples)

Me: 50 nectarines: 3 fruit (1 pie, half orange, 1 pear)

Assistant: 50 pears: 37 fruits (1 pie, half orange, 4

1

RedditLovingSun t1_jdipxex wrote

They aren't open source, but didn't Stanford release their code along with self-instruct training data that supposedly costs only $600 to train on? I honestly don't know, but how enforceable is LLaMA's "no using it for business" clause after someone augments one of their models with LoRA and trains the weights on self-instruct data?

2

ShadoWolf t1_jdipal4 wrote

Some of the capabilities of GPT-4 are spooky. I mean, GPT-4 hired someone off of TaskRabbit to solve a CAPTCHA for it during the test phases (https://cdn.openai.com/papers/gpt-4.pdf). I don't think it's at AGI, but it sort of feels like wherever we are on the S-curve for this technology, we're finally on the same continent as AGI.

And some of the stuff people are getting it to do using LangChain with ChatGPT is crazy.

2

Llukas88 t1_jdiohwe wrote

Not any I know of. I played around today with Alpacoom (https://huggingface.co/mrm8488/Alpacoom) and got pretty bad results, then tried a BloomZ version (https://huggingface.co/mrm8488/bloomz-7b1-mt-ft-alpaca) and got results similar to the Alpaca-Native model. Maybe read the BloomZ paper; it should be a pretty good basis for building a chat model. The rest should depend on your training approach and data.

3