Recent comments in /f/MachineLearning
nightofgrim t1_jdehy1h wrote
Reply to comment by RedditLovingSun in [N] ChatGPT plugins by Singularian2501
I crafted a prompt to get ChatGPT to act as a home automation assistant. I told it what devices we have in the house and their states. I told it how to end any statement with one or more specially formatted commands to manipulate the accessories in the house.
It was just a fun POC, but it immediately became clear how much better this could be than Alexa or Siri.
I was able to ask it to do several things at once. Or be vague about what I wanted. It got it.
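The pattern is roughly this (a minimal sketch with hypothetical device names and command syntax, not my exact prompt):

```python
import re

# Hypothetical system prompt: a device inventory plus a machine-readable command format.
SYSTEM_PROMPT = """You are a home automation assistant.
Devices and current states:
- living_room_light: off
- thermostat: 68F
- front_door_lock: locked
End every reply with zero or more commands, one per line, formatted as:
<<set DEVICE STATE>>"""

# The app's only job is to parse those specially formatted commands back out.
COMMAND_RE = re.compile(r"<<set (\w+) (\w+)>>")

def extract_commands(reply: str):
    return COMMAND_RE.findall(reply)

reply = "Cozying things up.\n<<set living_room_light on>>\n<<set thermostat 72F>>"
print(extract_commands(reply))  # [('living_room_light', 'on'), ('thermostat', '72F')]
```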
Jean-Porte t1_jdegeeq wrote
Reply to [N] ChatGPT plugins by Singularian2501
Barely a week after the GPT-4 release. The AI timeline is getting wild.
big_ol_tender t1_jdefnc7 wrote
Reply to comment by Difficult_Bid_9828 in [P] ChatLLaMA - A ChatGPT style chatbot for Facebook's LLaMA by imgonnarelph
Thanks for this - very cool project. It indeed solves the issue with the LLaMA weights, but unfortunately the issue remains with the Alpaca dataset license itself being non-commercial:
https://github.com/tatsu-lab/stanford_alpaca/blob/main/DATA_LICENSE
sam__izdat t1_jdef39d wrote
Reply to comment by jcansdale2 in Modern language models refute Chomsky’s approach to language [R] by No_Draft4778
> What do you think of this exchange?
It's somewhat closer to a reasonable response than anything I could get out of it.
light24bulbs t1_jdecutq wrote
Reply to [N] ChatGPT plugins by Singularian2501
I've been using LangChain, but it screws up a lot no matter how good a prompt you write. For those familiar, it's the same concept as this, but in a loop, so it's more expensive. You can run multiple tools, though (or rather, let the model run multiple tools).
Having all that pretraining on how to use "tools" baked into the model (I'm 99% sure that's what they've done) will fix that problem really nicely.
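The loop looks roughly like this (a generic ReAct-style sketch, not LangChain's actual API; `call_llm` and the tool registry are stand-ins):

```python
# Generic tool loop: each turn, the model either names a tool or gives a final answer.
# call_llm() is a placeholder for whatever chat-completion client you use.
TOOLS = {
    "search": lambda q: f"(search results for {q!r})",
    "calculator": lambda expr: str(eval(expr)),  # toy only; don't eval untrusted input
}

def run(question: str, call_llm, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = call_llm(transcript)              # e.g. "calculator: 17 * 23" or "FINAL: 391"
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        tool, _, arg = reply.partition(":")
        result = TOOLS.get(tool.strip(), lambda _: "unknown tool")(arg.strip())
        transcript += f"{reply}\nObservation: {result}\n"  # feed the result back in
    return "no answer within budget"
```

Every iteration re-sends the whole growing transcript, which is where the extra cost comes from.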
Icko_ t1_jdecnjx wrote
Reply to comment by edthewellendowed in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho
Sure:
- Suppose you have 1 million sentence embeddings and one query vector for which you want the closest sentence. If the vectors were single numbers, you could just do a binary search and be done. At higher dimensionality it's a lot more involved (see the sketch after this list). Pinecone is a paid product doing this. Faiss is a library by Facebook, which is also very good, and free.
- Recently, Facebook released the LLaMA models. They are large language models. ChatGPT is also an LLM, but after pretraining on a text corpus, you train it on human instructions, which is costly and time-consuming. Stanford took the LLaMA models and fine-tuned them on ChatGPT outputs. The result is not AS good, but pretty good. They called it "Alpaca".
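Here's roughly what the Faiss version of that search looks like (a minimal sketch using the exact `IndexFlatL2` index; the approximate indexes are where it gets more involved):

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 384                                                  # embedding dimensionality
corpus = np.random.rand(1_000_000, d).astype("float32")  # 1M sentence embeddings
query = np.random.rand(1, d).astype("float32")           # the vector to match

index = faiss.IndexFlatL2(d)              # exact (brute-force) L2 nearest-neighbor search
index.add(corpus)
distances, ids = index.search(query, 5)   # the 5 closest sentences
print(ids[0])                             # row indices into the corpus
```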
Dependent_Ad5120 t1_jdec7kx wrote
Reply to comment by oathbreakerkeeper in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
I don't know. I was using pure fp16, no autocast, and it works.
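For reference, "pure fp16, no autocast" means something like this sketch against the PyTorch 2.0 API (shapes are arbitrary):

```python
import torch
import torch.nn.functional as F

# Cast the tensors to fp16 directly instead of wrapping the call in torch.autocast.
q = torch.randn(8, 16, 1024, 64, device="cuda", dtype=torch.float16)  # (batch, heads, seq, head_dim)
k = torch.randn(8, 16, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(8, 16, 1024, 64, device="cuda", dtype=torch.float16)

# Restrict SDPA to the flash kernel so it errors out instead of silently falling back.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.dtype)  # torch.float16
```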
Nameless1995 t1_jdebfnu wrote
Reply to comment by passerby251 in [D] ICML 2023 Reviewer-Author Discussion by zy415
I received an email notification.
lightyagami03 t1_jde9kv5 wrote
Reply to [D] Simple Questions Thread by AutoModerator
Is it even worth trying to break into AI/ML now as a CS student, or has everything already been (or will soon be) solved? The jump from GPT-3.5 to 4 was insane; soon GPT-5 will roll out and it'll be even better, and GPT-6 might as well be AGI, at which point there wouldn't be anything left to work toward.
endless_sea_of_stars t1_jde88qi wrote
Reply to [N] ChatGPT plugins by Singularian2501
Wonder how this compares to the Toolformer implementation.
https://arxiv.org/abs/2302.04761
Their technique was to use few-shot (in-context) learning to annotate a dataset with API calls. They took the annotated dataset and used it to fine-tune the model. During inference, the code would detect the API call, make the call, then append the results to the text and keep going.
The limitation of that methodology is that you have to fine-tune the model for each new API. Wonder what OpenAI's approach is?
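That inference-time loop is roughly this (a sketch; `step` and the `apis` registry are stand-ins, and the bracket syntax follows the paper's `[API(args) -> result]` format):

```python
import re

# Toolformer-style decoding: the fine-tuned model emits "[Calculator(400/1400) ->",
# the wrapper executes the call, appends the result and "]", and decoding resumes.
CALL_RE = re.compile(r"\[(\w+)\((.*?)\) ->$")

def generate_with_tools(prompt: str, step, apis: dict, max_tokens: int = 256) -> str:
    text = prompt
    for _ in range(max_tokens):
        text += step(text)                   # step() stands in for one decode step
        m = CALL_RE.search(text)
        if m:                                # an API call just completed: run it
            name, args = m.group(1), m.group(2)
            text += f" {apis[name](args)}]"  # splice the result back into the stream
    return text
```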
Edit:
I read through the documentation. It looks like it's done through in-context learning: they just prepend the API's description to your call and let the model figure it out. That also means you get charged for the tokens used in the API description, and those tokens count against the context window. It's unclear whether the model got any fine-tuning to better support APIs or whether they're just using the base model's capabilities.
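Conceptually something like this (hypothetical plugin description; OpenAI hasn't published the exact wire format):

```python
# Sketch: the plugin's description is prepended to the conversation, so its
# tokens are both billed and counted against the context window.
PLUGIN_SPEC = (
    "Tool: weather\n"
    "Description: Get current weather. Call as weather(city: str) -> JSON.\n"
)

def build_prompt(user_message: str) -> str:
    return PLUGIN_SPEC + "\nUser: " + user_message

prompt = build_prompt("Do I need an umbrella in Seattle today?")
# The prompt is now longer than the user message alone: the spec costs tokens too.
```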
psdwizzard t1_jde850w wrote
Reply to [N] ChatGPT plugins by Singularian2501
A memory plug-in would be amazing. It would allow it to learn.
radi-cho t1_jde80wh wrote
Reply to [N] ChatGPT plugins by Singularian2501
For people looking for open-source tools around the GPT-4 API, we're currently actively updating the list at https://github.com/radi-cho/awesome-gpt4. Feel free to check it out or contribute if you're a tool developer. I guess some of the ChatGPT plugins will be open-source as well.
wywywywy t1_jde6ltj wrote
Reply to comment by drunk-en-monk-ey in [N] ChatGPT plugins by Singularian2501
Yes, but a lot of not-so-straightforward things have happened in the last few weeks already!
mcAlt009 t1_jde6kby wrote
Reply to [D] Simple Questions Thread by AutoModerator
What VM can I rent with a GPU? Ideally I want a VM where I can train models, host websites, etc. Location isn't too important.
Under_Over_Thinker t1_jde5e2f wrote
Reply to [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Perplexity going from 20.8 to 20.4. Is that a significant improvement? Also, I am not sure if perplexity is representative enough to evaluate LLMs.
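For scale: perplexity is the exponential of the per-token cross-entropy, so that drop corresponds to a small change in the underlying loss:

```python
import math

# Perplexity = exp(cross-entropy), so compare the losses behind the two numbers.
delta_nats = math.log(20.8) - math.log(20.4)
print(f"{delta_nats:.4f} nats/token")                # ~0.0194
print(f"{delta_nats / math.log(2):.4f} bits/token")  # ~0.0280
```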
ZenDragon t1_jde4uj8 wrote
Reply to comment by drunk-en-monk-ey in [N] ChatGPT plugins by Singularian2501
Agreed, but it's not like they have to implement everything all at once. Such integration would already be useful as soon as a small selection of the most basic features are working.
andrew21w t1_jde4ayx wrote
Reply to comment by underPanther in [D] Simple Questions Thread by AutoModerator
The thread you sent me says that polynomials are non-discriminatory.
Are there other kinds of functions that are non-discriminatory?
RedditLovingSun t1_jde2yvh wrote
Reply to comment by drunk-en-monk-ey in [N] ChatGPT plugins by Singularian2501
I'm not disagreeing with you, but out of curiosity, can you elaborate on any factors I may have overlooked?
drunk-en-monk-ey t1_jde2an2 wrote
Reply to comment by RedditLovingSun in [N] ChatGPT plugins by Singularian2501
It’s not so straightforward.
brownmamba94 t1_jddzjgc wrote
Reply to comment by __Maximum__ in [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Thanks for your inquiry. We are working with our legal team to figure out the best path forward, but most likely, we'll be releasing under some permissive license that allows you to use the code for your applications.
Nameless1995 t1_jddyw6i wrote
Reply to comment by elegantrium in [D] ICML 2023 Reviewer-Author Discussion by zy415
Sounds like a good chance.
Difficult_Bid_9828 t1_jddys16 wrote
Reply to comment by big_ol_tender in [P] ChatLLaMA - A ChatGPT style chatbot for Facebook's LLaMA by imgonnarelph
I think this might be exactly what you're looking for: https://github.com/declare-lab/flan-alpaca
race2tb t1_jdeilah wrote
Reply to [N] ChatGPT plugins by Singularian2501
Just like google search every other way we do things is going to change. Why do I need a website if I can just feed model my info have it generate everything when people want my content. Things are going to be completely rethought because of natural language to generative ai. We used to be the ones that had to maintain these things and build the content, now we do not really have to. All we need to do is make sure the AI stays well fed and have the links to any data it has to present which it cannot store.