Recent comments in /f/MachineLearning
evanthebouncy OP t1_jdxx14h wrote
Reply to comment by [deleted] in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy
try it and let me know
ypxkap t1_jdxwirl wrote
Reply to comment by yaosio in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
the bing chat thing is interesting because it can't seem to tell when it can't see the whole page. e.g. if you ask it "what's the last line of this webpage", you'll get a line that's some number of words in (usually ~1100 words for me, but it's been a while since i checked). if you then send text from after that "last sentence", it will act like it's been looking at it the whole time, but as far as i can tell it has no capacity to notice the text otherwise. i asked it to summarize a chat log txt file i had loaded into edge, and the summary claimed there was an advertisement for an iphone 14 and also that "user threatened to harm the AI", neither of which were present in the text file. that gives me the impression it's seeing something completely different from what edge is displaying, something that also includes instructions on how to respond in certain scenarios, including being threatened.
CacheMeUp t1_jdxvq8t wrote
Reply to comment by hadaev in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Perhaps the challenge is not the size of the internet (it is indeed big, and new content is easy to generate), but rather the uniqueness and novelty of the information. Anecdotally, the first page of Google results often shows various low-informativeness webpages, where only a few sentences provide information and the rest is boilerplate, disclaimers, generic advice or plain spam.
benevolentpenguin t1_jdxuphs wrote
Reply to comment by mil24havoc in [D] Can we train a decompiler? by vintergroena
Also: "Augmenting Decompiler Output with Learned Variable Names and Types" by Chen and Lacomis et al. https://jeremylacomis.com/materials/ChenDIRTY2022.pdf
Rioghasarig t1_jdxs956 wrote
Reply to comment by Cool_Abbreviations_9 in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
I really don't think your experiment makes much sense. Even if GPT has some internal confidence level, there's no reason to believe that asking it to report that confidence is an effective way of measuring the actual value. As other people have asked, the obvious question is: "what's your confidence in these confidence reports?" The logic is baseless.
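(For contrast, a model's actual token-level confidence can be read out directly rather than asked for. A minimal sketch using the legacy OpenAI completions endpoint, which exposes logprobs; the model name and prompt here are just placeholders:)

    import math
    import openai

    # Ask for the top-5 alternatives at each position alongside the
    # chosen token's log-probability.
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt="The capital of Australia is",
        max_tokens=5,
        logprobs=5,
    )

    logprobs = resp["choices"][0]["logprobs"]
    for token, lp in zip(logprobs["tokens"], logprobs["token_logprobs"]):
        # These probabilities come from the model's output distribution,
        # not from the model talking about itself.
        print(f"{token!r}: p = {math.exp(lp):.3f}")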
ninjasaid13 t1_jdxrw65 wrote
Reply to comment by wind_dude in [D] Instruct Datasets for Commercial Use by JohnyWalkerRed
They're going to open-source it on April 15, last I heard. They're still gathering data, with the cutoff date set at April 12.
Rioghasarig t1_jdxrp3y wrote
Reply to comment by astrange in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
No, they were right about the base model of GPT, since the base model was trained simply to predict the next word. ChatGPT and GPT-4 have evolved beyond that (with things like RLHF).
wind_dude t1_jdxrcpp wrote
>depend on the Alpaca dataset, which was generated from a GPT3 davinci model, and is subject to non-commercial use
Where do you get that? tatsu-lab/stanford_alpaca is Apache 2.0, so you can use it for whatever.

As for OpenAI:
"""
(c) Restrictions. You may not (i) use the Services in a way that infringes, misappropriates or violates any person’s rights; (ii) reverse assemble, reverse compile, decompile, translate or otherwise attempt to discover the source code or underlying components of models, algorithms, and systems of the Services (except to the extent such restrictions are contrary to applicable law); (iii) use output from the Services to develop models that compete with OpenAI; (iv) except as permitted through the API...
"""
So as far as I'm concerned, you are allowed to use the generated dataset for commercial purposes...

The only catch might be the licensing on the LLaMA models... but you can train another LLM.
wind_dude t1_jdxqp0v wrote
Reply to comment by Taenk in [D] Instruct Datasets for Commercial Use by JohnyWalkerRed
Last I checked, they still hadn't open-sourced the training data... which is bizarre, given all the talk of it being open source, since they used humans to train it.
CatalyzeX_code_bot t1_jdxqbho wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
Found relevant code at https://github.com/microsoft/LoRA + all code implementations here
--
To opt out from receiving code links, DM me
lhenault OP t1_jdxpuso wrote
Reply to comment by onequark in [P] SimpleAI : A self-hosted alternative to OpenAI API by lhenault
Not for now, but that's indeed a cool feature, and something available in the OpenAI API. It shouldn't be too hard to implement, as I've already started something for that on the gRPC backend, and FastAPI has a StreamingResponse. Thanks for suggesting it, will try to prioritise this!
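(For the curious, a minimal sketch of what that could look like with FastAPI's StreamingResponse, assuming an OpenAI-style server-sent-events format; the token generator here is a stand-in, not SimpleAI's actual backend:)

    from fastapi import FastAPI
    from fastapi.responses import StreamingResponse

    app = FastAPI()

    async def token_stream():
        # Stand-in for tokens coming back from the gRPC model backend.
        for token in ["Hello", ",", " world", "!"]:
            # One "data:" line per chunk, as in the OpenAI streaming API.
            yield f"data: {token}\n\n"
        yield "data: [DONE]\n\n"

    @app.get("/completions/stream")
    async def stream_completion():
        return StreamingResponse(token_stream(), media_type="text/event-stream")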
RageOnGoneDo t1_jdxoqxf wrote
Reply to comment by Peleton011 in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
How, though? How can an LLM do that kind of statistical analysis?
Peleton011 t1_jdxolt1 wrote
Reply to comment by RageOnGoneDo in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
I mean, I said LLMs definitely could do that; I never intended to convey that that's what's going on in OP's case, or that ChatGPT specifically is able to do so.
GullibleEngineer4 t1_jdxoipq wrote
Reply to [P] 🎉 Announcing Auto-Analyst: An open-source AI tool for data analytics! 🎉 by aadityaubhat
Can it be a ChatGPT plugin?
slipknot25 t1_jdxn5ed wrote
Reply to [D] 3d model generation by konstantin_lozev
Something's already out there: https://instruct-nerf2nerf.github.io/
RageOnGoneDo t1_jdxm91o wrote
Reply to comment by Peleton011 in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
Why are you assuming it's actually doing that calculation, though?
OnlyAnalyst9642 t1_jdxlki8 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I have a very specific problem: I'm trying to forecast tomorrow's electricity prices at an hourly resolution (from tomorrow at midnight to tomorrow at 11pm), and I need to produce the forecast before 10am today. Electricity prices have very strong (24-hour) seasonality, and I'm using the whole of yesterday plus today up to 10am as the model input (34 hours). In TensorFlow terms (https://www.tensorflow.org/tutorials/structured_data/time_series), my input width is 34, the offset is 14, and the label width is 24.
Since I only care about the predictions I get at 10AM for the following day, should I only train my model with the observations available at 10am?
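(To make that concrete, here's a sketch of the "one window per day, anchored at 10am" setup in plain TensorFlow; the price array is a placeholder, and the 10am alignment is assumed to come from where the series starts:)

    import numpy as np
    import tensorflow as tf

    # Placeholder hourly price series; assume index 0 falls at a known hour
    # so that every 72-hour window ends at the right time of day.
    prices = np.random.rand(24 * 365).astype("float32")

    input_width, offset, label_width = 34, 14, 24
    total = input_width + offset + label_width  # 72 hours per sample

    # A stride of 24 yields exactly one window per day, i.e. only the
    # samples whose inputs end at 10am -- the deployment condition.
    windows = tf.keras.utils.timeseries_dataset_from_array(
        data=prices, targets=None,
        sequence_length=total, sequence_stride=24)

    def split_window(batch):
        # First 34 hours are the input, last 24 hours are the labels;
        # the 14-hour gap in between is simply dropped.
        return batch[:, :input_width], batch[:, -label_width:]

    windows = windows.map(split_window)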
I am pretty sure this has been addressed before. Any documentation/resources that consider similar problems would help.
Thanks in advance!
Unlikely_Hamster_935 t1_jdxk1mc wrote
Reply to [D] ICML 2023 Reviewer-Author Discussion by zy415
How can we contact the ACs now? There is no official review button anymore...
__scan__ t1_jdxj30g wrote
Reply to comment by robobub in [D] GPT4 and coding problems by enryu42
This is what will happen if we’ve either a) exhausted demand, or b) made software development much easier such that people who previously couldn’t do it now can.
The first was likely true for accountants, but is less obviously so for software — there’s still vastly more useful software to build than actually gets built, and each piece of new software that gets built generally increases that demand.
Perhaps the second is true though — do you foresee enough non-developers being able to write, deploy, maintain, and operate production systems as a result of LLMs (in a way that high-level languages and previous tooling didn't allow)? If not, or not in sufficient numbers, maybe what happens is that software developers become more in demand than ever, because their productivity increases result in even more demand for software (since they can write it quicker).
rshah4 t1_jdxhz3d wrote
Reply to comment by big_ol_tender in [D] Instruct Datasets for Commercial Use by JohnyWalkerRed
I agree with the sentiments here and don't think it's ok to use some of these datasets that appear to violate OpenAI's terms. I dealt with it by making a funny video: https://youtu.be/31u88EDmIwc
nxqv t1_jdxx53i wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
I don't know a whole lot about LLMs because I'm new to the field but I sure do know about FOMO. I recently felt a lot of FOMO about having missed opportunities to path towards graduate school and AI research years ago.
What you need to do is put a name to the face. Dig deep and understand your feelings better.
What is it you're afraid of missing out on exactly?
Untold riches? Researchers don't really make any more or less money than people in other computer science jobs. And most billionaires aren't following some predetermined path.
Fame? Clout? We can't all be Sam Altman or Yann LeCun or Eliezer Yudkowsky or whoever. Besides, most of the things you see these types of guys say or do in public is only tangentially related to the day to day experience of actually being them.
Impact? I've recently come to realize that a craving for "impact" is often rooted in a desire for one of these other things, or rooted in some sort of egotistical beliefs or other deep seated psychological matter like seeking someone's approval. In reality, you could be the guy who cures cancer and most regular people would only think about you for half a second, your peers could be jealous freaks, and people could still find some tiny little reason to turn on you if they really wanted to. You could easily die knowing you did something amazing for the world and nobody cared but you. Are you the type of person who would be okay with that?
Edit: the "Impact" part was controversial so I'd like to add:
> don't lose sight of the forest because of a tree. We're talking about impact in the context of FOMO - if you feel that level of anxiety and rush about potentially missing out on the ability to make an impact because others are already making the impact you want to make, it's more likely to be ego-driven than genuine altruism
The ability to work on something cool or trendy? There's SO MANY new technologies out there you can path towards a career in. And there will continue to be something cool to do for as long as humanity exists.
Something else?
For each one of these, you can come up with convincing counterarguments for either why it's not real or why you can just find a similar opportunity doing many other things.
And let's be real for a second, if this technology really is going to take knowledge workers' jobs, researchers are probably on the chopping block too.