Recent comments in /f/MachineLearning
thomasahle t1_je14a0c wrote
Reply to [D] Simple Questions Thread by AutoModerator
Are there any "small" LLMs, like 1MB, that I can include, say, on a website using ONNX to provide a minimal AI chat experience?
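Roughly the kind of loop this would mean, as a minimal sketch assuming a hypothetical tiny_lm.onnx model and input/output names (shown with the onnxruntime Python API; a website would use onnxruntime-web in JavaScript, but the flow is the same):

```python
# Minimal sketch of running a tiny ONNX language model with onnxruntime.
# "tiny_lm.onnx" and the input/output names are assumptions for illustration;
# in the browser the equivalent would be onnxruntime-web.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("tiny_lm.onnx")

def generate(token_ids, max_new_tokens=20):
    ids = list(token_ids)
    for _ in range(max_new_tokens):
        logits = sess.run(
            None, {"input_ids": np.array([ids], dtype=np.int64)}
        )[0]                                   # shape: (1, seq_len, vocab_size)
        next_id = int(logits[0, -1].argmax())  # greedy decoding
        ids.append(next_id)
    return ids
```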
TheEdes t1_je149kf wrote
Reply to comment by MrFlamingQueen in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Yeah, but if you come up with a problem in your head that doesn't exist word for word, then GPT-4 would be doing what they're advertising. However, if the problem appears word for word anywhere in the training data, then the testing data is contaminated. If the model can learn the design patterns for leetcode-style questions by looking at examples of them, then it's doing something really good; if it can only solve problems it has seen before, then it's nothing special, they just overfit a trillion parameters on a comparatively very small dataset.
currentscurrents t1_je13kdr wrote
Reply to comment by londons_explorer in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
True! But also, problems in general are never 100% novel. That's why metalearning works.
You can make up for poor reasoning abilities with lots of experience. This isn't bad exactly, but it makes testing their reasoning abilities tricky.
MohamedRashad t1_je12hzd wrote
Where does the model get saved after fine-tuning in the example in the README?
vladrik t1_je12fzk wrote
Reply to [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun
Brainstorming and fast drafting mostly
currentscurrents t1_je12d3k wrote
Reply to comment by ianitic in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Nobody knows exactly what it was trained on, but there exist several datasets of published books.
>I'm highly surprised I haven't seen any news of authors/publishers suing yet tbh.
They still might. But they don't have a strong motivation; it doesn't really directly impact their revenue because nobody's going to sit in the chatgpt window and read a 300-page book one prompt at a time.
DaBobcat t1_je12b4q wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Here OpenAI and Microsoft were evaluating GPT-4 on medical problems. In section 6.2 they specifically said that they found strong evidence that it was trained on "popular datasets like SQuAD 2.0 and the Newsgroup Sentiment Analysis datasets". In appendix section B they explain how they measured whether it saw something in the training data. Point is, I think benchmarks are quite pointless if the training dataset is private and no one can verify that they did not train on the test set, which they specifically said happened in many cases.
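For anyone curious what such a check can look like in practice, here is a rough sketch of one common memorization probe (not necessarily the exact procedure from their appendix): feed the model the first part of a benchmark example and measure how much of the remainder it reproduces. query_model is a hypothetical stand-in for an API call.

```python
# Hedged sketch of a memorization probe: give the model the start of a
# benchmark example and measure overlap of its continuation with the rest.
# query_model() is a hypothetical stand-in for whatever API call is used.
from difflib import SequenceMatcher

def contamination_score(example: str, query_model, prefix_frac: float = 0.5) -> float:
    cut = int(len(example) * prefix_frac)
    prefix, rest = example[:cut], example[cut:]
    completion = query_model(prefix)          # model's continuation of the prefix
    # A ratio near 1.0 suggests the example was likely seen during training.
    return SequenceMatcher(None, completion[:len(rest)], rest).ratio()
```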
ntaylor- t1_je11vt1 wrote
Reply to comment by was_der_Fall_ist in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
Fairly sure the "final" GPT-4 model is still using a generate function that predicts one token at a time. It's just that the training was good, and made more involved via RLHF. After training it's not doing any "complicated operations".
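For illustration, a toy sketch of that generate loop, with model as a placeholder that maps a token sequence to next-token logits:

```python
# Toy sketch of autoregressive generation: whatever the training looked like,
# sampling is still one token at a time. "model" is a placeholder that maps a
# token sequence to next-token logits.
import torch

def generate(model, input_ids: torch.Tensor, max_new_tokens: int = 50) -> torch.Tensor:
    for _ in range(max_new_tokens):
        logits = model(input_ids)             # (1, seq_len, vocab_size)
        next_token = logits[0, -1].argmax()   # greedy pick of the next token
        input_ids = torch.cat([input_ids, next_token.view(1, 1)], dim=1)
    return input_ids
```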
thecodethinker t1_je11t4o wrote
Reply to comment by bjj_starter in [D] GPT4 and coding problems by enryu42
Spoken like someone trying to be pedantic
Puzzleheaded_Acadia1 t1_je11l0o wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
So does that mean that GPT-4 can't think critically? And if not, can we make a new kind of ML model, like LLMs and LLaMA, that can think critically and integrate it into GPT-4 so it becomes a multimodal model that can "see" and think critically?
ntaylor- t1_je11iqf wrote
Reply to comment by astrange in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
But eventually, after RLHF, the GPT-4 model is one final fixed model, and presumably it still uses a generate function that predicts the next token based on the previous ones, as base GPT models/any autoregressive model does. At least that's what it seems to be doing.
zoontechnicon t1_je10zfj wrote
Reply to comment by Flag_Red in [P] 🎉 Announcing Auto-Analyst: An open-source AI tool for data analytics! 🎉 by aadityaubhat
> Auto-Analyst leverages the OpenAI API
I feel like frontends for OpenAI/ChatGPT do not belong here
sebzim4500 t1_je10iu2 wrote
>Lower-precision fine-tuning (like INT8, INT4)
How would this work? Are the weights internally represented as f16 and then rounded stochastically whenever they are used?
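For what it's worth, here is a rough sketch of what stochastic rounding to INT8 with a per-tensor scale could look like; this is just one way to read the question, not a claim about how any specific library implements it:

```python
# Sketch of stochastic rounding to INT8 with a per-tensor scale: round up or
# down with probability given by the fractional part, so the rounding error is
# zero in expectation. An assumption about one possible scheme, not a claim
# about any specific library.
import torch

def stochastic_round_int8(w: torch.Tensor) -> tuple[torch.Tensor, float]:
    scale = max(w.abs().max().item() / 127.0, 1e-12)
    scaled = w / scale
    low = scaled.floor()
    frac = scaled - low
    q = low + torch.bernoulli(frac)           # round up with probability = frac
    return q.clamp(-128, 127).to(torch.int8), scale

# Dequantize for use in a forward pass: w_approx = q.float() * scale
```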
gigglegenius t1_je10itr wrote
Reply to [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun
With chatgpt:
- Generate variations of ideas. I type in an idea I have and prompt it to generate variations
- Creative brainstorming for professional work
- Make me laugh by prompting it to get some really abstract and surreal storylines
With Bing:
- Search for information quickly and have it summarized; it can be hit or miss, but it's definitely getting better at it
thorax t1_je107vs wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I'm working on an extreme usage model for leveraging GPT4 to generate code, and it's rather good. Not perfect, but impressive is an understatement.
jabowery t1_je107nj wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
noraizon t1_je10328 wrote
x0-parametrization has been used for some time now. imo, nothing new under the sun. maybe it's something else I don't see
visarga t1_je0zqxm wrote
Reply to comment by truchisoft in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
ML people spend all day thinking about model limitations and errors, so it's only normal that we are not so easily swayed by a non-peer-reviewed paper declaring first contact with AGI. Especially from MS, who owns 50% of OpenAI.
utopiah t1_je0zqae wrote
Reply to comment by fiftyfourseventeen in [D] FOMO on the rapid pace of LLMs by 00001746
Well I just did so please explain why not, genuinely trying to learn. I'd also be curious if you have a list of trained models compared by cost. I only saw some CO2eq order of magnitude equivalent but not rough price estimations so that would help me to get a better intuition as you seem to know more about this.
That being said, the point was that you don't necessarily need to train anything from scratch or buy anything to have useful results; you can rent per hour in the cloud and refine existing work, no?
fiftyfourseventeen t1_je0z514 wrote
Reply to comment by nomadiclizard in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
That's exactly what happened here lol. They only deduplicated exact duplicate text, so there was lots of similar data in both sets.
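To illustrate why exact-match dedup is weak, even a quick n-gram Jaccard check like the sketch below would flag near-duplicates that exact matching misses (the 0.8 threshold is an arbitrary choice for illustration):

```python
# Sketch of near-duplicate detection with n-gram Jaccard similarity; exact-match
# dedup would miss pairs like these. The 0.8 threshold is an arbitrary choice.
def ngrams(text: str, n: int = 5) -> set:
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def near_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    ga, gb = ngrams(a), ngrams(b)
    if not ga or not gb:
        return False
    jaccard = len(ga & gb) / len(ga | gb)
    return jaccard >= threshold
```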
gunbladezero t1_je0yqmy wrote
Reply to comment by sebzim4500 in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy
Interesting. It also seems to have spelled her name wrong in BASE64 so that might be a problem. What do you mean by ‘waste the lower layers’?
SFDeltas t1_je0yp5q wrote
u/CatalyzeX_code_bot What are you doing
corkbar t1_je0xct0 wrote
Reply to [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun
draft angry letters to people who send me advertisements in the mail, telling them in 5,000 words to never contact me again and to evaluate their life choices
RecoilS14 t1_je0x3ud wrote
Reply to [D] Simple Questions Thread by AutoModerator
I’m a new hobbyist programmer and have spent the last month or so learning Python (CS50, Mosh, random Indian guys, etc.), and I'm currently also watching the Stanford ML/DL lectures on YouTube.
I have started to learn ML, PyTorch, and some TensorFlow, along with how tensors and vectors work with ML.
I am wondering if anyone can point me in the direction of other aspects of ML/DL/neural networks that I may be missing out on. Perhaps a good series that goes into depth on these subjects via lectures, and not just the programming side of it, so I can further understand the concepts.
I’m sure there are lots of things I’m missing out on in my journey, and some perspective would be nice.
currentscurrents t1_je14pi5 wrote
Reply to comment by cegras in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Clearly, the accuracy is going to have to get better before it can replace Google. It's pretty accurate when it knows what it's talking about, but if you go "out of bounds" the accuracy drops off a cliff without warning.
But the upside is that it can integrate information from multiple sources and you can interactively ask it questions. Google can't do that.