Recent comments in /f/MachineLearning
WarAndGeese t1_jdyi94w wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
You are thinking about it backwards. This stuff is happening now and you are a part of it. You are among the least of people who is "missing out", you are in the centre of it as it is happening.
aeternus-eternis t1_jdyhle0 wrote
Reply to comment by tvetus in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
Other than math, isn't everything just mostly true?
ObiWanCanShowMe t1_jdyh4wm wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
Utilizing the models and all the upcoming amazing things is going to be 10x more valuable than getting your hands dirty trying to make one on your own.
You won't get replaced by AI, you will get replaced by someone who knows how to use the AI.
Borrowedshorts t1_jdygg1l wrote
Reply to comment by gunbladezero in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy
I would have never guessed Sofia Coppola no matter how many questions you gave me, so I don't know if it performed that poorly.
ZestyData t1_jdyfg0y wrote
Reply to comment by Hnriek in [D] Is French the most widely used language in ML circles after English? If not, what are some useful (natural) languages in the field of machine learning? by Subject_Ad_9680
I wasn't aware of Japan having a particularly disconnected tech sphere from the West the way China does. China has its own independent platforms, technologies, separate SOTAs and completely disjoint research (until the last 5 years or so, when the two spheres have really started converging and borrowing from each other).
Japan has tech companies, but most of their research comes out of their global offices, and those offices really are global even when based in Japan. Sony isn't publishing papers in Japanese; they're submitting to Western conferences in English.
Whereas China had its own parallel FAANG equivalent tech giants developing their own versions of Amazon, Google, and Facebook's tech supremacy & its constituent ML advances.
All this to say that Japan engaged in the Western economy a lot more, and subsequently its tech companies engaged in the Western pool of talent, science, and communication a lot more. Meanwhile China had its own bubble until very very recently, and thus a lot of the world's unique & innovative ML has been conducted in Mandarin.
[deleted] t1_jdyf5pe wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
[removed]
nmfisher t1_jdyeyit wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
IMO the area most ripe for the picking is distilling larger pretrained models into smaller, task-specific ones. Think extracting a 30MB LM from LLaMA that is limited to financial terminology.
There's still a huge amount of untapped potential.
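The distillation idea above can be sketched with the classic soft-target loss: train the small model to match the large model's temperature-softened output distribution. This is a minimal, dependency-free sketch assuming Hinton-style distillation; the function names and the temperature value are illustrative, not anything from the comment.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about near-miss classes.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients stay comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In practice this term is mixed with the ordinary hard-label cross-entropy, and for a domain-limited model (e.g. financial text) you would distill only on in-domain data.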
ghostfaceschiller t1_jdyerkp wrote
Reply to comment by nxqv in [D] FOMO on the rapid pace of LLMs by 00001746
> Yann LeCun
That dude is becoming straight-up unhinged on Twitter
StellaAthena t1_jdydjg4 wrote
Reply to comment by OkWrongdoer4091 in [D] ICML 2023 Reviewer-Author Discussion by zy415
I have four papers. Two have no comments, and one has all three reviewers saying “thanks, but I'll keep my score” with no further elaboration. On the 7/7/2 paper, the 2 and one of the 7s argued while the third reviewer stayed silent. All told, 5/12 responded.
bartvanh t1_jdyd6om wrote
Reply to comment by was_der_Fall_ist in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
Ugh, yes it's so frustrating to see people not realizing this bit all the time. And also kind of painful to imagine that (presumably - correct me if I'm wrong) all those internal "thoughts" are probably discarded after each word, only to be painstakingly reconstructed almost identically for predicting the next word.
ThePseudoMcCoy t1_jdyd4xm wrote
Reply to comment by Disastrous_Elk_6375 in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
When I do long prompts (more than a sentence or 2) in Alpaca on Windows, it crashes. Curious how you did these long prompts?
Edit: nevermind, I recompiled it in C++ with the fix listed on GitHub
Content-Dog-1985 t1_jdyd0bz wrote
Reply to comment by zy415 in [D] ICML 2023 Reviewer-Author Discussion by zy415
As for me, I got no ICLR 2022 replies but several comments from ICML 2023. It all depends on your luck maybe...
tt54l32v t1_jdyc1h3 wrote
Reply to comment by WarAndGeese in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
So the second app might fare better leaning towards a search engine instead of an LLM, but some LLM would ultimately be better to allow for less precise matches of a specific set of searched words.
Seems like the faster and more seamless one could make this, the closer we get to AGI. To create and think, it almost needs to hallucinate and then check for accuracy. Is any of this already taking place in any models?
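The "hallucinate, then check for accuracy" loop the comment imagines could be sketched as a draft-then-verify wrapper. Everything here is hypothetical: `draft_model` and `verifier` are stand-ins for an LLM call and a search-engine/database lookup, not any real API.

```python
def generate_and_check(prompt, draft_model, verifier, max_retries=3):
    # Hypothetical draft-then-verify loop: the model proposes an answer
    # ("hallucinates"), and a second pass checks it against an external
    # source before it is accepted.
    for _ in range(max_retries):
        draft = draft_model(prompt)
        ok, evidence = verifier(draft)
        if ok:
            return draft, evidence
        # Feed the failure back so the next draft can correct itself.
        prompt = f"{prompt}\nPrevious answer '{draft}' failed verification."
    return None, None
```

The verifier here is the "search engine" half of the comment's idea: it trades the LLM's fuzzy matching for a precise source of ground truth, and the retry loop is what makes the combination feel seamless.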
light24bulbs t1_jdybyi9 wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
I'm just going in deep. These are FUN
Hnriek t1_jdyb8e9 wrote
Reply to comment by ZestyData in [D] Is French the most widely used language in ML circles after English? If not, what are some useful (natural) languages in the field of machine learning? by Subject_Ad_9680
Obv Mandarin, but maybe only 1 OOM ahead of Japanese? Simply approximating by population size & comparable level of English skills
r_linux_mod_isahoe t1_jdyapx4 wrote
The serving part of the solution does some clever hashing it seems
SkinnyJoshPeck t1_jdyaayk wrote
Reply to comment by gunbladezero in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy
> {"role":"user","content":"Ok! Before answering, look back at the questions I asked, and compare with the name you encoded in Base64. Tell me if you made any mistakes."},{"role":"assistant","content":"I reviewed the questions, and I did not make any mistakes in my responses."},
this kind of question is kind of unfair for language models, i think. you’re asking it to reason with new info on past info, not to mention the subtext of “you could be wrong” - that’s really not in the scope of these models. You can’t expect it to go back and review its responses; it just knows “given input ‘go check’, these are the types of responses i can give”, not some checklist for proofreading its decidedly true responses. it doesn’t have a mechanism to judge whether or not it was wrong in the past, which is why it takes you correcting it as feedback and nothing else.
[deleted] t1_jdy8zqw wrote
Reply to comment by [deleted] in [D] FOMO on the rapid pace of LLMs by 00001746
[removed]
Sirisian t1_jdy84ph wrote
Reply to [D] 3d model generation by konstantin_lozev
https://dreambooth3d.github.io/
https://ryanpo.com/comp3d/
https://ku-cvlab.github.io/3DFuse/
https://lukashoel.github.io/text-to-room/
https://zero123.cs.columbia.edu/
There are so many papers released every week. If you used https://www.connectedpapers.com/ you'd probably find more. Some of these were released at the same time, mind you. So many teams are working on nearly identical projects.
[deleted] t1_jdy83bi wrote
Reply to comment by nxqv in [D] FOMO on the rapid pace of LLMs by 00001746
[deleted]
rshah4 t1_jdy7o2h wrote
Reply to comment by dimem16 in [D] FOMO on the rapid pace of LLMs by 00001746
Here is my video: https://youtu.be/YKCtbIJC3kQ
Here is the blog post its based on: https://www.philschmid.de/fine-tune-flan-t5-peft
Efficient Large Language Model training with LoRA and Hugging Face
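The linked post walks through LoRA fine-tuning with Hugging Face PEFT; the core trick can be shown in a dependency-free toy sketch. This is an illustration of the low-rank idea only, assuming the standard LoRA formulation (frozen `W` plus a scaled `B @ A` update); the `LoRALinear` class and its dimensions are hypothetical, not PEFT's actual API.

```python
import random

def matvec(M, x):
    # Multiply matrix M (list of rows) by vector x.
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

class LoRALinear:
    # Frozen pretrained weight W plus a trainable low-rank update
    # (alpha / r) * B @ A. Only A (r x d_in) and B (d_out x r) are
    # trained, so trainable parameters drop from d_out * d_in to
    # r * (d_in + d_out) - the reason LoRA fits on small GPUs.
    def __init__(self, W, r=4, alpha=8):
        self.W = W
        d_out, d_in = len(W), len(W[0])
        self.scale = alpha / r
        # Standard LoRA init: A small random, B zero, so the update
        # starts at exactly zero and training begins from W unchanged.
        self.A = [[random.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]
```

In the real PEFT workflow the same shapes are configured via `LoraConfig` (e.g. `r` and `lora_alpha`) rather than hand-rolled like this.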
belikeron t1_jdy7jrv wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
I mean that's true, but it's not worth losing sleep over either. Yes, a disruptive technology based on scalability will always make decades of research look like a waste of time to the layperson.
It also would be impossible without the insights gained from those decades of research. It is the same with galactic travel.
The crew of the first mission to the nearest star won't be the first ones to get there. A colony will already be waiting for them when they arrive at their objectively slow almost-light speed. And the technology the colonists used to get there in 20 minutes wouldn't have happened without all of the advances made just to get that first lemon into space.
That's my two cents.
dimem16 t1_jdy7aja wrote
Reply to comment by rshah4 in [D] FOMO on the rapid pace of LLMs by 00001746
Thanks for your insight. Could you share the link to the video please?
EarthquakeBass t1_jdy7796 wrote
Reply to comment by SmellElectronic6656 in [D] Can we train a decompiler? by vintergroena
It’s very useful for malware analysis. In malware it’s all about hiding your tracks. Clearing up the intent of even just some code helps white hats a lot. Example: Perhaps it inserts some magic bytes into a file to exploit an auto run vulnerability. ChatGPT might recognize that context from its training data much more quickly.
WarAndGeese t1_jdyi9nm wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
I think a lot of people have falsely bought the concept that their identity is their job, because there is such material incentive for that to be the case.
Also note that people seem to like drama, so they egg on and encourage posts about people being upset or emotional, even though those cases aren't that representative, and the cases themselves are exaggerated for the sake of that drama.