Recent comments in /f/MachineLearning
Necessary-Meringue-1 t1_je84su5 wrote
Reply to comment by slaweks in [D] FOMO on the rapid pace of LLMs by 00001746
Of course it has, but those are hard fought gains that are primarily results of WWI, WWII, and the early phases of the Cold War, not productivity gains.
There is no natural law that productivity gains get handed down. Just compare the years 1950-1970 in the US, when life for the average worker improved greatly, with the 1980s onward, since when things have trended downward. There have been steady productivity gains throughout that entire period.
bluenigma t1_je84g9s wrote
Reply to comment by xander76 in [P] Imaginary programming: implementation-free TypeScript functions for GPT-powered web development by xander76
I guess I'm still wondering: if you can generate a runtime type check for an arbitrary TypeScript type at compile time, why is this not a built-in TypeScript language feature?
Edit: Took a quick look at the code, and it looks to me like there are definitely limitations on what return types are supported. Looks like it can handle basic aliases and record types, but throws on a lot of other stuff?
Should probably be documented somewhere.
Tostino t1_je847jg wrote
Reply to comment by kromem in [D] The best way to train an LLM on company data by jaxolingo
Literally just worked through this today manually as a proof of concept, using the LLM to augment the DB schema with comments describing any relevant info or corner cases. I'm essentially just manually feeding it as context to my prompts when I need to know something related to that set of tables, but it seems pretty powerful. Automating this is going to be nuts.
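A minimal sketch of the manual workflow described above; the table names, notes, and helper are all hypothetical stand-ins (in practice the notes would be pulled from the database catalog or generated by the LLM):

```python
# Toy sketch: schema comments are stitched into the prompt as context
# before asking a question about those tables.

SCHEMA_NOTES = {
    "orders": "One row per order. status='X' means cancelled, not deleted.",
    "order_items": "Child of orders; qty can be negative for returns.",
}

def build_prompt(question: str, tables: list[str]) -> str:
    """Prepend the relevant table notes to a natural-language question."""
    context = "\n".join(
        f"Table {t}: {SCHEMA_NOTES[t]}" for t in tables if t in SCHEMA_NOTES
    )
    return f"{context}\n\nQuestion: {question}"

prompt = build_prompt("How many net units shipped last month?",
                      ["orders", "order_items"])
print(prompt)
```

Automating the loop would just mean generating `SCHEMA_NOTES` with the LLM itself and selecting the relevant tables per question.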
currentscurrents OP t1_je83z1p wrote
Reply to comment by midasp in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
I think these are all ideas from the internet, but it did understand that they would be appropriate for the task of making jeans useful on mars.
It seems to have understood the instructions and then pulled relevant information out of its associative memory to build the response.
machineko t1_je83m8x wrote
Reply to comment by LetGoAndBeReal in [D] The best way to train an LLM on company data by jaxolingo
Unsupervised fine-tuning (or extending the pre-training) with additional data will work. Of course, getting it to learn new information effectively is a challenge, but not an impossible one.
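As a toy illustration of the data-prep side of continued pre-training: raw company documents get concatenated and sliced into fixed-size training windows. A whitespace split stands in for the model's real tokenizer here, and the function name is invented for this sketch:

```python
# Toy sketch: turn raw documents into fixed-size token windows for
# unsupervised (continued pre-training) fine-tuning. Real pipelines
# would use the model's tokenizer; whitespace splitting stands in.

def make_training_windows(docs: list[str], window: int = 8) -> list[list[str]]:
    tokens = " ".join(docs).split()
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]

windows = make_training_windows(
    ["Acme widgets ship in crates of twelve.",
     "Returns require an RMA number from support."],
    window=6,
)
print(len(windows), windows[0])
```

Each window would then be fed to the model with a standard causal language-modeling loss.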
sdmat t1_je83jw4 wrote
Reply to comment by midasp in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
Objectively prove? Nothing. But subjectively there is a stark difference in the quality of suggestions and apparent depth of understanding from earlier LLMs. E.g. 3.5 suggested using jeans for radiation shielding "because denim is a thick material".
I did try a web search and directly asking the model for references. Unsurprisingly jeans for Mars colonization doesn't seem to be an existing concept, so it's almost certainly not in the training set.
athos45678 t1_je82thk wrote
Reply to comment by netham91 in [D] The best way to train an LLM on company data by jaxolingo
So as far as setup goes, you just need to:

"""
git clone https://github.com/lxe/simple-llama-finetuner
cd simple-llama-finetuner
pip install -r requirements.txt
python app.py  # if you're on a remote machine (Paperspace is my go-to) you may need to edit the last line of this script to set 'share=True' in the launch args
"""
Then you should get a link for the gradio web app. Copy and paste the code samples in the format described before in the input text box. It will look something like this:
"""
Write a code snippet that sorts a list
def sort(arr):
    return sorted(arr)

Some other code snippet input
Some answer

etc.
"""
Edit: I'm drinking with friends, sorry I can't format better. Single line break between prompt and observed correct response, double line break between prompt instances.
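For reference, the pairing rules above (single line break between prompt and response, blank line between instances) can be sketched with a small hypothetical helper:

```python
# Toy sketch: format (prompt, response) pairs into the plain-text layout
# described above - one newline inside an example, a blank line between
# examples.

def format_examples(pairs: list[tuple[str, str]]) -> str:
    """(prompt, response) pairs -> newline-delimited fine-tuning text."""
    return "\n\n".join(f"{prompt}\n{response}" for prompt, response in pairs)

data = format_examples([
    ("Write a function that doubles x", "def double(x):\n    return 2 * x"),
    ("Write a function that negates x", "def neg(x):\n    return -x"),
])
print(data)
```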
slaweks t1_je82s9s wrote
Reply to comment by kaisear in [D] FOMO on the rapid pace of LLMs by 00001746
"CEOs will be replaced before machine learning engineers" - that's very naive :-)
antonivs t1_je82r3j wrote
Reply to comment by Craksy in [D] FOMO on the rapid pace of LLMs by 00001746
Well, I do need to be a bit vague. The main DSL has about 50 instructions corresponding to actions to be performed. There's also a separate sub-DSL, with about 25 instructions, that represents key features of the domain model and allows particular scenarios to be defined and then recognized during execution.
Both DSLs are almost entirely linear and declarative, so there's no nested structure, and the only control flow is a conditional branch instruction in the top-level DSL, to support conditional execution and looping. The UI essentially acts as a wizard, so that users don't have to deal with low-level detail.
There are various ideas for the GPT model, including suggesting instructions when creating a program, self-healing when something breaks, and finally generating programs from scratch based on data that we happen to already collect anyway.
NLP will probably end up being part of it as well - for that, we'd probably use the fine-tuning approach with an existing language model as you suggested.
anandm21096 t1_je8255p wrote
hey there, can you send me the invite?
slaweks t1_je8220m wrote
Reply to comment by Necessary-Meringue-1 in [D] FOMO on the rapid pace of LLMs by 00001746
Really? Average worker life has not improved over the last 200 years?
OSeady t1_je81gws wrote
Reply to [D] Training a 65b LLaMA model by Business-Lead2679
Contact Redmond.ai they can hook you up.
xander76 OP t1_je81147 wrote
Reply to comment by bluenigma in [P] Imaginary programming: implementation-free TypeScript functions for GPT-powered web development by xander76
So usually, yes, that’s true about TypeScript. Types are removed by the compiler.
But we wrote a TypeScript & Babel compiler plug-in, which allows us to replace the imaginary function with whatever code we want. So we replace the imaginary function with code that includes a run-time type check for the appropriate return type from the TypeScript definition. Does that make sense?
midasp t1_je80uot wrote
Reply to comment by sdmat in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
And exactly what does that prove?
WokeAssBaller t1_je80fbg wrote
Reply to comment by lambertb in [D] GPT4 and coding problems by enryu42
They will absolutely reshape the world in the next 5 years; all I'm saying is that in its current state I haven't found it helpful. I'm sure that within the next couple of years it'll be the main thing I use.
lambertb t1_je802eu wrote
Reply to comment by WokeAssBaller in [D] GPT4 and coding problems by enryu42
I agree the survey study is nothing close to definitive, and it does smack of marketing. Still, my own experience suggests that these tools will be transformative. At the same time, I've gotten lost down an AI rabbit hole where it would have been more efficient to just do it myself. On balance, though, my assessment is that these are already very helpful tools, and they'll only get better.
brandonZappy t1_je7zwfg wrote
Reply to [D] Training a 65b LLaMA model by Business-Lead2679
What QA dataset are you using?
bluenigma t1_je7z1k1 wrote
Reply to comment by xander76 in [P] Imaginary programming: implementation-free TypeScript functions for GPT-powered web development by xander76
Can you actually check the TypeScript-defined return type in that way nowadays? I thought that information was unavailable at runtime.
How does this handle type imports or advanced types?
bubudumbdumb t1_je7yrmq wrote
In the last few days someone posted on Hacker News about a system allowing the integration of a GPT with a Postgres database.
WokeAssBaller t1_je7yeux wrote
Reply to comment by LetGoAndBeReal in [D] The best way to train an LLM on company data by jaxolingo
This is a fine approach but fine tuning can and does add knowledge to models, please quit saying that
WokeAssBaller t1_je7y7ij wrote
Reply to comment by light24bulbs in [D] The best way to train an LLM on company data by jaxolingo
Lol this guy doesn’t understand ML, you are absolutely adding knowledge to the model
WokeAssBaller t1_je7y09s wrote
Reply to comment by LetGoAndBeReal in [D] The best way to train an LLM on company data by jaxolingo
Huh? I think that depends on the fine tuning you are talking about. Fine tuning can absolutely add knowledge to a model
EquipmentStandard892 t1_je7xyd9 wrote
Reply to [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
I read your paper and was reasoning about something interesting: I wonder if it's possible to use this method to fine-tune the model to query a vector database without harming its context length limits. It may sound stupid, but humans don't just say things. I'm not talking about CoT specifically, but I was curious whether, as our brains do, you could use another instance of the same LLM to generate little hypotheses about the ongoing conversation, store those in a vector-space database, and then use those generated theses during reasoning. We humans also have limited cognitive memory, and how do we overcome that? Great paper btw.
ustainbolt t1_je7xtcw wrote
Reply to comment by wrossmorrow in [D] Training a 65b LLaMA model by Business-Lead2679
I love lambda. More reliable than vast.ai, and WAY cheaper than AWS/GCP/Azure.
ghostfaceschiller t1_je84v6j wrote
Reply to [Discussion] IsItBS: asking GPT to reflect x times will create a feedback loop that causes it to scrutinize itself x times? by RedditPolluter
You are talking about two different things here.
Reflexion/ReAct uses another system (like LangChain) to allow the bot to genuinely loop back over previous results to try and improve them. This really does get you better results in the end.
You can also simply tell the bot, in your prompt, something like "before you respond, review your first draft for errors, and only output your second draft". Now, this is not what the bot will actually do, but regardless, it will often result in higher-quality output, presumably because in the training data that kind of phrase is typically associated with a certain type of answer (i.e. better answers).
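A minimal sketch of the first kind of setup, the genuine outer loop; `llm` here is a stub standing in for a real model call:

```python
# Minimal sketch of an actual reflection loop: the model's draft is fed
# back to it for critique and revision. The stub just appends a marker
# so the loop's behavior is visible.

def llm(prompt: str) -> str:
    # Stub: pretend each pass refines the last line of the prompt.
    return prompt.split("\n")[-1] + " [revised]"

def reflect(task: str, rounds: int = 2) -> str:
    draft = llm(task)
    for _ in range(rounds):
        draft = llm(f"Critique and improve this draft:\n{draft}")
    return draft

print(reflect("Summarize the quarterly report"))
```

The prompt-only trick in the second approach skips this loop entirely; the model emits a single completion either way.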