Recent comments in /f/MachineLearning

ghostfaceschiller t1_je84v6j wrote

You are talking about two different things here.

  1. Reflexion/ReAct uses another system (like LangChain) to let the bot genuinely loop back over previous results and try to improve them. This does end up getting you better results in the end.

  2. You can also simply tell the bot, in your prompt, something like "before you respond, review your first draft for errors, and only output your second draft". This is not what the bot will actually do, but it will often result in higher-quality output regardless, presumably because in the training data that kind of phrase is typically associated with a certain type of answer (i.e. better answers). See the sketch below.
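
To make the distinction concrete, here's a minimal sketch of approach 2 in TypeScript, assuming the OpenAI Node SDK; the model name and the exact prompt wording are placeholders:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// No real loop happens here: the instruction to self-review is baked into a
// single prompt, and the model only ever produces one completion.
async function draftWithSelfReview(question: string): Promise<string> {
  const completion = await client.chat.completions.create({
    model: "gpt-4", // placeholder model name
    messages: [
      {
        role: "user",
        content:
          question +
          "\n\nBefore you respond, review your first draft for errors, " +
          "and only output your second draft.",
      },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```

The key point is that this is still a single forward pass: any "review" happens only insofar as the phrasing steers the model toward the better answers associated with it in training, unlike the genuine multi-step loop of approach 1.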

5

Necessary-Meringue-1 t1_je84su5 wrote

Of course it has, but those are hard-fought gains that are primarily the result of WWI, WWII, and the early phases of the Cold War, not productivity gains.

There is no natural law that says productivity gains get handed down. Just compare the years 1950-1970 in the US, when life for the average worker improved greatly, to the 1980s onward, when the trend turned downward. There have been steady productivity gains throughout.

2

bluenigma t1_je84g9s wrote

I guess I'm still wondering: if you can generate a runtime type check for an arbitrary TypeScript type at compile time, why isn't this a built-in TypeScript language feature?

Edit: Took a quick look at the code, and it looks to me like there are definitely limitations on what return types are supported. Looks like it can handle basic aliases and record types, but throws on a lot of other stuff?

Should probably be documented somewhere.
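
For what it's worth, here's a hypothetical illustration (not this project's actual generated code) of what an emitted runtime check for a simple record type can look like, and why harder types are a different story:

```typescript
// Hypothetical illustration -- not this project's actual generated code.
// A runtime check for the record type { name: string; score: number }:
interface Result {
  name: string;
  score: number;
}

function isResult(value: unknown): value is Result {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.name === "string" && typeof v.score === "number";
}

console.log(isResult({ name: "a", score: 1 })); // true
console.log(isResult({ name: "a" }));           // false
```

Basic aliases and record types are mechanical to emit like this; unions, generics, and recursive types need much more machinery, which is plausibly where the unsupported-type errors come from.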

3

Tostino t1_je847jg wrote

Literally just worked through this today manually as a proof of concept, using the LLM to augment the DB schema with comments describing any relevant info or corner cases. I'm essentially just manually feeding it as context to my prompts when I need to know something related to that set of tables, but it seems pretty powerful. Automating this is going to be nuts.
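
For anyone curious, a rough sketch of the feeding-as-context half of that workflow in TypeScript; the table names and comments are invented for illustration:

```typescript
// Schema objects annotated with LLM-written notes about corner cases,
// concatenated into the prompt as context. All names here are made up.
const schemaNotes: Record<string, string> = {
  orders:
    "One row per checkout. status moves pending -> paid -> shipped; " +
    "rows are never deleted, only marked cancelled.",
  order_items:
    "Child of orders. quantity can be negative for returns (corner case).",
};

function buildPrompt(question: string, tables: string[]): string {
  const context = tables
    .map((t) => `Table ${t}: ${schemaNotes[t] ?? "(no notes)"}`)
    .join("\n");
  return `Schema notes:\n${context}\n\nQuestion: ${question}`;
}

console.log(
  buildPrompt("Why do some orders total to zero?", ["orders", "order_items"]),
);
```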

5

sdmat t1_je83jw4 wrote

Objectively prove? Nothing. But subjectively there is a stark difference in the quality of suggestions and apparent depth of understanding from earlier LLMs. E.g. 3.5 suggested using jeans for radiation shielding "because denim is a thick material".

I did try a web search and directly asking the model for references. Unsurprisingly, "jeans for Mars colonization" doesn't seem to be an existing concept, so it's almost certainly not in the training set.

13

athos45678 t1_je82thk wrote

So as far as setup goes, you just need to:

```
git clone https://github.com/lxe/simple-llama-finetuner
cd simple-llama-finetuner
pip install -r requirements.txt
python app.py  # if you're on a remote machine (Paperspace is my go-to), you may need to edit the last line of this script to set share=True in the launch args
```

Then you should get a link to the Gradio web app. Copy and paste the code samples, in the format described before, into the input text box. It will look something like this:

```
Write a code snippet that sorts an array
def sort(arr):
    return sorted(arr)

Some other code snippet input
Some answer

Etc.
```

Edit: I’m drinking with friends, sorry I can’t format better. Single line break between prompt and observed correct response, double line break between prompt instances.

3

antonivs t1_je82r3j wrote

Well, I do need to be a bit vague. The main DSL has about 50 instructions corresponding to actions to be performed. There's also a second sub-DSL, with about 25 instructions, that represents key features of the domain model and allows particular scenarios to be defined and then recognized during execution.

Both DSLs are almost entirely linear and declarative, so there's no nested structure, and the only control flow is a conditional branch instruction in the top-level DSL, to support conditional execution and looping. The UI essentially acts as a wizard, so that users don't have to deal with low-level detail.
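
Since the real DSLs aren't public, the following is purely a hypothetical sketch (in TypeScript) of the shape being described: a flat instruction list whose sole control flow is a conditional branch, which is enough to express both conditionals and loops:

```typescript
// Purely hypothetical -- every name here is invented for illustration.
type Instruction =
  | { op: "perform"; action: string }             // stands in for the ~50 action instructions
  | { op: "branchIf"; cond: string; to: number }; // jump to an index when cond holds

function run(program: Instruction[], holds: (cond: string) => boolean): void {
  let pc = 0; // program counter over a flat list; no nesting needed
  while (pc < program.length) {
    const ins = program[pc];
    if (ins.op === "perform") {
      console.log(`action: ${ins.action}`);
      pc += 1;
    } else {
      pc = holds(ins.cond) ? ins.to : pc + 1;
    }
  }
}

// A loop expressed with a backward branch:
run(
  [
    { op: "perform", action: "poll-sensor" },
    { op: "branchIf", cond: "more-data", to: 0 },
    { op: "perform", action: "send-report" },
  ],
  (() => {
    let polls = 0;
    return (cond: string) => cond === "more-data" && ++polls < 3;
  })(),
);
```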

There are various ideas for the GPT model, including suggesting instructions when creating a program, self-healing when something breaks, and finally generating programs from scratch based on data that we happen to already collect anyway.

NLP will probably end up being part of it as well - for that, we'd probably use the fine-tuning approach with an existing language model as you suggested.

2

xander76 OP t1_je81147 wrote

So usually, yes, that’s true about TypeScript. Types are removed by the compiler.

But we wrote a TypeScript & Babel compiler plugin, which lets us replace the imaginary function with whatever code we want. So we replace the imaginary function with code that includes a runtime type check against the return type declared in the TypeScript definition. Does that make sense?
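
To illustrate the idea, a hypothetical before/after (not the plugin's actual output); `getFruitNames` and `callModel` are assumed names:

```typescript
// Before: the "imaginary" function is only a declaration with a return type.
// declare function getFruitNames(prompt: string): Promise<string[]>;

// After: the plugin swaps the declaration for real code that calls the model
// and validates the parsed response against the declared return type.
async function getFruitNames(prompt: string): Promise<string[]> {
  const raw = await callModel(prompt); // assumed helper returning the model's text
  const parsed: unknown = JSON.parse(raw);
  if (
    !Array.isArray(parsed) ||
    !parsed.every((x): x is string => typeof x === "string")
  ) {
    throw new TypeError("model response did not match declared type string[]");
  }
  return parsed;
}

// Stub so the sketch is self-contained; a real version would hit an LLM API.
async function callModel(_prompt: string): Promise<string> {
  return JSON.stringify(["apple", "banana"]);
}
```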

1

lambertb t1_je802eu wrote

I agree the survey study is nothing close to definitive, and it does smack of marketing. Still, my own experience suggests that these tools will be transformative. At the same time, I've gotten lost down an AI rabbit hole where it would have been more efficient to just do the task myself. On balance, though, my assessment is that these are already very helpful tools, and they'll only get better.

1

EquipmentStandard892 t1_je7xyd9 wrote

I read your paper and it got me thinking about something interesting: I wonder if it's possible to use this method to fine-tune the model to query a vector database without running into its context-length limitations. It may sound stupid, but humans don't just say things. I'm not talking about CoT specifically; I was curious whether, as our brains do, we could use another instance of the same LLM to generate little hypotheses about the ongoing conversation, store those in a vector database, and then use them during reasoning. We humans also have limited cognitive memory, and how do we overcome that? Great paper btw.
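
To make the idea concrete, a toy sketch in TypeScript of such a "hypothesis store": a side process writes short notes, each gets embedded, and the nearest ones are pulled back in as context at reasoning time. The `embed` function is a crude stand-in for a real embedding model, and none of this is from the paper:

```typescript
type Entry = { text: string; vec: number[] };

const store: Entry[] = [];

function embed(text: string): number[] {
  // Stand-in embedding: character-frequency vector. A real system would
  // call an embedding model here.
  const vec: number[] = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) vec[i] += 1;
  }
  return vec;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Store a hypothesis written by the side LLM.
function remember(hypothesis: string): void {
  store.push({ text: hypothesis, vec: embed(hypothesis) });
}

// Retrieve the k most similar hypotheses to fold back into the prompt.
function recall(query: string, k = 3): string[] {
  const q = embed(query);
  return [...store]
    .sort((x, y) => cosine(y.vec, q) - cosine(x.vec, q))
    .slice(0, k)
    .map((e) => e.text);
}

remember("User seems to be debugging a context-length problem.");
remember("User prefers concrete code examples.");
console.log(recall("context length"));
```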

30