Recent comments in /f/MachineLearning

xander76 OP t1_je8d46w wrote

Yep, I should have been clearer that (at least for now) it only supports JSON-compatible types and doesn’t support type aliases or classes (although we have a prototype version that does). The limitations are documented at https://imaginary.dev/docs/writing-an-imaginary-function

1

LetGoAndBeReal t1_je8akb1 wrote

I would agree with that last statement. You think you understand this, but you don’t seem to understand what does and doesn’t happen during fine-tuning, or to realize that adding knowledge to LLMs is a notoriously difficult open problem that ongoing research is trying to solve.

Try looking at some of the research: https://openreview.net/forum?id=vfsRB5MImo9

Or read what OpenAI says fine-tuning accomplishes: https://platform.openai.com/docs/guides/fine-tuning

Or, better yet, try actually getting an LLM to learn new facts by fine-tuning it. Then you will understand.
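
A minimal sketch of that experiment with Hugging Face Transformers (the model, the made-up fact, and the hyperparameters here are arbitrary choices for illustration, not from any of the links above): overfit a small causal LM on one new statement, then ask for the fact in a paraphrased form.

```python
# Fine-tune a small causal LM on repetitions of a made-up fact, then probe it
# with a paraphrased question. Model, fact, and hyperparameters are illustrative.
import torch
from torch.utils.data import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

fact = "The Zephyrium Bridge was completed in 2021 and spans the Loire River."

class FactDataset(Dataset):
    def __init__(self, text, copies=32):
        self.enc = tokenizer([text] * copies, padding="max_length",
                             truncation=True, max_length=32, return_tensors="pt")
    def __len__(self):
        return self.enc["input_ids"].size(0)
    def __getitem__(self, i):
        ids = self.enc["input_ids"][i]
        labels = ids.clone()
        labels[self.enc["attention_mask"][i] == 0] = -100  # ignore padding in the loss
        return {"input_ids": ids,
                "attention_mask": self.enc["attention_mask"][i],
                "labels": labels}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="fact_ft", num_train_epochs=3,
                           per_device_train_batch_size=8, report_to="none"),
    train_dataset=FactDataset(fact),
)
trainer.train()

# Ask for the fact in a form that differs from the training sentence; reliable
# recall here is exactly what tends not to happen with plain fine-tuning.
prompt = "Q: In what year was the Zephyrium Bridge completed? A:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=16, do_sample=False,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```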

4

machineko t1_je88wj9 wrote

I'm working on xTuring, an open-source library focused on resource-efficient fine-tuning methods: https://github.com/stochasticai/xturing

Here's how you would perform int8 LoRA fine-tuning in three lines:

Python script: https://github.com/stochasticai/xturing/blob/main/examples/llama/llama_lora_int8.py
Colab notebook: https://colab.research.google.com/drive/1SQUXq1AMZPSLD4mk3A3swUIc6Y2dclme?usp=sharing
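
For reference, the three lines in that script look roughly like this (a sketch; the dataset path is a placeholder for an Alpaca-style instruction dataset prepared as in the repo's examples):

```python
# Sketch of the int8 LoRA example; the dataset path is a placeholder and assumes
# an Alpaca-style instruction dataset prepared as in the repo's examples.
from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel

dataset = InstructionDataset("./alpaca_data")   # instruction-tuning data
model = BaseModel.create("llama_lora_int8")     # LLaMA 7B loaded in int8 with LoRA adapters
model.finetune(dataset=dataset)                 # run fine-tuning; only the LoRA weights are updated
```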

Of course the Colab still only works with smaller models. In the example above, fine-tuning the 7B model required 9 GB of VRAM.

12