Recent comments in /f/MachineLearning
_sbmaruf t1_je8iuvl wrote
Reply to comment by WarmSignificance1 in [N] OpenAI may have benchmarked GPT-4’s coding ability on its own training data by Balance-
We just released the dataset last week. We are in the process of training some autoregressive models.
GFrings t1_je8i7ro wrote
Reply to comment by Technical-Vast1314 in [R] You Only Segment Once: Towards Real-Time Panoptic Segmentation [CVPR 2023] by Technical-Vast1314
Isn't semantic segmentation made redundant by instance segmentation? Or is there a difference in coverage between the two tasks in terms of the ground-truth labels?
lgastako t1_je8i6dw wrote
Reply to comment by WokeAssBaller in [D] The best way to train an LLM on company data by jaxolingo
Not generally very well.
elbiot t1_je8i0i2 wrote
Reply to comment by LetGoAndBeReal in [D] The best way to train an LLM on company data by jaxolingo
The second link says fine-tuning is a substitute for lengthy prompts, including cases where you want to include more than can fit in the longest prompt. Prompts are a way to give the model new information. What definition of knowledge are you using that excludes anything you can put into a prompt?
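To make the "new information via the prompt" point concrete, here's a minimal sketch; the fact, question, and wording are hypothetical placeholders, not anything from this thread:

```python
# Minimal sketch: "teach" the model a fact it was never trained on by
# putting that fact directly into the prompt (fact and question are
# made-up placeholders).
new_fact = (
    "Q3 revenue for the Acme Anvils product line was $4.2M, "
    "up 12% quarter over quarter."
)
question = "How did Acme Anvils perform in Q3?"

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{new_fact}\n\n"
    f"Question: {question}\nAnswer:"
)

# Send `prompt` to whatever completion endpoint you use; the point is
# that the "knowledge" arrives at inference time through the prompt,
# not through fine-tuning.
print(prompt)
```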
ghostfaceschiller t1_je8habj wrote
Reply to comment by EquipmentStandard892 in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
Could you elaborate on what you mean here? I'm not sure I'm following.
VelvetyPenus t1_je8gktg wrote
First person to use AI to embezzle majority of company profits arrested. Convict used Reddit to ask how to turn company data into AI dataset.
m98789 t1_je8gdwb wrote
Reply to comment by MadScientist-1214 in [D] Improvements/alternatives to U-net for medical images segmentation? by viertys
CVPR is not a journal; it's a conference.
huyouare t1_je8fby1 wrote
Reply to comment by Appropriate_Ant_4629 in [D] The best way to train an LLM on company data by jaxolingo
I was wondering how this relates to retrieval or SQL queries, but it sounds like you’re suggesting that OP fine-tunes on their dataset regularly. Might be good to try in combination with retrieval, but how would you represent the tabular data as training examples?
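One way to do the tabular-to-text conversion, as a rough sketch; the table, column names, and prompt/completion format here are assumptions, not anything OP described:

```python
import json
import pandas as pd

# Hypothetical table; in practice this would come from OP's database or a SQL query.
df = pd.DataFrame({
    "customer": ["Acme", "Globex"],
    "region": ["EMEA", "APAC"],
    "q3_revenue_usd": [4_200_000, 1_750_000],
})

# Serialize each row into a prompt/completion pair (JSONL), one common
# format for fine-tuning. A retrieval setup would instead index these
# same strings rather than train on them.
with open("train.jsonl", "w") as f:
    for _, row in df.iterrows():
        example = {
            "prompt": f"What was {row.customer}'s Q3 revenue in {row.region}?",
            "completion": f" ${row.q3_revenue_usd:,}",
        }
        f.write(json.dumps(example) + "\n")
```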
r_linux_mod_isahoe t1_je8f4nc wrote
Reply to [P] Imaginary programming: implementation-free TypeScript functions for GPT-powered web development by xander76
RIP this sub. Was nice knowing ya all
netham91 t1_je8dxrt wrote
Reply to comment by athos45678 in [D] The best way to train an LLM on company data by jaxolingo
Thanks
light24bulbs t1_je8d6bh wrote
Reply to comment by LetGoAndBeReal in [D] The best way to train an LLM on company data by jaxolingo
Continuous retraining is something else.
I'll be training LLaMA soon; I'll get back to you with how it goes.
xander76 OP t1_je8d46w wrote
Reply to comment by bluenigma in [P] Imaginary programming: implementation-free TypeScript functions for GPT-powered web development by xander76
Yep, I should have been clearer that (at least for now) it only supports JSON-compatible types and doesn’t support type aliases or classes (although we have a prototype version that does). The limitations are documented at https://imaginary.dev/docs/writing-an-imaginary-function .
hitechnical t1_je8d0ev wrote
Reply to comment by [deleted] in [D] Simple Questions Thread by AutoModerator
I heard Stanford’s LLM can run on smaller devices. Please Google it.
HangOutWithMyYangOut t1_je8b3d6 wrote
Reply to comment by xander76 in [P] Imaginary programming: implementation-free TypeScript functions for GPT-powered web development by xander76
Awesome, thanks so much. I'll be diving into it tomorrow.
LetGoAndBeReal t1_je8akb1 wrote
Reply to comment by light24bulbs in [D] The best way to train an LLM on company data by jaxolingo
I would agree with that last statement. You think you understand this, but you don’t seem to understand what does and doesn’t happen during fine-tuning, or to realize that adding knowledge to LLMs is a notoriously difficult problem that ongoing research is still trying to solve.
Try looking at some of the research: https://openreview.net/forum?id=vfsRB5MImo9
Or read what OpenAI says fine-tuning accomplishes: https://platform.openai.com/docs/guides/fine-tuning
Or, better yet, try actually getting an LLM to learn new facts by fine-tuning it. Then you will understand.
[deleted] t1_je8ajeo wrote
Reply to comment by Hands0L0 in [D] The best way to train an LLM on company data by jaxolingo
[removed]
Appropriate_Ant_4629 t1_je89aho wrote
Databricks announced this week that they're trying to make it easy:
Haven't tried it yet, but we will soon.
machineko t1_je88wj9 wrote
Reply to [D] Training a 65b LLaMA model by Business-Lead2679
I'm working on xTuring, an open-source library focused on resource-efficient fine-tuning methods: https://github.com/stochasticai/xturing
Here's how you would perform int8 LoRA fine-tuning in three lines:
python: https://github.com/stochasticai/xturing/blob/main/examples/llama/llama_lora_int8.py
colab notebook: https://colab.research.google.com/drive/1SQUXq1AMZPSLD4mk3A3swUIc6Y2dclme?usp=sharing
Of course, the Colab still only works with smaller models. In the example above, the 7B model required 9 GB of VRAM.
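For reference, the linked example boils down to roughly the following; this is a sketch of the usage pattern, so treat the exact class names and the dataset path as assumptions and defer to the linked files if they differ:

```python
from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import BaseModel

# Load an instruction dataset (the path is a placeholder) and fine-tune
# LLaMA with LoRA adapters in int8 precision.
dataset = InstructionDataset("./alpaca_data")
model = BaseModel.create("llama_lora_int8")
model.finetune(dataset=dataset)
```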
machineko t1_je888dc wrote
Why not use open-source models? Since it seems you aren't trying to sell the model commercially, you can easily replace it with an open-source model. Also, for retrieval-augmented generation, smaller models can be very effective.
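For anyone unfamiliar with the pattern, here's a bare-bones sketch of retrieval-augmented generation with a small open embedding model; the documents and question are placeholders, and the final generation step is left to whichever (smaller, open) model you pick:

```python
from sentence_transformers import SentenceTransformer, util

# Small open-source embedding model; the documents are placeholders for
# whatever internal data you want the LLM to answer questions about.
docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am-5pm CET.",
]
question = "How long do customers have to return an item?"

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = embedder.encode(docs, convert_to_tensor=True)
q_emb = embedder.encode(question, convert_to_tensor=True)

# Pick the most relevant document and stuff it into the prompt for the
# generation model of your choice.
best = int(util.cos_sim(q_emb, doc_emb).argmax())
prompt = f"Context: {docs[best]}\n\nQuestion: {question}\nAnswer:"
print(prompt)
```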
Warhouse512 t1_je86y92 wrote
Mask2Former?
machineko t1_je86hwt wrote
Reply to comment by ortegaalfredo in [D] llama 7b vs 65b ? by deck4242
What GPUs are you using to run them? Are you using any compression (i.e. quantization)?
light24bulbs t1_je863pu wrote
Reply to comment by WokeAssBaller in [D] The best way to train an LLM on company data by jaxolingo
Yeah, he doesn't get it. That's OK, but being wrong and being sure about it is a bummer.
SatisfactionFine6298 t1_je85d51 wrote
A personal language model (PLM) would be way better for that than an LLM. There are models you can train like www.personal.ai
kromem t1_je84zam wrote
Reply to comment by Tostino in [D] The best way to train an LLM on company data by jaxolingo
> Automating this is going to be nuts.
Yes, yes it is.
elbiot t1_je8iym9 wrote
Reply to [D] Improvements/alternatives to U-net for medical images segmentation? by viertys
Looks like this was trained on just 150 x-rays and does very well: https://paperswithcode.com/paper/xnet-a-convolutional-neural-network-cnn
Edit: did you look for pre-existing solutions? This was like the second Google result. If I were you, I'd be looking for public datasets I could use for pretraining and then fine-tune on my own data.
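If you do go the pretrain-then-fine-tune route, one common shortcut is to start from an encoder that is already pretrained. A rough sketch with segmentation_models_pytorch; the channel count, class count, and loss are assumptions about the task, and this illustrates the transfer-learning idea rather than the XNet architecture from the link:

```python
import torch
import segmentation_models_pytorch as smp

# U-Net with an ImageNet-pretrained encoder as a starting point;
# fine-tune the whole model on the (small) labeled x-ray set.
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=1,   # grayscale x-rays (assumption)
    classes=1,       # binary mask, e.g. lesion vs background (assumption)
)

loss_fn = smp.losses.DiceLoss(mode="binary")
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# images: (N, 1, H, W) float tensor, masks: (N, 1, H, W) in {0, 1}
def train_step(images, masks):
    optimizer.zero_grad()
    loss = loss_fn(model(images), masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```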