Recent comments in /f/MachineLearning

SnooMarzipans3021 t1_jdzehws wrote

Hello, does anyone have suggestions on how to do guided image upscaling?
Basically, I have a 6000x6000 image which I'm unable to load into the network because of GPU memory. My idea was to resize the image to something like 1500x1500 and then upscale it back to 6000x6000. But I have to do it without losing details, and I don't want to use super-resolution models (I'm afraid they will hallucinate and inpaint). If I already have the ground-truth resolution, how can I use it to guide the upscaling?
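
A rough sketch of that pipeline, with a guided filter as one possible way to pull detail back from the full-res original (assumes opencv-contrib-python; `run_network` is a placeholder for your model, and the radius/eps values are purely illustrative):

```python
import cv2
import numpy as np

full_res = cv2.imread("image.png")      # the 6000x6000 ground truth
low_res = cv2.resize(full_res, (1500, 1500), interpolation=cv2.INTER_AREA)

low_res_out = run_network(low_res)      # hypothetical: your model here

# Naive upsample back to full resolution...
upsampled = cv2.resize(low_res_out, full_res.shape[1::-1],
                       interpolation=cv2.INTER_CUBIC)

# ...then transfer fine structure from the ground truth via guided filtering.
guided = cv2.ximgproc.guidedFilter(guide=full_res.astype(np.float32),
                                   src=upsampled.astype(np.float32),
                                   radius=8, eps=1e-2)
```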

1

FinancialElephant t1_jdzdm7x wrote

I was way more impressed by MuZero when it came out. I feel crazy for not being that impressed by these LLMs. I do think they are changing the world, but I see this less as some huge advancement in ML and more as an advanced ingestion-and-regurgitation machine. All the "intelligence" is downstream from the humans who generated the data.

Honestly, I think the reason it made such a huge splash is that RLHF fine-tuning made the models especially good at fooling humans. It feels more like a hack than a big step in AI. My biggest worry is that people will expect too much out of these things, too soon. There seems to be a lot of fear and exuberance going around.

11

sebzim4500 t1_jdzczty wrote

I think you're forcing the model to waste its lower layers on each step decoding that base64 string. Let it output the word normally and you would probably see much better performance. Just don't look at the first output if you still want to play it like a game.
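
A minimal sketch of that workaround, assuming a chat-style API (`ask_model` is a hypothetical wrapper for whatever completion call the game uses):

```python
# Have the model state the secret word in plain text on its first line,
# so later tokens don't pay a per-step base64-decoding tax; then hide it.
prompt = (
    "Pick a secret word and write it in plain text on the first line. "
    "Then respond to my guesses without ever revealing it again."
)

response = ask_model(prompt)                   # hypothetical API wrapper
secret_line, _, visible = response.partition("\n")
print(visible)                                 # the player never sees line one
```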

3

Disastrous_Elk_6375 t1_jdzccfq wrote

Is zero-shot really the strength of GPT, and especially ChatGPT? From my (limited) experience interacting with ChatGPT, the value seems to come from prompt understanding and adaptation to my follow-up prompts/corrections. In the context of an assistant, I'm OK with priming the conversation first if it can handle the subsequent requests better.
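
For what it's worth, a rough sketch of what that priming might look like (the message format and `ask_model` are assumptions modeled on common chat APIs):

```python
# Seed the context with instructions and an example before the real
# request, rather than relying on pure zero-shot behavior.
messages = [
    {"role": "system", "content": "You are a terse coding assistant."},
    {"role": "user", "content": "When I paste code, reply only with a fix."},
    {"role": "assistant", "content": "Understood."},
    {"role": "user", "content": "def add(a, b): return a - b"},  # real request
]
reply = ask_model(messages)  # later corrections reuse this same context
```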

1

andreichiffa t1_jdzcbln wrote

Depends on which hardware you have. A rule of thumb is that if you want to be efficient, you need about 3x the model size in VRAM to store the optimizer state, plus some headroom for data.

You also need to use float for training, due to stability issues. So unless your GPU supports float8, double that VRAM estimate.

Realistically, if you have an RTX 4090, you can go up to 6-7B models (BLOOM-6B, GPT-J, …). Anything below that, and I would aim at 2.7B models (GPT-Neo).
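
As a back-of-the-envelope check on that rule of thumb (assuming fp16 weights at 2 bytes per parameter; "model size" here means the weights' footprint in memory):

```python
def training_vram_gb(params_billions, bytes_per_param=2, overhead=3):
    """Rough VRAM to train, in GB: ~3x the model's in-memory size."""
    return params_billions * bytes_per_param * overhead

print(training_vram_gb(2.7))   # ~16 GB for a 2.7B model like GPT-Neo
print(training_vram_gb(6.0))   # ~36 GB for a 6B model like GPT-J
```

In practice, tricks like optimizer offloading or 8-bit optimizers can pull these numbers down, which is presumably how a 24 GB card stretches to the 6-7B range.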

I would avoid the LLaMA family, for liability reasons, because of how you get access to the pretrained weights, and stay with FOSS models. With the latter, you can also contribute back and gain some visibility that way, assuming you want it.

4

shiuidu t1_jdzc9o6 wrote

I have a project I want to build a natural language interface for. Is there a simple way to do this? It's a .NET project, but I also have a Python project I want to do the same thing for.

1