Recent comments in /f/deeplearning

GhostingProtocol OP t1_jb88x11 wrote

I’d buy used anyway. Kind of a hot take, but I refuse to give NVIDIA money :P

I’m thinking of going with a 3090 for $900 or a 3080 for $650 (I can get an FE for $750, which would be pretty epic)

Got any advice? I don’t like that the 3080 only has 10 GB of VRAM. But a 3080 is already pretty much overkill for anything I’d use it for other than deep learning. Kinda on the fence here, tbh

1

incrediblediy t1_jb5dzqa wrote

This was when they were each running individually on a full x16 PCIe 4.0 slot; roughly the same ~3x gap you'd expect from the TFLOPS numbers. (i.e., I compared times from when I had only the 3060 vs. the 3090 in the same slot, running the model on a single GPU each time.)

I don't do much training on the 3060 now; it's mostly just connected to my monitors, etc.

I have changed the batch sizes to suit 24 GB anyway, as I am working with CV data. It could be a bit different with other types of models.

3060 = FP32 (float) 12.74 TFLOPS (https://www.techpowerup.com/gpu-specs/geforce-rtx-3060.c3682)
3090 = FP32 (float) 35.58 TFLOPS (https://www.techpowerup.com/gpu-specs/geforce-rtx-3090.c3622)
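The ~3x figure mentioned above follows directly from the quoted TechPowerUp specs; a quick sanity check (plain arithmetic, nothing measured):

```python
# FP32 throughput from the TechPowerUp pages linked above
tflops_3060 = 12.74  # GeForce RTX 3060
tflops_3090 = 35.58  # GeForce RTX 3090

speedup = tflops_3090 / tflops_3060
print(f"Theoretical FP32 speedup: {speedup:.2f}x")  # roughly 2.8x
```

Real training speedups will vary with the model and data pipeline, but this matches the "about 3x" observed per-GPU difference.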

I must say the 3060 is a wonderful card and helped me a lot until I found this ex-mining 3090. Really worth it for the price, with its 12 GB of VRAM.

1

bartzer t1_jb54163 wrote

I suggest getting the 3070 (or similar) for prototyping/testing your ideas. You can reduce VRAM usage by scaling down your data or training with a smaller batch size, etc., to see if your concept makes sense.

At some point you may run into VRAM or other hardware limitations (you can't train with larger images, for example). If that happens, you can run training on Colab or some other high-performance hardware offering.
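The "scale down your data or batch size" advice can be sketched with back-of-the-envelope arithmetic. This is a hypothetical one-feature-map estimate (the function name and numbers are illustrative, not from any profiler); real networks stack many such layers, but the scaling behavior is the point:

```python
# Activation memory scales roughly linearly with batch size and with pixel
# count, so halving both the batch size and the image side length cuts
# activation VRAM by about 8x (2x from batch, 4x from pixels).
def activation_mem_gb(batch, side, channels=64, bytes_per_elem=4):
    """Rough size (GB) of one batch x channels x side x side float32 tensor."""
    return batch * channels * side * side * bytes_per_elem / 1e9

full = activation_mem_gb(32, 512)     # full-size prototype
reduced = activation_mem_gb(16, 256)  # smaller batch + downscaled images
print(f"{full:.2f} GB -> {reduced:.2f} GB")
```

That kind of reduction is often enough to fit a proof-of-concept on an 8 GB card before moving to bigger hardware.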

1

fundamental_entropy t1_jb4bu9u wrote

For the first question, we are moving toward not having to design such pipelines; ideally we will have a library that does the model sharding or parallel computation for us. Look at parallelformers, which worked for some big models (11B) I tried. Why I think this is going to happen: three years back, distributed training used to be a big black box, and Horovod, PyTorch distributed training, and TPUs were the only solutions. Right now no one designs such pipelines by hand anymore; everyone uses DeepSpeed. It has implementations of all the known techniques (ZeRO, CPU offloading, etc.). So if you are not one of these computation/data engineers, I suggest watching out for such libraries.
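For a sense of what "everyone uses DeepSpeed" looks like in practice, here is a minimal sketch of a DeepSpeed config enabling the techniques the comment names, ZeRO (stage 2 here) with optimizer state offloaded to CPU. The specific values are illustrative, not a recommendation:

```json
{
  "train_batch_size": 32,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" }
  }
}
```

The point of the comment stands: you pass a config like this to the launcher instead of hand-writing a sharding pipeline.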

1

tsgiannis t1_jb40tzg wrote

A 3070 should be much, much faster than Colab, and you have the added bonus of working with full debugging capabilities (PyCharm/Spyder, etc.)

Even my second-hand 3050 is much faster than Colab... but it is always helpful to have a second machine, so: 3070 AND Colab

1