Recent comments in /f/deeplearning

RichardBJ1 t1_ix9fcaw wrote

His book has some nice examples and works well. Really, though, as the other answer said, you need to follow your interests and apply those examples to something that interests you. Another idea is Kaggle; you can clone others' code quite legitimately and work out what they were up to. There are so many examples on Kaggle you'll surely find something that fits your interests!! Good luck

5

Nerveregenerator OP t1_ix92czu wrote

Reply to comment by chatterbox272 in GPU QUESTION by Nerveregenerator

OK, thanks, I think that clears up the drawbacks. I'd have to check which motherboard I'm using now, but generally would you expect a 3090 to be compatible with a motherboard that works with a 1080Ti? Thanks

1

Dexamph t1_ix7onhf wrote

Reply to comment by Star-Bandit in GPU QUESTION by Nerveregenerator

I think you've way overestimated K80 performance, when my 4GB GTX 960 back in the day could trade blows with a K40, which was a bit more than half a K80. In a straight memory-bandwidth fight, like Transformer model training, the 1080Ti is going to win hands down even if you have perfect scaling across both GPUs on the K80, and that's assuming it doesn't get hamstrung by the ancient Kepler architecture in any way at all.

2

chatterbox272 t1_ix7mx5j wrote

>the cost/performance ratio for the 1080's seems great..

Only if your time is worthless, your ongoing running costs can be ignored, and expected lifespan is unimportant.

Multi-GPU instantly adds a significant amount of complexity that needs to be managed. It's not easy to just "hack it out" and have it work under multi-GPU: you either need to use frameworks that provide support (and make sure nothing you want to do will break that support), or you need to write it yourself. This is time and effort you have to spend that you otherwise wouldn't with a single GPU. You'll also be limited with larger models, because breaking up a model over multiple GPUs (model parallelism) is way more complicated than breaking up batches (data parallelism). So models that need more than 11GB for a single sample are going to be impractical.
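
For a sense of what "frameworks that provide support" looks like, here's a minimal data-parallel sketch in PyTorch (names and sizes are illustrative, and it assumes CUDA GPUs are visible); note that every replica still has to hold the full model, which is why the 11GB-per-card limit bites:

```python
import torch
import torch.nn as nn

# Placeholder model; a real one would be whatever you're actually training.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

if torch.cuda.device_count() > 1:
    # Data parallelism: replicate the model and split each batch across GPUs.
    # Each replica still needs the whole model to fit in its own 11 GB.
    model = nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
model = model.cuda()

x = torch.randn(64, 512).cuda()  # a 64-sample batch is split across the replicas
out = model(x)                   # results are gathered back onto GPU 0
```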

You'll have reduced throughput unless you have a server, since even HEDT platforms are unlikely to give you 4 PCIe Gen3 x16 slots. You'll be on x8 slots at best, and most likely on x4 slots. You're also going to be pinned to much higher-end parts here, spending more on the motherboard/CPU than you would need to for a single 3090.

It's also inefficient as all buggery. The 3090 has a TDP of 350W; the 1080Ti has 250W, so four of them is 1000W. That means for the same compute you're drawing roughly 3x the power (TDP is a reasonable but imperfect stand-in for true power draw), which will drastically increase the running cost of the system. You'll also need a more expensive power supply, and possibly even a wall-socket upgrade to draw that much power (four 1080Tis to me means a 1500W PSU minimum, which would require a special 15A socket in Australia where I live).
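
A quick back-of-the-envelope on that, treating TDP as the actual draw and assuming an electricity price and duty cycle (adjust for your region and usage):

```python
# Rough running-cost comparison; price per kWh and hours per day are assumptions.
TDP_1080TI_W = 250
TDP_3090_W = 350
PRICE_PER_KWH = 0.30   # assumed AUD/kWh
HOURS_PER_DAY = 8

quad_1080ti_w = 4 * TDP_1080TI_W              # 1000 W
print(quad_1080ti_w / TDP_3090_W)             # ~2.86x the draw for similar compute

def yearly_cost(watts):
    return watts / 1000 * HOURS_PER_DAY * 365 * PRICE_PER_KWH

print(yearly_cost(quad_1080ti_w))             # ~876 per year
print(yearly_cost(TDP_3090_W))                # ~307 per year
```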

You're also buying cards that are at minimum 3 years old. They have seen some amount of use, and use in a time when GPU mining was a big deal (so many of the cards out there were pushed hard doing that). The longer a GPU has been out of your possession, the less you can rely on how well it was kept. The older architecture will also be dropped from support sooner: Kepler was discontinued last year, so Maxwell and then Pascal (where the 10 series lies) are next. That's probably a while away, but a good bit sooner than Ampere (which has to wait through Maxwell, Pascal, Volta, and Turing before it hits the chopping block).

TL;DR:
Pros: possibly slightly cheaper upfront.
Cons: requires more expensive hardware to run, higher running costs, shorter expected lifespan, added multi-GPU complexity, and may not actually be compatible with your wall power.

TL;DR of the TL;DR: bad idea, don't do it.

7

RichardBJ1 t1_ix7ficv wrote

I think if you get even similar performance with one card versus four, the former is going to be far less complex to set up!? Just the logistics of four cards sounds like a nightmare.

2

incrediblediy t1_ix7czdr wrote

Reply to comment by Nerveregenerator in GPU QUESTION by Nerveregenerator

> 4 1080s combined will get me about 1.5x the throughput of a 3090 with FP32 training. FP16 seems to yield a 1.5x speed-up for the 3090 in training.

I think that's only comparing CUDA cores without Tensor cores; in any case, you can't merge the VRAM together for large models

3

incrediblediy t1_ix7cqrg wrote

> 4 1080Ti's or 1 3090
> ebay for 200 bucks

You can also get a used 3090 for about the same price as 4×$200, and you get 24 GB of VRAM for training larger models

5

Star-Bandit t1_ix7avv6 wrote

Reply to comment by Star-Bandit in GPU QUESTION by Nerveregenerator

Actually, after going back over the numbers for the two cards (bandwidth, clock speed, etc.), the 1080 Ti might well have the upper hand; I'd have to run some benchmarks myself

1

Star-Bandit t1_ix7anbd wrote

Reply to comment by Nerveregenerator in GPU QUESTION by Nerveregenerator

No, each K80 is about equal to 2 1080Tis: if you look at the cards, they each have two GPU chips with about 12 GB of RAM per chip, 24 GB of VRAM total per card. But the issue is they get hot; when running a training model on them they can sit around 70°C. It's nice, though, to be able to assign each chip to a different task (roughly as in the sketch below).
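
What that per-chip assignment looks like in code, if it helps (a sketch assuming PyTorch; the two chips simply enumerate as separate CUDA devices, and the models here are placeholders):

```python
import torch

dev0 = torch.device("cuda:0")  # first GK210 chip on the K80
dev1 = torch.device("cuda:1")  # second GK210 chip

# Pin two independent jobs, one to each chip.
model_a = torch.nn.Linear(128, 10).to(dev0)
model_b = torch.nn.Linear(128, 10).to(dev1)
```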

0

Nerveregenerator OP t1_ix75olf wrote

Reply to comment by scraper01 in GPU QUESTION by Nerveregenerator

So I did some research. According to the Lambda Labs website, 4 1080s combined will get me about 1.5x the throughput of a 3090 with FP32 training. FP16 seems to yield a 1.5x speed-up for the 3090 in training. So even with mixed precision, it comes out to be about the same. The actual configuration of 4 cards is not something I'm very familiar with, but I wanted to point this out as it seems like NVIDIA has really bullshitted a lot with their marketing. A lot of the numbers they throw around just don't translate to ML.

2

scraper01 t1_ix6t386 wrote

Four 1080 Tis will get you the performance of a single 3090 if you are not using mixed precision. Once tensor cores are enabled, the difference is night and day: in both training and inference, a single 3090 will blow your multi-GPU rig out of the water. On top of that, you'll need a motherboard plus a CPU with lots of PCIe lanes, and those ain't cheap. Pro-grade stuff with enough lanes will be north of $10k. Not worth it.
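
For anyone following along, turning the tensor cores on is mostly a matter of enabling mixed precision; a minimal PyTorch AMP sketch (the model, data, and hyperparameters are placeholders):

```python
import torch

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    x = torch.randn(32, 1024, device="cuda")
    target = torch.randn(32, 1024, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():        # runs eligible ops in FP16 on tensor cores
        loss = torch.nn.functional.mse_loss(model(x), target)
    scaler.scale(loss).backward()          # loss scaling avoids FP16 underflow
    scaler.step(optimizer)
    scaler.update()
```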

11

Star-Bandit t1_ix6l9wf wrote

You might also check some old server stuff. I have a Dell R720 running two Tesla K80's, which is essentially the equivalent of 2 1080s per card. While it may not be the latest and greatest, the server ran me $300 and the two cards ran me $160 from eBay.

3

suflaj t1_iwyuyyo wrote

If your goal is to work on those things, you should look into getting a PhD, as you'll need to work at a fairly large company to even have a chance of working on them, and the competition is fierce, so you need papers and a good reputation behind your name to push through.

In a year at that pace I assume you can cover deep learning from its beginnings up to 2018 or 2019 (5 hours every day is around 1825 hours a year, which amounts to around 150 papers read thoroughly, roughly 12 hours per paper). Andrew Ng's course is OK, but it doesn't come close to being enough for your aspirations. You'll probably need one more year of reading papers and experimenting after that to reach the state of the art.

0