Recent comments in /f/deeplearning
Nerveregenerator t1_ix94a1y wrote
Don’t overthink it. It takes a long time, so just do whatever interests you
Nerveregenerator OP t1_ix92czu wrote
Reply to comment by chatterbox272 in GPU QUESTION by Nerveregenerator
Ok, thanks, I think that clears up the drawbacks. I'd have to check which motherboard I'm using now, but generally would you expect a 3090 to be compatible with a motherboard that works with a 1080 Ti? Thanks
C0demunkee t1_ix85rbq wrote
Reply to comment by Star-Bandit in GPU QUESTION by Nerveregenerator
I did this with an M40 24GB: super cheap, no video out, lots of CUDA cores, and it does all the ML/AI stuff I want it to do.
AmazingKitten t1_ix7w3vr wrote
Reply to GPU QUESTION by Nerveregenerator
The 3090 is better. Single-GPU training is easier and it will consume less power. Plus, you can still add another one later.
Dexamph t1_ix7onhf wrote
Reply to comment by Star-Bandit in GPU QUESTION by Nerveregenerator
I think you way overestimated K80 performance: my 4GB GTX 960 back in the day could trade blows with a K40, which was a bit more than half a K80. In a straight memory-bandwidth fight, like Transformer model training, the 1080 Ti is going to win hands down even if you have perfect scaling across both GPUs on the K80, and that's assuming it doesn't get hamstrung by the ancient Kepler architecture in any way at all.
chatterbox272 t1_ix7mx5j wrote
Reply to GPU QUESTION by Nerveregenerator
>the cost/performance ratio for the 1080's seems great..
Only if your time is worthless, your ongoing running costs can be ignored, and expected lifespan is unimportant.
Multi-GPU instantly adds a significant amount of complexity that needs to be managed. It's not easy to just "hack it out" and have it work under multi-GPU, you either need to use frameworks that provide support (and make sure nothing you want to do will break that support), or you need to write it yourself. This is time and effort you have to spend that you otherwise wouldn't with a single GPU. You'll have limitations with respect to larger models, as breaking up a model over multiple GPUs (model parallelism) is way more complicated than breaking up batches (data parallelism). So models >11GB for a single element are going to be impractical.
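To make the distinction concrete, here is a minimal data-parallelism sketch (assuming PyTorch; the toy model is hypothetical). Note how `nn.DataParallel` splits each batch across GPUs, but every GPU still holds a full copy of the model, which is why single models larger than one card's memory stay impractical:

```python
import torch
import torch.nn as nn

# Toy model purely for illustration; any nn.Module works the same way.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

if torch.cuda.device_count() > 1:
    # Data parallelism: each GPU gets a slice of the batch, but every GPU
    # holds a FULL copy of the model, so a model that doesn't fit in one
    # card's 11GB still won't fit here.
    model = nn.DataParallel(model)

model = model.cuda()
out = model(torch.randn(64, 1024).cuda())  # batch of 64 split across GPUs
```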
You'll also have reduced throughput unless you have a server, since even HEDT platforms are unlikely to give you four PCIe Gen3 x16 slots; you'll be on x8 slots at best, and most likely on x4 slots. You're going to be pinned to much higher-end parts here, spending more on the motherboard/CPU than you would need to for a single 3090.
It's also inefficient as all buggery. The 3090 has a TDP of 350W; the 1080Ti has 250W, so four of them comes to 4 × 250W = 1000W, roughly 3x the power for the same compute (TDP is a reasonable but imperfect stand-in for true power draw). That will drastically increase the running cost of the system. You'll also need a more expensive power supply, and possibly even an upgraded wall socket to draw that much power (four 1080Tis to me means a 1500W PSU minimum, which would require a special 15A socket in Australia where I live).
You're also buying cards that are at minimum three years old. They have seen some amount of use, and use in a time when GPU mining was a big deal (so many of the cards out there were pushed hard doing that). The longer a GPU has been out of your possession, the less you can rely on how well it was kept. The older architecture will also be dropped from support sooner: Kepler was discontinued last year, so Maxwell and then Pascal (where the 10 series lies) are next. That's probably a while away, but a good bit sooner than Ampere (which has to wait through Maxwell, Pascal, Volta, and Turing before it hits the chopping block).
TL;DR:
Pros: possibly slightly cheaper upfront.
Cons: requires more expensive hardware to run, higher running cost, shorter expected lifespan, added multi-GPU complexity, and may not actually be compatible with your wall power.
TL;DR of the TL;DR: bad idea, don't do it.
RichardBJ1 t1_ix7ficv wrote
Reply to GPU QUESTION by Nerveregenerator
I think if you get even similar performance with one card versus four cards, the former is going to be far less complex to set up!? Just the logistics of four cards sound like a nightmare.
incrediblediy t1_ix7czdr wrote
Reply to comment by Nerveregenerator in GPU QUESTION by Nerveregenerator
> 4 1080s combined will get me 1.5x the throughput of a 3090 with FP32 training. FP16 seems to yield a 1.5x speedup for the 3090 for training.

I think that's only comparing CUDA cores without tensor cores; in any case, you can't merge VRAM across cards for large models.
incrediblediy t1_ix7cqrg wrote
Reply to GPU QUESTION by Nerveregenerator
> 4 1080Ti's or 1 3090
> ebay for 200 bucks
You can also get a used 3090 for the same price as 4 × $200, and you can then use its 24 GB of VRAM for training larger models.
Star-Bandit t1_ix7avv6 wrote
Reply to comment by Star-Bandit in GPU QUESTION by Nerveregenerator
Actually, after going back over the two cards from the numbers perspective (bandwidth, clock speed, etc.), the 1080 Ti might well have the upper hand; I'd have to run some benchmarks myself.
Star-Bandit t1_ix7anbd wrote
Reply to comment by Nerveregenerator in GPU QUESTION by Nerveregenerator
No, each K80 is about equal to two 1080 Tis: if you look at the cards, each has two chips with about 12 GB of RAM per chip, for 24 GB of total VRAM per card. The issue is that they get hot; when running a training model they can sit around 70°C. But it's nice to be able to assign each chip to a different task.
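(Each of the K80's two chips shows up as its own CUDA device, so you can pin independent work to them; a minimal sketch, assuming PyTorch and two toy workloads:)

```python
import torch

# The K80 exposes its two chips as two separate CUDA devices.
dev0 = torch.device("cuda:0")
dev1 = torch.device("cuda:1")

# Pin independent workloads to each chip, e.g. two separate matmuls.
a = torch.randn(4096, 4096, device=dev0)
b = torch.randn(4096, 4096, device=dev1)

out0 = a @ a  # runs on chip 0
out1 = b @ b  # runs on chip 1 concurrently (CUDA ops launch asynchronously)
```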
Nerveregenerator OP t1_ix76d92 wrote
Reply to GPU QUESTION by Nerveregenerator
I believe 2 K80s = 1 1080 Ti.
Nerveregenerator OP t1_ix75p1k wrote
Reply to comment by Star-Bandit in GPU QUESTION by Nerveregenerator
Will look into that.
Nerveregenerator OP t1_ix75olf wrote
Reply to comment by scraper01 in GPU QUESTION by Nerveregenerator
So I did some research. According to the Lambda Labs website, four 1080s combined will get me 1.5x the throughput of a 3090 with FP32 training. FP16 seems to yield a 1.5x speedup for the 3090 for training. So even with mixed precision, it comes out about the same. The actual configuration of four cards is not something I'm very familiar with, but I wanted to point this out, as it seems like NVIDIA has really bullshitted a lot with their marketing. A lot of the numbers they throw around just don't translate to ML.
scraper01 t1_ix6t386 wrote
Reply to GPU QUESTION by Nerveregenerator
Four 1080 Tis will get you the performance of a single 3090 if you're not using mixed precision. Once tensor cores are enabled, the difference is night and day: in training and inference, a single 3090 will blow your multi-GPU rig out of the water. On top of that, you'll need a motherboard plus a CPU with lots of PCIe lanes, and those ain't cheap; pro-grade stuff with enough lanes will be north of $10k. Not worth it.
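(For concreteness: on an Ampere card, "enabling tensor cores" in practice mostly means turning on automatic mixed precision. A minimal sketch, assuming PyTorch, with a toy model and loss standing in for a real training step:)

```python
import torch
import torch.nn as nn

# Toy model and optimizer purely for illustration.
model = nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid FP16 underflow

x = torch.randn(64, 1024).cuda()
target = torch.randn(64, 1024).cuda()

optimizer.zero_grad()
with torch.cuda.amp.autocast():  # ops run in FP16 where safe -> tensor cores
    loss = nn.functional.mse_loss(model(x), target)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```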
Star-Bandit t1_ix6l9wf wrote
Reply to GPU QUESTION by Nerveregenerator
You might also check out some old server stuff. I have a Dell R720 running two Tesla K80s, each of which is essentially the equivalent of two 1080s. While it may not be the latest and greatest, the server ran me $300 and the two cards ran me $160 on eBay.
suflaj t1_ix4ab1j wrote
Reply to comment by Terib1e in Question, I am a newbie. I have just finished cs50p and currently learning Django. by Terib1e
Yeah, probably the best one
Terib1e OP t1_ix479wn wrote
Reply to comment by suflaj in Question, I am a newbie. I have just finished cs50p and currently learning Django. by Terib1e
Hey, I'm currently doing introductory lessons on Kaggle. Is that a good website for learning the basics of ML?
suflaj t1_iwze762 wrote
Reply to comment by Terib1e in Question, I am a newbie. I have just finished cs50p and currently learning Django. by Terib1e
Better ask here so everyone can make use of it
Terib1e OP t1_iwz0yne wrote
Reply to comment by suflaj in Question, I am a newbie. I have just finished cs50p and currently learning Django. by Terib1e
Thanks a lot. I have a few more questions; can I ask you in DMs?
suflaj t1_iwyzm03 wrote
Reply to comment by Terib1e in Question, I am a newbie. I have just finished cs50p and currently learning Django. by Terib1e
There may be a few papers behind a paywall (one that comes to mind is the Differentiable Neural Computer), but those are not that important. Most are free, yes.
Terib1e OP t1_iwyxrt4 wrote
Reply to comment by suflaj in Question, I am a newbie. I have just finished cs50p and currently learning Django. by Terib1e
The papers you are talking about: are they available for free, or do I have to buy them?
suflaj t1_iwyuyyo wrote
Reply to Question, I am a newbie. I have just finished cs50p and currently learning Django. by Terib1e
If your goal is to work on those things, you should look into getting a PhD, as you'll need to work at a fairly large company to even have a chance of working on them, and the competition is fierce, so you need papers and good references to your name to push through.
In a year at that pace, I assume you can cover the beginnings of deep learning up to 2018 or 2019 (5 hours every day is around 1825 hours, which amounts to around 150 papers read thoroughly). Andrew Ng's course is OK, but it doesn't come close to being enough for your aspirations. You'll probably need one more year of reading papers and experimenting after that to reach the state of the art.
FairMathematician595 OP t1_iwxmn9b wrote
Reply to comment by FairMathematician595 in Medicinal Dataset Review. by FairMathematician595
If you like the dataset, please upvote it on Kaggle; that would help others too.
RichardBJ1 t1_ix9fcaw wrote
Reply to How to start deep learning from scratch? by Ok_Cartographer3000
His book has some nice examples and works well. Really, though, as the other answer said, you need to follow your interests and apply those examples to something that interests you. Another idea is Kaggle: you can clone others' code quite legitimately and work out what they were up to. There are so many examples on Kaggle that you'll surely find something that fits your interests! Good luck