Recent comments in /f/deeplearning
suflaj t1_j0fszwu wrote
Reply to comment by Moist-Bath5827 in I have 6x3090 looking to build a rig by Outrageous_Room_3167
Probably ROG Strix X570-E. Case doesn't matter much as long as it's a full tower, but as for the brand, Fractal usually makes good cases. Specifically, the Fractal Torrent is the best all-around case IMO, but it's sort of hard to find.
lazazael t1_j0fr6zz wrote
Reply to comment by lazazael in laptop for Data Science and Scientific Computing: proart vs legion 7i vs thinkpad p16/p1-gen5 by macORnvidia
If not the cloud (because you don't want ongoing payments), then remote compute on a desktop with 256 GB RAM & a 4090 heating the office; an instance like that is a beast for ML compared to these... slim contenders. University professors usually do that: they buy a heavy lifter for a few of them to use freely from their MacBook Airs or ThinkPads.
M4mb0 t1_j0fn23x wrote
The first question is: did you get two-slot blower cards or 3+ slot gaming cards?
The second: do you want to build an actual server where noise levels are irrelevant, or a somewhat quiet box?
elbiot t1_j0fitwk wrote
Reply to Efficient Max Pooling Implementation by Logon1028
Can't you just reshape the array and use argmax (so no as_strided)? Reshaping is often free. You'd have to do some arithmetic to get the indices back in the original shape, but it would just be one operation.
I.e. you can take a shape (99,) array, reshape it to (33, 3), and take the max of each row to get 33 window maxes.
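In NumPy that could look roughly like this (just a sketch, assuming a 1D input and a window size of 3; the names are illustrative):

```python
import numpy as np

# Sketch: 1D max pooling (window 3) via reshape + argmax, no as_strided needed.
x = np.random.rand(99).astype(np.float32)

window = 3
windows = x.reshape(-1, window)           # (33, 3): each row is one pooling window
local_idx = windows.argmax(axis=1)        # argmax within each window
pooled = windows[np.arange(windows.shape[0]), local_idx]  # the 33 maxes

# one arithmetic step to recover indices into the original flat array
orig_idx = np.arange(windows.shape[0]) * window + local_idx
assert np.allclose(pooled, x[orig_idx])
```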
BrotherAmazing t1_j0fai9p wrote
Reply to Why does adding a smaller layer between conv and dense layers break the model? by rubbledubbletrubble
In this case, I don’t think anyone can tell you wtf is going on without a copy of your code and dataset. There are just so many unknowns, but is this 1000 dim dense layer the last layer before a softmax?
Are you training the other layers then adding this new layer with new weight initialization in between the trained layers, or are you adding it in as a new architecture and re-initializing the weights everywhere and starting from scratch again?
Moist-Bath5827 t1_j0f0w5y wrote
Reply to comment by suflaj in I have 6x3090 looking to build a rig by Outrageous_Room_3167
Do you have a recommendation for 3x? At least the mobo and case?
100drunkenhorses t1_j0eir9u wrote
I enjoy this type of build. Depending on your workspace, they sell extruded-aluminum frames meant for GPU mining, with room for eATX mobos and space for 2 big PSUs. The cards are spaced far enough apart that if you get the 3000 RPM Noctua industrial fans and line them up, you can cool that many 3090s on a single rig, provided you're willing to cough up for 6 PCIe 4.0 x16 risers. Remember the risers are finicky at best, so make sure you keep your warranty papers.
ribeirao t1_j0e6lkg wrote
Reply to comment by suflaj in I have 6x3090 looking to build a rig by Outrageous_Room_3167
thanks for the keyword, I’ll keep this in mind when/if I buy another 3090
suflaj t1_j0e646v wrote
Reply to comment by ribeirao in I have 6x3090 looking to build a rig by Outrageous_Room_3167
You can always model-parallelize the model and split it across the cards; it's not that hard. Your biggest problem is load balancing in that case, but it can be done with a bit of benchmarking and heuristics.
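For example, in PyTorch a naive split across two cards could look roughly like this (a sketch with made-up layer sizes, not a tuned or balanced implementation):

```python
import torch
import torch.nn as nn

# Sketch of naive model parallelism: half the layers on each GPU,
# activations hop between cards in forward().
class TwoGPUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Linear(4096, 10).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        return self.part2(x.to("cuda:1"))  # move activations to the second card

model = TwoGPUNet()
out = model(torch.randn(8, 1024))  # output lives on cuda:1
```

Load balancing then comes down to choosing where to cut the model so both cards spend roughly equal time per step.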
ribeirao t1_j0e5fow wrote
Reply to comment by suflaj in I have 6x3090 looking to build a rig by Outrageous_Room_3167
Not op but that's good to know, so it would only speed up the process and not make one big GPU with 24+24 GB :(
suflaj t1_j0e2job wrote
Reply to comment by twobadkidsin412 in I have 6x3090 looking to build a rig by Outrageous_Room_3167
A mining rig has a load significantly different from DL loads. I work with these cards; we have like 10 rigs in the office with dual/triple 3090s, some Tis. I'm well aware of their limits.
twobadkidsin412 t1_j0e2foy wrote
Reply to comment by suflaj in I have 6x3090 looking to build a rig by Outrageous_Room_3167
I had a box fan in front of my mining rig with 6 cards. Was hot but worked great.
lazazael t1_j0e2bfs wrote
Reply to laptop for Data Science and Scientific Computing: proart vs legion 7i vs thinkpad p16/p1-gen5 by macORnvidia
A laptop is either a MacBook or a ThinkPad for me; I'd buy for portability and use cloud resources / university servers for actual computation.
rubbledubbletrubble OP t1_j0du040 wrote
Reply to comment by suflaj in Why does adding a smaller layer between conv and dense layers break the model? by rubbledubbletrubble
Thanks! I’ll give this a shot!
suflaj t1_j0dt970 wrote
Reply to comment by rubbledubbletrubble in Why does adding a smaller layer between conv and dense layers break the model? by rubbledubbletrubble
Not really. 950 is smaller than 1000, so not only are you destroying information, but you are potentially getting into a really bad local minimum.
When you add that intermediate layer, what you are essentially doing is randomly hashing your previous distribution. If your random hash kills the relations your model learned between the data, then of course it will not perform.
Now, because Xavier and Kaiming-He initializations aren't exactly designed to act as a universal random hash, they might not kill all your relations, but they are still random enough to have the potential to, depending on the task and data. You might get lucky, but on average, you will almost never get lucky.
If I were in your place, I would train with linear warmup to a fairly large learning rate, like 10x higher than the previous maximum. This will make very bad weights shoot out of their bad minima once the LR reaches the max, and hopefully you'll get better results once they settle down as the LR falls. Just make sure you clip your gradients so your weights don't go to NaN, because this is the equivalent of driving your car into a wall in hopes of the crash turning it into a Ferrari.
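In PyTorch, the warmup + clipping part could look roughly like this (a sketch with an assumed toy model, placeholder data, and made-up step counts; tune the numbers for your setup):

```python
import torch

# Sketch: linear warmup to a large peak LR, then linear decay, with gradient clipping.
model = torch.nn.Linear(1000, 950)                        # stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)  # peak LR, e.g. ~10x the old max

warmup_steps, total_steps = 500, 5000

def lr_lambda(step):
    if step < warmup_steps:
        return step / warmup_steps                        # linear warmup: 0 -> peak
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))  # linear decay

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(total_steps):
    x, y = torch.randn(32, 1000), torch.randn(32, 950)    # placeholder batch
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # keep weights from blowing up
    optimizer.step()
    scheduler.step()
```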
As for how long you should train it... Well, the best would be to add the layer without any non-linear function and see how long you need to reach the original performance. Since there is no non-linear function, the new network is about as expressive as the original. Once you get the number of epochs, add like 25% to that number and train the one with the non-linear transformation after your bottleneck for that long.
rubbledubbletrubble OP t1_j0dsv9t wrote
Reply to comment by suflaj in Why does adding a smaller layer between conv and dense layers break the model? by rubbledubbletrubble
I am doing this at the last layer. That is why it doesn’t make sense to me. I’d assume with 950 I should get similar results.
suflaj t1_j0drngf wrote
Reply to comment by rubbledubbletrubble in Why does adding a smaller layer between conv and dense layers break the model? by rubbledubbletrubble
It should, but how much is tough to say; it depends on the rest of the model and where this bottleneck is. If, say, you're doing this in the first layers, the whole model basically has to be retrained from scratch, and performance similar to the previous one is not guaranteed.
rubbledubbletrubble OP t1_j0drg5e wrote
Reply to comment by suflaj in Why does adding a smaller layer between conv and dense layers break the model? by rubbledubbletrubble
Yes, but shouldn’t the model still train and learn something?
I currently have an accuracy of 0.5% with the middle layer ranging from 100 to 950.
suflaj t1_j0dojq2 wrote
Reply to Why does adding a smaller layer between conv and dense layers break the model? by rubbledubbletrubble
You introduced a bottleneck. Either you needed to train it longer, or your bottleneck destroyed part of the information needed for better performance.
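Roughly what that bottleneck looks like, as a sketch (assumed sizes, not OP's actual model):

```python
import torch.nn as nn

# Head of a CNN without and with the smaller intermediate layer under discussion.
head_original = nn.Sequential(
    nn.Flatten(),
    nn.Linear(2048, 1000),   # conv features straight into the 1000-dim dense layer
)

head_bottlenecked = nn.Sequential(
    nn.Flatten(),
    nn.Linear(2048, 512),    # new smaller layer: everything must squeeze through 512 dims
    nn.ReLU(),
    nn.Linear(512, 1000),
)
```

Everything the classifier sees now has to pass through those 512 dimensions, which is where information can get lost.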
vade t1_j0dl85z wrote
I run 3x 3090 in a single case, without water cooling, but using one PCI riser and keeping the case open to allow for airflow. This is on a single 1600w PSU, no NVLink.
Anything more would be tough without a custom loop, and dual PSU.
Works great!
edit: I use a Fractal Design Define XL and mount one 3090 FE vertically with a riser. It's janky but works.
suflaj t1_j0de8ja wrote
Reply to comment by Outrageous_Room_3167 in I have 6x3090 looking to build a rig by Outrageous_Room_3167
There is no larger memory. NVLink only increases bandwidth by up to 300 GB/s unless there is a software implementation of memory pooling, which there isn't for any relevant DL framework.
Every week this has to be explained to yet another aspiring system integrator...
Outrageous_Room_3167 OP t1_j0ddpsu wrote
Reply to comment by suflaj in I have 6x3090 looking to build a rig by Outrageous_Room_3167
>NVLink probably won't matter much since your CPU will be bottlenecked trying to send 5.2 TB/s of data to your GPUs. But again, there are no benchmarks to show how much, maybe the gains from NVLink will be noticeable.
I guess the bigger benefit of the NVLink is the larger memory, but aside from that, I don't think the performance gains are huge from what I've read. My thinking was to build out a chassis with external fans as well to cool everything down.
kaushik_ray_1 t1_j0daqqo wrote
Reply to comment by DingWrong in I have 6x3090 looking to build a rig by Outrageous_Room_3167
Mining rigs are not the best for deep learning. A lot of mining rigs run the GPUs on a single x1 PCIe lane.
kaushik_ray_1 t1_j0daifi wrote
Look to get an HP ML350 G9.
It should be able to run 3x 3090 easily at x16. A lot of the mining rigs only run at x1, which is not the best idea for deep learning.
The only problem I see is that you will not get PCIe 4.0 with the ML350 G9.
daimor133 OP t1_j0g24cq wrote
Reply to [help] Need build with 2x - rtx 4090 cards in one PC by daimor133
Also, I don't know which PSU and case I need for the data center :(