Recent comments in /f/deeplearning

GPUaccelerated OP t1_iu4tflp wrote

This makes sense. Scaling horizontally is usually the way it's done. Thank you for commenting!

But I would argue that hardware for inference is actually bought more often than one would assume. I have many clients who purchase mini-workstations to put in settings where data processing and inference jobs are done on the same premises, to limit latency and data travel.

1

sabeansauce OP t1_iu45f4w wrote

For training. Essentially I have to choose between one powerful GPU or multiple average ones. But I know that the average ones don't have enough memory on their own for the task at hand (I have one, so I've seen it). I'd prefer the single GPU, but the company is asking whether a multi-GPU setup of lesser cards will also work if they're used together.
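
For what it's worth, here's a minimal sketch of what "used together" could look like, assuming PyTorch and two CUDA devices (naive model parallelism; the layer sizes and split point are made-up placeholders):

    import torch
    import torch.nn as nn

    # Naive model parallelism: split one model across two smaller GPUs so
    # neither card has to hold all the parameters. Requires >= 2 GPUs.
    class SplitModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.part1 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to("cuda:0")
            self.part2 = nn.Sequential(nn.Linear(4096, 10)).to("cuda:1")

        def forward(self, x):
            x = self.part1(x.to("cuda:0"))
            return self.part2(x.to("cuda:1"))  # activations hop between cards

    model = SplitModel()
    out = model(torch.randn(8, 4096))  # output ends up on cuda:1

The catch is that the two cards run in series here, so you pay transfer overhead; libraries like DeepSpeed or PyTorch FSDP shard things more efficiently.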

3

hp2304 t1_iu3ixav wrote

Inference: If real-time performance is a requirement, then it's necessary to buy high-end GPUs to reduce latency; other than that, it's not worth it.

Training: This loosely depends on how often a model is retrained in production. Suppose that period is one year (seems reasonable to me); that means the new model will be trained on the new data gathered over that year plus the old data. Doing this fast won't make a difference. I would rather use a slow GPU even if it takes days or a few weeks. It's not worth it.

A problem with DL models in general is that they only keep growing in number of parameters, requiring more VRAM to fit them on a single GPU. Huge thanks to model parallelism techniques and ZeRO, which handle this issue; otherwise one would have to buy new hardware to train large models. I don't like where AI research is headed. Increasing parameters is not an efficient solution; we need a new direction to effectively and practically solve general intelligence. On top of that, models failing to detect or misdetecting objects in self-driving cars despite huge training datasets is a serious red flag showing we are still far from solving AGI.
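
For reference, turning on ZeRO is mostly a config switch. A minimal sketch assuming DeepSpeed; the model, batch size, and learning rate are placeholders, and it would normally be launched with the `deepspeed` launcher across several GPUs:

    import deepspeed
    import torch.nn as nn

    model = nn.Linear(1024, 1024)  # stand-in for a real, much larger model

    ds_config = {
        "train_batch_size": 16,
        "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
        # ZeRO stage 2 shards optimizer states and gradients across the
        # data-parallel GPUs instead of replicating them on every card.
        "zero_optimization": {"stage": 2},
    }

    # Returns a wrapped engine whose backward()/step() handle the sharding.
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )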

3

ShadowStormDrift t1_iu3fkqs wrote

I coded up a semantic search engine. I was able to get it down to 3 seconds per search.

That's blazingly fast by my standards (it used to take 45 minutes, which still haunts my dreams). But if 10 people use the site simultaneously, that's 30 seconds before number 10 gets their results back. Which is unacceptable.

So yes. I do care if I can get that done quicker.
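
One mitigation for the queueing problem is batching concurrent queries so a single GPU pass serves several users at once. A rough sketch, assuming a sentence-transformers bi-encoder; the model name and corpus are placeholders:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice
    corpus = ["doc one ...", "doc two ...", "doc three ..."]
    corpus_emb = model.encode(corpus, convert_to_tensor=True)  # computed once, offline

    # Queries arriving at the same time go through the GPU in one batch,
    # so user 10 isn't stuck waiting behind nine sequential searches.
    queries = ["query from user 1", "query from user 2"]
    query_emb = model.encode(queries, convert_to_tensor=True)

    scores = util.cos_sim(query_emb, corpus_emb)  # shape: (n_queries, n_docs)
    best = scores.argmax(dim=1)  # best-matching document per query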

3

sckuzzle t1_iu2aa7o wrote

We use models to control things in real time. We need to be able to predict what is going to happen in 5 or 15 minutes and proactively take actions NOW. If it takes 5 minutes to predict what is going to happen 5 minutes in the future, the model is useless.

So yes. We care about speed. The faster it runs, the more we can include in the model (making it more accurate).
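
To make the constraint concrete, a toy sketch of the budget check; the predict function and the 5-minute horizon are hypothetical stand-ins:

    import time

    HORIZON_S = 5 * 60  # forecasting 5 minutes ahead

    def predict(state):
        return state  # placeholder for the actual model

    start = time.monotonic()
    forecast = predict({"t": 0})
    latency = time.monotonic() - start

    # Whatever inference eats comes straight out of the time left to act;
    # if latency approaches HORIZON_S, the "forecast" describes the past.
    time_to_act = HORIZON_S - latency
    print(f"inference took {latency:.3f}s, leaving {time_to_act:.1f}s to act")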

13