Recent comments in /f/deeplearning

Smallpaul t1_iztcll2 wrote

It’s weird that you say they are failing. If you asked a human to highlight the face in that picture, they would do the exact same thing!

Your application might need something different but don’t call this “failing.” It’s succeeding at what it was designed to do, which is find faces.

What is your application by the way?

1

sqweeeeeeeeeeeeeeeps t1_izt8ldx wrote

Pytorch / Keras / Tensorflow for deep learning

And any basic ML library you want, scikit-learn etc.

Deep learning is all about GPU usage and running long experiments in production. I’m confused about what you even want.

Is the question basically asking, what skills would someone specialized in DL have vs someone specializing in non-DL ML have?

−1

tech_ml_an_co t1_izt4bd0 wrote

The tech stacks for APIs are quite different. DL requires some kind of model server with a GPU; for traditional ML, Lambda or FastAPI on a server is enough.

For batch processing it's more similar; depending on your data size, you might not need a GPU even for deep learning.

Also, deep learning usually involves unstructured data, which requires different storage and training infrastructure.

You can read books on the topic, but at the core that's the difference, and that's why a lot of companies still don't utilize DL.

2

suflaj t1_izruvvi wrote

That makes no sense. Are you sure you're not doing backprop on the teacher model? It should be a lot less resource intensive.

Furthermore, check how you're distilling the model, i.e. what layers and what weights. Generally, for transformer architectures, you distill the first, embedding layer, the attention and hidden layers, and the final, prediction layer. Distilling only the prediction layer works poorly.
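The "no backprop on the teacher" point can be sketched in PyTorch: freeze the teacher's parameters and run its forward pass under `torch.no_grad()`, so gradients only flow through the student. This is a minimal toy example (linear layers standing in for real transformer models; the temperature and soft-label KL loss are the standard distillation setup, not necessarily what OP is using):

```python
# Hedged sketch: knowledge distillation where the teacher is fully frozen.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(10, 5)  # stand-in for a large pretrained teacher
student = nn.Linear(10, 5)  # stand-in for a smaller student

# freeze the teacher: eval mode, no gradients tracked for its parameters
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # distillation temperature (illustrative value)

x = torch.randn(8, 10)  # dummy batch

# teacher forward under no_grad: no graph is built, so no teacher backprop
with torch.no_grad():
    teacher_logits = teacher(x)

student_logits = student(x)

# soft-label KL distillation loss on the prediction layer
loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T * T)

optimizer.zero_grad()
loss.backward()  # gradients reach only the student
optimizer.step()
```

If the teacher forward is run outside `no_grad()` (or its parameters still require grad), autograd builds and backprops through the teacher's graph too, which roughly doubles memory and compute — consistent with distillation costing as much as full training.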

2

MazenAmria OP t1_izrgk9j wrote

I'm already using a pretrained model as the teacher. But the distillation itself costs nearly as much as training a model. I'm not insisting, but I feel like I'm doing something wrong and needed some advice (note that I've only had theoretical experience in such areas of research; this is the first time I'm doing it practically).

Thanks for your comments.

1