Recent comments in /f/deeplearning
sqweeeeeeeeeeeeeeeps t1_izt8ldx wrote
Reply to comment by digital-bolkonsky in What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
Pytorch / Keras / Tensorflow for deep learning
And any basic ML library you want — scikit-learn, etc.
Deep learning is all about GPU usage and running long experiments in production. I'm confused about what you even want.
Is the question basically asking what skills someone specializing in DL would have vs. someone specializing in non-DL ML?
MazenAmria OP t1_izt68w9 wrote
Reply to comment by suflaj in Advices for Deep Learning Research on SWIN Transformer and Knowledge Distillation by MazenAmria
I'm using `with torch.no_grad():` when calculating the output of the teacher model.
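A minimal sketch of that pattern, with toy linear models standing in for the actual SWIN teacher/student (names and shapes here are illustrative, not from the real setup):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the teacher and student networks.
teacher = nn.Linear(8, 4)
student = nn.Linear(8, 4)

x = torch.randn(2, 8)

# Run the teacher without building an autograd graph: no activations
# are stored for backprop, which cuts memory and compute.
with torch.no_grad():
    teacher_logits = teacher(x)

# The student forward pass does track gradients.
student_logits = student(x)
loss = nn.functional.mse_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student
```

If this is what your code does, the teacher's parameters should never accumulate `.grad` during distillation — if they do, something upstream is re-enabling grad tracking.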
digital-bolkonsky OP t1_izt5hrz wrote
Reply to comment by [deleted] in What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
How do you manage the GPU issue in building an api?
digital-bolkonsky OP t1_izt53ya wrote
Reply to comment by sqweeeeeeeeeeeeeeeps in What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
The question is about development and tech stack
digital-bolkonsky OP t1_izt52tm wrote
Reply to comment by [deleted] in What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
How do we assess the GPU need in production?
digital-bolkonsky OP t1_izt4z79 wrote
Reply to comment by tech_ml_an_co in What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
Right, so when it comes to compute: if I'm building a DL API for someone, how should I address it?
tech_ml_an_co t1_izt4bd0 wrote
Reply to What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
Quite different tech stack for APIs. DL requires some kind of model server with GPU. For traditional ML use Lambda or FastAPI on a server.
For batch processing it's more similar; depending on your data size, you might not need a GPU even for deep learning.
Also, deep learning usually means unstructured data, which requires different storage and training infrastructure.
You can read books about the topic, but at its core that's the difference, and that's why a lot of companies still don't utilize DL.
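On the "do you even need a GPU in production" question: a common pattern is to select the device at startup and fall back to CPU for small batch workloads. A minimal PyTorch sketch (the model and batch here are placeholders):

```python
import torch

# Use the GPU when one is available, otherwise fall back to CPU.
# For small batch jobs the CPU path is often enough, even for DL models.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(16, 2).to(device)
batch = torch.randn(32, 16, device=device)

# Inference only: no autograd graph needed.
with torch.no_grad():
    out = model(batch)
```

The same code then runs unchanged on a GPU model server or a CPU-only box, which makes it easier to benchmark whether the GPU is actually paying for itself.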
TheButteryNoodle OP t1_izt1ogh wrote
Reply to comment by mosalreddit in GPU Comparisons: RTX 6000 ADA vs A100 80GB vs 2x 4090s by TheButteryNoodle
Will do!
mosalreddit t1_izsyu2s wrote
Reply to comment by TheButteryNoodle in GPU Comparisons: RTX 6000 ADA vs A100 80GB vs 2x 4090s by TheButteryNoodle
Looking forward to seeing it when done. Please do share pictures!
suflaj t1_izswui5 wrote
Reply to What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
Off the top of my head, DL requires much more data preprocessing and research. ML is more like: fit an XGBoost model, and if it doesn't work well, see why, fix that in the data, and try again. If XGBoost can't solve it, your data is bad or you need DL.
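That "boosted trees first" workflow is only a few lines. A sketch using scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost, on synthetic data in place of a real business dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic tabular data standing in for a real dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit a gradient-boosted baseline first; only reach for DL if this
# plateaus and the data itself checks out.
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
print(f"test accuracy: {acc:.2f}")
```

If the baseline is already strong, the marginal value of a deep model on tabular data is often small.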
[deleted] t1_izss6p2 wrote
sqweeeeeeeeeeeeeeeps t1_izspv5o wrote
Reply to comment by MazenAmria in Advices for Deep Learning Research on SWIN Transformer and Knowledge Distillation by MazenAmria
Showing you can create a smaller model with the same performance means SWIN is overparameterized for that given task. Give it datasets of varying complexity, not just a single one.
sqweeeeeeeeeeeeeeeps t1_izspjid wrote
Reply to What’s different between developing deep learning product and typical ML product? by digital-bolkonsky
Google the difference between ML and deep learning.
TheButteryNoodle OP t1_izskz4u wrote
Reply to comment by mosalreddit in GPU Comparisons: RTX 6000 ADA vs A100 80GB vs 2x 4090s by TheButteryNoodle
Haven't purchased the motherboard yet, but the case would be a Fractal Design Torrent. To get two 4090s to fit, you would need to go custom liquid cooling to get rid of the massive heatsinks on the 4090.
iyke7991 t1_izryprz wrote
Reply to the biggest risk with generative AI is not its potential for misinformation but cringe. by hayAbhay
Not that cringe to be honest.
suflaj t1_izruvvi wrote
Reply to comment by MazenAmria in Advices for Deep Learning Research on SWIN Transformer and Knowledge Distillation by MazenAmria
That makes no sense. Are you sure you're not doing backprop on the teacher model? It should be a lot less resource intensive.
Furthermore, check how you're distilling the model, i.e. what layers and what weights. Generally, for transformer architectures, you distill the first, embedding layer, the attention and hidden layers, and the final, prediction layer. Distilling only the prediction layer works poorly.
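A sketch of what a multi-layer distillation loss can look like — toy tensors stand in for matched teacher/student layer outputs, and the temperature and 0.5 weighting are illustrative choices, not values from any specific paper:

```python
import torch
import torch.nn.functional as F

T = 2.0  # distillation temperature (illustrative)

# Placeholder tensors for matched teacher/student outputs.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
student_hidden = torch.randn(4, 16, requires_grad=True)
teacher_hidden = torch.randn(4, 16)

# Prediction-layer loss: KL divergence on temperature-softened targets.
kd_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T * T)

# Hidden-layer loss: match intermediate representations with MSE.
hidden_loss = F.mse_loss(student_hidden, teacher_hidden)

loss = kd_loss + 0.5 * hidden_loss
loss.backward()
```

In a real setup, the student's hidden size usually differs from the teacher's, so each hidden/attention loss needs a learned projection on the student side before the MSE; distilling only `kd_loss` on the final logits is the "prediction layer only" setup that tends to work poorly.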
RED_MOSAMBI t1_izrkuct wrote
Reply to Why popular face detection models are failing against cartoons and is there any way to prevent these false positives? by abhijit1247
Because they weren't trained on animated characters. Try adding an image-processing step that converts animated faces to realistic ones, or that maps both animated and real faces into some common representation.
MazenAmria OP t1_izrgnco wrote
Reply to comment by pr0d_ in Advices for Deep Learning Research on SWIN Transformer and Knowledge Distillation by MazenAmria
I remember reading it, I'll read it again and discuss it. Thanks.
MazenAmria OP t1_izrgk9j wrote
Reply to comment by suflaj in Advices for Deep Learning Research on SWIN Transformer and Knowledge Distillation by MazenAmria
I'm already using a pretrained model as the teacher. But the distillation itself costs nearly as much as training a model from scratch. I'm not insisting, but I feel like I'm doing something wrong and needed some advice (note that I've only had theoretical experience in this area of research; this is the first time I'm doing it practically).
Thanks for your comments.
qpwoei_ t1_izrdg66 wrote
Reply to Why popular face detection models are failing against cartoons and is there any way to prevent these false positives? by abhijit1247
The result you observe is intentional. The training objective of face detection models is usually to detect faces in any kind of pictures: drawings, 3d renders, photos.
Superschlenz t1_izr0iii wrote
Reply to Why popular face detection models are failing against cartoons and is there any way to prevent these false positives? by abhijit1247
A photo of a face isn't a face either. That's why Apple's Face ID uses a 3D scanner in addition.
abhijit1247 OP t1_izr0f8f wrote
Reply to comment by RShuk007 in Why popular face detection models are failing against cartoons and is there any way to prevent these false positives? by abhijit1247
This is a great insight. Thanks for the help.
mosalreddit t1_izr087o wrote
What mobo and case do you have to put 2 4090s in?
abhijit1247 OP t1_izqxfer wrote
Reply to comment by CauseSigns in Why popular face detection models are failing against cartoons and is there any way to prevent these false positives? by abhijit1247
Given the lack of detail in cartoon faces, the model shouldn't detect them as faces with such a high confidence score (about 86.8%). If it does, these models are no better than Haar-cascade-based face detectors.
Smallpaul t1_iztcll2 wrote
Reply to Why popular face detection models are failing against cartoons and is there any way to prevent these false positives? by abhijit1247
It's weird that you say they are failing. If you asked a human to highlight the face in that picture, they would do the exact same thing!
Your application might need something different but don’t call this “failing.” It’s succeeding at what it was designed to do, which is find faces.
What is your application by the way?