Recent comments in /f/deeplearning
Learning_DL OP t1_j80uxdo wrote
Reply to comment by Learning_DL in My Machine Learning: Learning Program by Learning_DL
I have tried both CV and NLP, and I am more interested in starting with NLP; maybe in the future I'll do CV.
Learning_DL OP t1_j80upiv wrote
Reply to comment by DJStillAlive in My Machine Learning: Learning Program by Learning_DL
No, I mean I will work 6h a day including weekends
DJStillAlive t1_j80ujr5 wrote
Reply to My Machine Learning: Learning Program by Learning_DL
From Friday at 5 PM to Monday at 8 AM is 63 hours, minus 24 for sleeping and you're left with 39 hours. If your plan is to spend 40 hours a weekend for 6 months, I'm afraid you're going to burn out hard. You're also narrowing your focus to NLP. While there's nothing inherently wrong with that, what happens if you end up wanting to do CV work or something completely different?
Not trying to dissuade you and I know nothing of your situation, but have you thought this through?
Learning_DL OP t1_j80q6my wrote
Reply to comment by Cute-Regular9227 in My Machine Learning: Learning Program by Learning_DL
Yes, I will use PyTorch for my projects.
Learning_DL OP t1_j80q4y0 wrote
Reply to comment by No-Trifle2470 in My Machine Learning: Learning Program by Learning_DL
Yes, I will do some projects to apply what I have learned. But I don't know whether what I have planned to learn and the resources I chose are good.
Cute-Regular9227 t1_j80ouvz wrote
Reply to My Machine Learning: Learning Program by Learning_DL
It really depends from person to person. Some may need more and others less. But give it a try, you have nothing to lose. I would add some PyTorch learning so you can build cool projects.
No-Trifle2470 t1_j80oiun wrote
Reply to My Machine Learning: Learning Program by Learning_DL
I think it is a nice plan! I would say it is enough, but maybe spend more time building projects too, not only learning.
nonamefhh t1_j7yygkm wrote
Reply to comment by nonamefhh in Entry to a career in deep learning by No-Celebration6994
Another tip: read job offers like they are jokes. The person who meets all the job requirements never exists.
nonamefhh t1_j7yy2z2 wrote
Reply to Entry to a career in deep learning by No-Celebration6994
I recently graduated with my computer science master's degree and got a job as an ML engineer. I am not interested in pure data science only.
Here is how I did it:
- Realise that university doesn't teach the necessary knowledge, so I did loads of courses and read books to get a foundation.
- With that knowledge, I was able to get a student job in the ML field. Finding that job took many hours of research on job portals.
- I had a study project that I was allowed to turn into my master's thesis, which included object detection.
- End of university: be prepared to relocate. I almost never found an interesting (entry-level) job in my hometown. It isn't easy to find an entry job, as you said, but they exist. Basically, search on every available job platform. Do your homework and write an appealing application. If you get invited, be relaxed and steer the conversation towards topics you know well. But most importantly, I realised it isn't worth taking a job if you don't vibe with the people you speak to during the interview. You can learn your craft, but it is incredibly hard to work with people you don't like.
Basically, you spend 2-3 months researching and applying to EVERY job you can find that doesn't sound like they are searching for a unicorn. The best job descriptions are rather short and precise. If they are searching for a senior, don't immediately move on: it is very hard for companies to find a senior. We lost a colleague recently, and we already know that we won't find a replacement for at least a year... and as you might guess, the position got filled by a junior.
allanmeter t1_j7ytp7i wrote
Reply to comment by suflaj in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
This is really good advice! Preprocessing input data for both training and inference is the best route to efficiency. Don't feed it a crazy large multidimensional dataset; try to break it up, and have a look at whether you can use old-fashioned windowing and downsampling methods.
The model's parameter type is important too. If you're running fp64 you will struggle versus a model that's just int8. If you have mixed-precision weights, you really need to think about looking at AWS SageMaker and getting a pipeline going.
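To make the "break it up" idea concrete, here is a minimal NumPy sketch of downsampling and windowing a large array before it ever reaches the model (the array shapes, window size, and stride are illustrative assumptions, not from the thread):

```python
import numpy as np

def downsample_2x(x: np.ndarray) -> np.ndarray:
    """Average-pool a 2D array by a factor of 2 along both axes."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]  # trim odd edges so the reshape below is exact
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def sliding_windows(x: np.ndarray, size: int, stride: int):
    """Yield fixed-size tiles instead of feeding one huge array."""
    for i in range(0, x.shape[0] - size + 1, stride):
        for j in range(0, x.shape[1] - size + 1, stride):
            yield x[i:i + size, j:j + size]

big = np.random.rand(1024, 1024).astype(np.float32)
small = downsample_2x(big)                       # (512, 512)
tiles = list(sliding_windows(small, 128, 128))   # 16 non-overlapping tiles
```

Each tile can then be batched through the model independently, which keeps the per-step memory footprint bounded regardless of the original input size.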
To OP, maybe you can share a little context on what models you’re looking to run? Or input data context.
allanmeter t1_j7yt39v wrote
Reply to comment by one_eyed_sphinx in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
Threadripper and Epyc purely to maximise your access to L3 cache as well. Yes lanes and cores are important too. TR and Epyc really are well engineered chips to handle sustained compute or memory optimised workloads too.
Some models use multiple GPUs with a strategy that copies data; others segment layers and minimise copies of data. So have a look at the distribution strategies being used and how the models support them. Some models even use the CPU as a collation node to merge split datasets and weights; I've rarely seen those perform well, as they're usually highly optimised with deep layers.
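For what it's worth, the two strategies can be sketched in a few lines of plain PyTorch (toy layer sizes; real multi-GPU code would use `DistributedDataParallel` or explicit `.to(device)` placement instead of CPU tensors):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 32)  # a batch of 8 samples

# Strategy 1: data parallelism - replicate the model, split the batch.
# In practice each shard runs on its own GPU against a model copy.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
shards = x.chunk(2)
outs = [model(s) for s in shards]
data_parallel_out = torch.cat(outs)

# Strategy 2: model parallelism - segment layers across devices, so
# only activations (not whole data copies) flow between segments.
stage1 = nn.Sequential(model[0], model[1])  # would live on cuda:0
stage2 = model[2]                           # would live on cuda:1
model_parallel_out = stage2(stage1(x))

# Both strategies compute the same function.
assert torch.allclose(data_parallel_out, model_parallel_out, atol=1e-6)
```

The trade-off is communication: data parallelism syncs gradients per step, while layer segmentation moves activations across the device boundary every forward pass.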
Lastly, there's no real golden ratio for RAM, VRAM, and swap; let the OS handle it, provide as much as you can, and bias towards random IOPS as the measure.
Also, please keep an eye on nvidia-smi: use `watch -n 1 nvidia-smi` to monitor power draw, utilisation, and temperature. You might go the exotic route and explore water cooling; otherwise make sure there is ample room to get cool air flowing through.
Best of luck, keep at it.
one_eyed_sphinx OP t1_j7yrvl2 wrote
Reply to comment by allanmeter in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
What's your recommendation for VRAM:RAM:NVMe ratios?
suflaj t1_j7yr906 wrote
Reply to comment by one_eyed_sphinx in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
If GPU memory is the bottleneck, then there is nothing you can viably do about that. If your GPU can't load memory any faster, you will need to get more rigs and GPUs if you want to speed up the loading in parallel.
Or you could try to quantize your models into something smaller that can fit in the memory, but then we're talking model surgery, not hardware.
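As one concrete flavour of that model surgery, PyTorch's dynamic quantization converts `nn.Linear` weights from fp32 to int8 on CPU (toy model and sizes below; the point is the roughly 4x shrink of the quantized weight storage):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Weights of the listed module types are stored as int8;
# activations are quantized dynamically at runtime.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
out = quantized(x)  # same interface, smaller weights
```

Dynamic quantization targets CPU inference; shrinking a model for GPU memory usually means fp16/bf16 casting or a dedicated int8 inference runtime instead, so treat this as a sketch of the general idea.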
one_eyed_sphinx OP t1_j7yr8st wrote
Reply to comment by ThomasBudd93 in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
Some people seem to connect it to the AMD processors and motherboards. Do you think that's the reason?
Nvidia is known to downgrade their gaming GPUs so people will buy the professional ones.
one_eyed_sphinx OP t1_j7yqh5v wrote
Reply to comment by suflaj in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
>NVME
Yeah, the GPU memory is a horrible bottleneck. I am trying to find ways around it, but it doesn't seem like there are many best practices for it. Is there a way to use pinned memory for faster model data transfer?
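(For reference, the usual pinned-memory pattern in PyTorch looks like the sketch below; the dataset and sizes are made up, and `pin_memory=True` is simply ignored with a warning when no CUDA device is present:)

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.randn(256, 32), torch.randint(0, 10, (256,)))

# pin_memory=True allocates host-side batches in page-locked RAM,
# which enables faster, asynchronous host-to-GPU copies.
loader = DataLoader(data, batch_size=64, pin_memory=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
for xb, yb in loader:
    # non_blocking=True only overlaps the copy when the source is pinned
    xb = xb.to(device, non_blocking=True)
    yb = yb.to(device, non_blocking=True)
    break
```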
one_eyed_sphinx OP t1_j7ypyu3 wrote
Reply to comment by allanmeter in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
A Threadripper at minimum? Are you saying this because of the number of lanes or the number of cores?
Can you elaborate on "Assuming you have a handle on data vs model distribution strategy"?
junetwentyfirst2020 t1_j7y3cro wrote
Reply to Entry to a career in deep learning by No-Celebration6994
What do you want to be doing exactly at this job? It's a semi-broad field, even with the specification of computer vision. I've usually seen computer vision broken down into capture, perception, and 3D reconstruction.
Deep learning usually happens in the capture and perception parts of the pipeline, because 3DR is geometry and linear algebra.
Is this what you want?
agentfuzzy999 t1_j7vu3c6 wrote
Reply to Entry to a career in deep learning by No-Celebration6994
I:
- got a job writing production code
- graduated with a bachelors
- just got another job writing production code
In the span of a year.
I think being involved in multi-person DL projects/competitions/papers to put on your resume, and having good interviewing skills is genuinely more important than knowing the latest and greatest models and algorithms. It’s the same as the rest of the SWE field.
enterthesun t1_j7uqcx7 wrote
Reply to comment by levand in Is there any AI-distinguishing models? by Such_Share8197
You're correct that the state of the art will be close, but that doesn't mean a detector cannot train and predict on generated data. It's like using synthetic data.
allanmeter t1_j7u6v50 wrote
Reply to comment by one_eyed_sphinx in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
Yes, the RAM-to-VRAM transfer is not as crazy important as you think. We previously hit this issue with the 3000 series as well, and as a result we supplemented with a full TB of RAM, but it still was not enough. Some models are incredibly greedy.
If you are on Linux, which is highly encouraged, also look to optimise your storage tier for swap memory, which is similar to the pagefile on Windows. You can define and mount extended swap disks, which you can trick out with multi-TB NVMe drives. It's not the same performance as RAM, but it's a last-step optimisation before you need to consider going to a Quadro.
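A sketch of that extended-swap setup on a typical Linux box (the path and size are illustrative, this needs root, and the filesystem must support `fallocate`):

```shell
# Create and enable a 64 GB swap file on an NVMe mount
sudo fallocate -l 64G /mnt/nvme/swapfile
sudo chmod 600 /mnt/nvme/swapfile
sudo mkswap /mnt/nvme/swapfile
sudo swapon /mnt/nvme/swapfile
swapon --show   # verify the new swap device is active
```

Add a matching line to /etc/fstab if you want the swap file to survive reboots.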
allanmeter t1_j7u6egg wrote
You will need to be looking at Epyc or at a minimum Threadripper. I would highly encourage ECC memory if possible.
Assuming you have a handle on data vs model distribution strategy, you will need fast and ample RAM to help with data loading/offloading as you have correctly pointed out.
If you're in North America, plenty of choices are available to you; elsewhere in the world you will have to seek combinations out selectively, as stock is always an issue.
suflaj t1_j7u2qyt wrote
Reply to comment by one_eyed_sphinx in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
You want eco mode to run cooler and more efficiently. As I said, the bottleneck is in the GPU, specifically its memory bandwidth, not in whatever the CPU can transfer. Modern CPUs can easily handle three high-end GPUs at the same time, not just two.
PCI speed has not been a bottleneck for several years, and will probably never be a bottleneck again with this form factor of GPUs. The GPU MEMORY is the bottleneck nowadays.
EDIT: And as someone else has said, yeah, you can use fast NVMe drives as swap to avoid loading from disk. There used to be Optane for this kind of stuff, but well, that's dead.
ThomasBudd93 t1_j7u0ttw wrote
Reply to comment by one_eyed_sphinx in what cpu and mother board is best for Dual RTX 4090? by one_eyed_sphinx
Someone else directed me here:
https://discuss.pytorch.org/t/ddp-training-on-rtx-4090-ada-cu118/168366/12
I didn't read the whole thread, and there is more in the NVIDIA forum (links can be found in the PyTorch forum thread above). At least a few weeks ago, it looked like multi-GPU training doesn't work fully on the RTX 4090, whereas it does on the RTX 6000 Ada. Not sure if this is intended or just a bug. I called a company here in Germany, and they even stopped selling multi-RTX-4090 deep learning computers because of this. I asked them about the multi-GPU benchmark I saw from Lambda Labs, and they replied that they reproduced it but found that the training only produced NaNs. This is all I know. If you find out more, could you share it here? :) Thanks!
BellyDancerUrgot t1_j8111wa wrote
Reply to My Machine Learning: Learning Program by Learning_DL
It would depend from person to person, but IMO, considering you have it planned out, you are likelier to succeed than not if you stick with what you came up with.