Recent comments in /f/deeplearning
LesleyFair OP t1_j6388vx wrote
Reply to comment by loopuleasa in ⭕ What People Are Missing About Microsoft’s $10B Investment In OpenAI by LesleyFair
That is true. However, I would argue they still made a considerable shift towards "closedness" from where they came from. Would you agree?
loopuleasa t1_j633y6s wrote
Good read, but OpenAI was never about being open source or anything.
The name was just marketing.
CatalyzeX_code_bot t1_j6331vr wrote
Found relevant code at https://github.com/nvidia/megatron-lm + all code implementations here
Infamous_Age_7731 OP t1_j630pnc wrote
Reply to comment by GPUaccelerated in Cloud VM GPU is much slower than my local GPU by Infamous_Age_7731
Oh I see, thanks, that fits my case then!
FastestLearner t1_j630p88 wrote
The thing with me is that I started with TensorFlow v1 back when PyTorch wasn't even in the race, and because of the constant breaking changes to the TensorFlow API and cryptic error messages, my experience was hellish TBH. Even getting support from Stack Overflow was messed up, because people would be posting solutions for different API versions. Then PyTorch got released, and boy was it the savior I needed. It literally saved me hundreds of hours of debugging (and possibly from a brain hemorrhage too). Compared to the burning hell TF1 was, PT was like coding on a serene beach. And then TensorFlow v2 came out with eager execution, which promised the PyTorch way of doing things. But then the question is: why switch if it is the same as PyTorch? And so I didn't.
I'm coming from a research point of view. If I were coming from a production POV, things could've been different.
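For anyone who never suffered through TF1, here's a minimal sketch of the difference eager execution made (reconstructed with TF2's compat shim, not my actual code from back then):

```python
import tensorflow as tf

# TF1 style: build a static graph first. These are symbolic nodes, not values.
tf.compat.v1.disable_eager_execution()
a = tf.compat.v1.placeholder(tf.float32)
b = tf.compat.v1.placeholder(tf.float32)
c = a + b

# Only a Session actually computes anything, so errors surface far from
# the Python line that caused them.
with tf.compat.v1.Session() as sess:
    print(sess.run(c, feed_dict={a: 2.0, b: 3.0}))  # 5.0

# Eager style (PyTorch always, TF2 by default): values exist immediately,
# so you can print and debug like normal Python.
import torch
print(torch.tensor(2.0) + torch.tensor(3.0))  # tensor(5.)
```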
trajo123 t1_j62yj1s wrote
It's a convention from linear algebra: Ax + B, where B is a vector of real numbers, positive or negative.
What makes you feel that subtracting is in any way more meaningful?
Blasket_Basket t1_j62eegq wrote
All subtraction is just addition of a negative number. The model will learn a value for the bias parameter, which can be anywhere between negative infinity and infinity.
sEi_ t1_j6264vy wrote
If you add a negative number, then you are effectively subtracting, if you get my point.
AFAIK, adding or subtracting the bias therefore doesn't matter.
BellyDancerUrgot t1_j620nyw wrote
Reply to comment by SometimesZero in best deep learning reference by Reasonable-Ball9018
Fair enough. But I think the book is chef's kiss, and Andrew Ng's courses with the book as a reference strike a perfect balance of easy, moderate, and hard topics to study.
Zealousideal-Copy463 OP t1_j61j6n2 wrote
Reply to comment by incrediblediy in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
I was checking Marketplace and couldn't find any used ones below $1,500. Also, I just discovered that a 3090 is $2.2k here now lol (that would be the cheapest option)... meanwhile at Best Buy it costs $1k, so I was just thinking about traveling to the US with the other k lol
incrediblediy t1_j61evor wrote
Reply to comment by Zealousideal-Copy463 in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
Ah! I am also not from the USA. I got my used 3090 for ~US$900; it could be cheaper now. The 3090 & 4090 have the same VRAM (24 GB).
GPUaccelerated t1_j60zhrx wrote
It's simply because the 3080 Ti is actually a faster GPU than the A100. The A100 exists to fit large models in memory without having to parallelize across multiple cards. *For most cases.*
v2thegreat t1_j60fvmd wrote
Reply to comment by Zealousideal-Copy463 in Best cloud to train models with 100-200 GB of data? by Zealousideal-Copy463
Well, there are EC2 instances that are already set up. How often do you do this sort of thing? It might be justified to build your own home setup, but as someone who does that myself, I can tell you it's kinda tedious and you end up being your own IT guy.
suflaj t1_j5zlq6k wrote
Aside from what others have mentioned, let's assume that we don't have a symmetrical situation, i.e. that the range of the function we're learning, as well as the domain of the weights and biases, is [0, ∞). Then it makes more sense to add the bias than to subtract it, as that leads to smaller weights and less chance of overflow or exploding gradients.
It makes more sense to subtract the bias if, in the scenario described above, you want a more expressive layer at the cost of numerical stability: a subtractive bias allows the weights to be of greater magnitude, which in turn gives you more effective range for the weights.
But note that neural networks are not trained with integer weights, and some libraries don't even implement autograd for integer types.
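You can check that last point quickly in PyTorch, for example (a throwaway snippet of mine, assuming a recent torch build):

```python
import torch

# Gradient tracking works only for floating point (and complex) dtypes.
w = torch.tensor([1.0], requires_grad=True)  # fine: float dtype
try:
    b = torch.tensor([1], dtype=torch.int64, requires_grad=True)
except RuntimeError as err:
    # "Only Tensors of floating point and complex dtype can require gradients"
    print(err)
```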
OutrageousSundae8270 t1_j5zf7ko wrote
Reply to comment by perrohunter in Which is your go to framework for deep learning, in python by V1bicycle
PyTorch is great; it's honestly much easier to use than TensorFlow, especially for beginners. TensorFlow, however, offers everything PyTorch does through heavy use of object-oriented design (primarily inheritance).
The functional model in TensorFlow is very similar to the default way of instantiating models in PyTorch. TensorFlow has many convenience wrappers, but it also gives you the full freedom that PyTorch does, provided you can deal with the nuances and complexities of object-oriented design and refer heavily to the documentation.
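To make the comparison concrete, here's a minimal side-by-side sketch (my own toy example; the layer sizes are arbitrary):

```python
import tensorflow as tf
import torch
import torch.nn as nn

# TensorFlow functional API: wire layers together like function calls.
inputs = tf.keras.Input(shape=(784,))
hidden = tf.keras.layers.Dense(128, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(10)(hidden)
tf_model = tf.keras.Model(inputs=inputs, outputs=outputs)

# PyTorch default style: subclass nn.Module and define forward().
class TorchModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

torch_model = TorchModel()
```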
perrohunter t1_j5zekuw wrote
Reply to comment by OutrageousSundae8270 in Which is your go to framework for deep learning, in python by V1bicycle
That, but PyTorch has also matured a lot. We tried to switch in 2018 and deploy the shiny new model to production, but back then PyTorch had terrible performance. That's not the case now; it has matured, and I think it will go further on the shoulders of the community.
OutrageousSundae8270 t1_j5ze525 wrote
Reply to comment by perrohunter in Which is your go to framework for deep learning, in python by V1bicycle
TensorFlow versus PyTorch is a matter of taste IMO. I just like TensorFlow because I find it more intuitive than PyTorch, but everyone is different.
perrohunter t1_j5z8juy wrote
I've used TensorFlow since 2016, but I grew tired of it. It's not really open source; it's just Google sharing their tool. So I switched to PyTorch, as they iterate faster and care more about the community.
mrdevlar t1_j5z7mnn wrote
They are the same thing.
> 1 + (-1) = 0
> 1 - 1 = 0
incrediblereddit t1_j5yyqnn wrote
Reply to comment by ArthurLCTTheCool in Why add bias instead of subtracting bias? by ArthurLCTTheCool
While training the model, the bias will be learned accordingly. Suppose the ideal bias value is +x. If you subtract the bias instead, the learned value will simply become -x, so overall w - (-x) becomes w + x, and vice versa.
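A toy sketch to see it happen (my own example, made-up numbers): fit the same data once with w*x + b and once with w*x - b. Both recover the same fit, and the subtracted bias just comes out with the opposite sign.

```python
import torch

torch.manual_seed(0)
x = torch.randn(256)
y = 2.0 * x + 5.0  # true weight 2, true bias +5

def fit(sign):
    """Fit y ~ w*x + sign*b; sign=+1 adds the bias, sign=-1 subtracts it."""
    w = torch.zeros(1, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.SGD([w, b], lr=0.1)
    for _ in range(500):
        opt.zero_grad()
        loss = ((w * x + sign * b - y) ** 2).mean()
        loss.backward()
        opt.step()
    return w.item(), b.item()

print(fit(+1.0))  # ~ (2.0, +5.0)
print(fit(-1.0))  # ~ (2.0, -5.0): same fit, bias sign flipped
```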
ArthurLCTTheCool OP t1_j5yuwla wrote
Reply to comment by incrediblereddit in Why add bias instead of subtracting bias? by ArthurLCTTheCool
Yeah, I get that, but what if you add a positive bias?
nibbajenkem t1_j5yuece wrote
Doesn't matter. The bias can be negative if that is what the model learns.
incrediblereddit t1_j5yube5 wrote
Reply to comment by BrendanKML in Why add bias instead of subtracting bias? by ArthurLCTTheCool
Yes. Adding a negative bias is the same as subtracting, so to speak; similarly, subtracting a negative bias is like adding. Ultimately it doesn't make a difference.
skeerp t1_j63kd2s wrote
Reply to ⭕ What People Are Missing About Microsoft’s $10B Investment In OpenAI by LesleyFair
Thanks for the thorough post!