Recent comments in /f/deeplearning
RemindMeBot t1_isfjwuw wrote
Reply to comment by obsoletelearner in Variational Autoencoder automatic latent dimension selection by grid_world
I will be messaging you in 6 hours on 2022-10-15 22:23:31 UTC to remind you of this link
obsoletelearner t1_isfjs7i wrote
!RemindMe 6 hours
incrediblediy t1_ise5vkn wrote
Are you trying to use FP16? On most GPUs it's the same speed as or faster than FP32, while FP64 is much slower.
e.g. RTX 3090 (https://www.techpowerup.com/gpu-specs/geforce-rtx-3090.c3622):

- FP16 (half) performance: 35.58 TFLOPS (1:1)
- FP32 (float) performance: 35.58 TFLOPS
- FP64 (double) performance: 556.0 GFLOPS (1:64)
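If you want to try FP16, a minimal mixed-precision training sketch in PyTorch (a toy model and random data stand in for your own; assumes a CUDA GPU):

```python
import torch
import torch.nn as nn

model = nn.Linear(512, 512).cuda()        # toy model, stands in for yours
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()      # rescales the loss so FP16 gradients don't underflow

for step in range(100):
    x = torch.randn(256, 512, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():       # runs ops in FP16 where safe, FP32 elsewhere
        loss = model(x).square().mean()
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```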
grid_world OP t1_isduwtm wrote
Reply to comment by The_Sodomeister in Variational Autoencoder automatic latent dimension selection by grid_world
Retraining the model with reduced dimensions would be a *rough* way of *proving* this. But the stochastic behavior of neural networks makes this hard to achieve.
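If you did want to try it, a rough sketch of the comparison; `train_vae` and `eval_loss` are hypothetical stand-ins for your own training and evaluation code, and averaging over seeds tames some of that stochasticity:

```python
import statistics

# train_vae(latent_dim, seed) and eval_loss(model) are hypothetical
# helpers standing in for your actual training/evaluation code.
for latent_dim in (2, 4, 8, 16, 32):
    losses = [eval_loss(train_vae(latent_dim, seed)) for seed in range(5)]
    print(latent_dim, statistics.mean(losses), statistics.stdev(losses))
```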
Troll_of_the_bridge OP t1_iscoee4 wrote
Reply to comment by Karyo_Ten in Training speed using 32 vs 64-bit floating point by Troll_of_the_bridge
I didn’t know this, thanks!
The_Sodomeister t1_isci8nh wrote
Reply to comment by grid_world in Variational Autoencoder automatic latent dimension selection by grid_world
This is the part I'm referencing:
> The unneeded dimensions don't learn anything meaningful and therefore remain a standard, multivariate, Gaussian distribution. This serves as a signal that such dimensions can safely be removed without significantly impacting the model's performance.
How do you explicitly measure and use this "signal"? I don't think you'd get far by just measuring "distance from Gaussian", as you'd almost certainly end up throwing away useful dimensions that simply appear "Gaussian enough".
> I didn’t get your point of looking at the decoder weights to figure out whether they are contributing? Do you compare them to their randomly initiated values to infer this?
If the model reaches this "optimal state" where certain dimensions aren't contributing to the decoder output, then you should be able to detect this with some form of sensitivity analysis - i.e. if those dimensions aren't being used, changing their values shouldn't affect the decoder output (see the sketch below).
This assumes that the model would correctly learn to ignore unnecessary latent dimensions, but I'm not confident it would actually accomplish that.
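A crude version of that sensitivity analysis might look like this sketch (the `encode`/`decode` method names are assumptions about the VAE's interface):

```python
import torch

@torch.no_grad()
def dimension_sensitivity(vae, x, eps=1.0):
    """Nudge each latent dimension in turn and measure how much the
    decoder output moves; a near-zero response suggests an unused dim."""
    mu, _ = vae.encode(x)              # use the posterior mean as the base point
    base = vae.decode(mu)
    sensitivities = []
    for d in range(mu.shape[1]):
        z = mu.clone()
        z[:, d] += eps                 # perturb one dimension only
        sensitivities.append((vae.decode(z) - base).abs().mean().item())
    return sensitivities
```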
grid_world OP t1_ischbjf wrote
Reply to comment by The_Sodomeister in Variational Autoencoder automatic latent dimension selection by grid_world
I don’t think that the Gaussians are being output by a layer. In contrast with an Autoencoder, where a sample is encoded to a single point, in a VAE, due to the Gaussian prior, a sample is now encoded as a Gaussian distribution. This is the regularisation effect which enforces this distribution in the latent space. It cuts both ways, meaning that if the true manifold is not Gaussian, we still assume and therefore force it to be Gaussian.
A Gaussian signal being meaningful is something that I wouldn’t count on. Diffusion models are a stark contrast, but we aren’t talking about them. The farther a signal is away from a standard Gaussian, the more information it’s trying to smuggle through the bottleneck.
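One way to make that distance concrete is the closed-form per-dimension KL term of a standard VAE; a minimal sketch, assuming the encoder outputs `mu` and `logvar`:

```python
import torch

@torch.no_grad()
def kl_per_dimension(mu, logvar):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)) for each latent
    dimension, averaged over the batch; values near zero suggest a
    dimension that has collapsed to the standard Gaussian prior."""
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0)
    return kl.mean(dim=0)
```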
I didn’t get your point of looking at the decoder weights to figure out whether they are contributing? Do you compare them to their randomly initiated values to infer this?
Karyo_Ten t1_iscf3fp wrote
There is no way you are using 64-bit on the GPU.
All the cuDNN code is 32-bit, for the very simple reason that non-Tesla GPUs have between 1/32 and 1/64 of the FP64 throughput of FP32.
See https://www.reddit.com/r/CUDA/comments/iyrhuq/comment/g93reth/
So under the hood, your FP64 stuff is converted to FP32 when sent to the GPU.
And on Tesla GPUs the ratio is 1/2.
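Easy to check on your own card; a rough matmul timing sketch in PyTorch (exact ratios will vary by GPU and matrix size):

```python
import time
import torch

def bench(dtype, n=4096, iters=20):
    a = torch.randn(n, n, dtype=dtype, device="cuda")
    b = torch.randn(n, n, dtype=dtype, device="cuda")
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        _ = a @ b
    torch.cuda.synchronize()           # wait for the async kernels to finish
    return (time.perf_counter() - start) / iters

print("FP32:", bench(torch.float32), "s/matmul")
print("FP64:", bench(torch.float64), "s/matmul")
```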
The_Sodomeister t1_iscenee wrote
*If* your hypothesis is true (and I don't have enough direct experience with VAEs to say for certain), then how would you distinguish the layers that are outputting approximately-Gaussian noise from the layers that are outputting meaningful signals? Who's to say that the meaningful signal doesn't also appear approximately Gaussian? Or at least sufficiently Gaussian that it's not easily distinguishable from the others.
While I wouldn't go so far as to say that your hypothesis "doesn't happen", I also know from personal experience with other networks that NN models will tend to naturally over-parameterize if you let them. Regularization methods don't usually prevent the model from utilizing extra dimensions when it is able to, and it's not always clear whether the model could achieve the same performance with fewer dimensions vs. whether the extra dimensions are truly adding more representation capacity.
If some latent dimensions truly aren't contributing at all to the meaningful encoding, then I would think you could more likely identify this from looking at the weights in the decoder layers (as they wouldn't be needed to reconstruct the encoded input). I don't think this is as easy as it sounds, but I find it more plausible than determining this information strictly from comparing the distributions of the latent dimensions.
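As a crude first look, something like this sketch (it assumes the decoder's first layer is a `torch.nn.Linear` applied to the latent vector):

```python
import torch

@torch.no_grad()
def latent_input_norms(first_decoder_layer):
    """Column d of the first decoder weight matrix holds all the weights
    that read latent dimension d, so a tiny column norm hints that the
    decoder is ignoring that dimension."""
    return first_decoder_layer.weight.norm(dim=0)
```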
Gengar218 t1_isby14s wrote
Reply to comment by felix_awesome in Is it legal to use Rick and Morty for a research project? by felix_awesome
Xiph is a non-profit organisation focusing on media compression. It's a dataset of lossless videos for testing purposes. I'm pretty sure they are safe to use. I'm not a lawyer, though.
Edit: Looks like Big Buck Bunny uses Creative Commons 3.0, not sure about the others.
https://peach.blender.org/about/
felix_awesome OP t1_isbwebi wrote
Reply to comment by Gengar218 in Is it legal to use Rick and Morty for a research project? by felix_awesome
Thanks, are these part of some dataset paper? These videos are okay to use for research purposes, right?
_Arsenie_Boca_ t1_isbgyhs wrote
If the hardware is optimized for it, there probably is not a huge difference in speed, but the precision gain from 64-bit is probably negligible too.
The real reason people don't use 64-bit is mainly memory usage. When you train a large model, you can fit much bigger 32-bit/16-bit batches into memory and thereby speed up training.
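The memory arithmetic is easy to sanity-check; a quick sketch in PyTorch:

```python
import torch

batch = torch.zeros(256, 3, 224, 224)       # a typical image batch
for dtype in (torch.float16, torch.float32, torch.float64):
    mib = batch.to(dtype).element_size() * batch.nelement() / 2**20
    print(dtype, f"{mib:.0f} MiB")          # ~74, ~147, ~294 MiB respectively
```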
suflaj t1_isbgg32 wrote
Probably because the startup overhead dominates the processing time. 500 weights is not really something you can apply to real life, as modern neural networks are 100+ million parameters even for consumer hardware, and they aren't run on datasets that are considered solved.
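A rough way to see that overhead effect on a GPU (illustrative numbers; assumes CUDA is available):

```python
import time
import torch

@torch.no_grad()
def mean_forward_time(model, x, iters=1000):
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        _ = model(x)
    torch.cuda.synchronize()                  # flush the async queue
    return (time.perf_counter() - start) / iters

tiny = torch.nn.Linear(50, 10).cuda()         # ~500 weights
big = torch.nn.Linear(2048, 2048).cuda()      # ~4M weights
print(mean_forward_time(tiny, torch.randn(32, 50, device="cuda")))
print(mean_forward_time(big, torch.randn(32, 2048, device="cuda")))
# the two times are often surprisingly close: kernel-launch overhead,
# not arithmetic, dominates at this scale
```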
sutlusalca t1_isbclm9 wrote
Story: You went to the supermarket and bought a bottle of milk. The next day, you went again and bought two bottles of milk. You just spent a few seconds more buying one extra bottle.
DJStillAlive t1_isaocrs wrote
You could probably get away with it under fair use, as I don't see it taking away from the value of the copyright and you appear to be using it for research purposes. The problem with fair use, though, is that it's decided AFTER the fact. The fact that it's a creative work and the volume of material you're using also work against you in this case. As u/soicyboii stated, asking them is your best bet if you insist on sticking with Rick and Morty.
spaceecon t1_isafid2 wrote
Reply to comment by soicyboii in Is it legal to use Rick and Morty for a research project? by felix_awesome
Yup, AS seems to be quite a cool studio; it might just happen.
Hiant t1_isa6yvq wrote
I'd try reaching out to the makers of cartoons produced for public television or through National Endowment for the Arts grants; they may be more charitable when it comes to licensing for research that will be for the public good.
suflaj t1_is9njd4 wrote
It's not legal. You can look for videos that are in the public domain or under permissive licenses, but I really doubt you're going to find 20-25 hours of the same "style" unless it's just real-life videos.
You can always take a camera, go outside, and record those videos yourself.
soicyboii t1_is9nf43 wrote
Maybe email them; you never know. I've gotten some unexpected responses from people who I thought would never reply.
Gengar218 t1_is9icwr wrote
I can suggest these videos:
https://media.xiph.org/
Big Buck Bunny is especially popular for video demonstration in my experience.
Not sure if this can fill 20-25 hours though.
garlicoillemonsalt t1_is9gwdi wrote
Probably not the answer you want to hear, but it is most likely not OK. Good conferences/journals would likely not want to publish your work, for fear of exposing themselves to copyright infringement claims.
Try looking at the datasets used by the prior-art papers you will be citing, and see if you can reuse those or other sets from the same sources. Generating a good-quality CV dataset is a challenge in itself; using an existing set not only saves you headaches in its creation but also makes it easier for people to evaluate your work against others'.
ledepression t1_is9fbu5 wrote
You might need to put a disclaimer
Constant-Cranberry29 OP t1_is8w1w2 wrote
Reply to comment by WildConsideration783 in how to find out the problem when want to do testing the model? by Constant-Cranberry29
I already added dropout but the result is still the same, and judging from the loss curve the model is not overfitting.
perfopt OP t1_is5q78g wrote
Reply to comment by kingfung1120 in Help regularization and dropout are hurting accuracy by perfopt
Coming back after a totally crazy week at work. Finally got time to spend on my project. I think I need to simplify my inputs and give MFCC another try before jumping into CNNs.