Recent comments in /f/deeplearning

_rundown_ t1_jbli9rg wrote

Reply to comment by Bielh in this is reality. by Genius_feed

This reminds me of when I was in grade school and we had a sub for the day because my teacher was “taking a class on how to use the World Wide Web”

3

neuralbeans t1_jbjizpw wrote

It's a degenerate case, not something anyone should do. If you include Y in your input, then overfitting will lead to the best generalisation. This shows that the input does affect overfitting. In fact, the more similar the input is to the output, the simpler the model can be and thus the less it can overfit.
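A rough sketch of the degenerate case, if it helps (purely illustrative; numpy and scikit-learn's LinearRegression are just my choice of tools here):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_noise = rng.normal(size=(200, 5))
y = rng.normal(size=200)
X = np.column_stack([X_noise, y])   # last feature is literally y itself

model = LinearRegression().fit(X, y)
print(model.coef_.round(3))         # ~ [0 0 0 0 0 1]: the model just copies the y column

# Unseen data: as long as the y column again equals the true target,
# the "overfit" solution transfers exactly.
y_new = rng.normal(size=50)
X_new = np.column_stack([rng.normal(size=(50, 5)), y_new])
print(np.allclose(model.predict(X_new), y_new))   # True
```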

1

BamaDane t1_jbjhitr wrote

I'm not sure I understand what your method does. If Y is the output, then you say I should also include Y as an input? And if I manage to design my model so it doesn't just select the Y input, then I'm not overfitting? It makes sense that it wouldn't overfit, but doesn't it also mean I'm dumbing down my model? Don't I want my model to preferentially select features that are most similar to the output?

2

neuralbeans t1_jbiu3io wrote

Yes, if the features include the model's target output, then overfitting would just result in the model outputting that feature as-is. Of course that's a useless solution, but the more similar the features are to the output, the less of a problem overfitting will be and the less data you would need to generalise.
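To illustrate the data-efficiency point (a toy setup of my own, assuming a single feature that is a noisy copy of the target and a scikit-learn linear regression):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

def test_r2(n_train, noise):
    # one feature = target + Gaussian noise; smaller noise = feature more similar to y
    y_tr = rng.normal(size=n_train)
    X_tr = (y_tr + noise * rng.normal(size=n_train)).reshape(-1, 1)
    y_te = rng.normal(size=1000)
    X_te = (y_te + noise * rng.normal(size=1000)).reshape(-1, 1)
    return LinearRegression().fit(X_tr, y_tr).score(X_te, y_te)

print(test_r2(n_train=5, noise=0.01))  # near 1.0: a near-copy feature generalises from 5 samples
print(test_r2(n_train=5, noise=1.0))   # much worse: a dissimilar feature needs far more data
```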

6

rkstgr t1_jbia52h wrote

First of all, beta_t is just a predefined variance schedule (in the literature often linearly interpolated between 1e-4 and 1e-2), and it defines the variance of the noise that is added at step t. What you have in (1) is the variance of the sample x_t, which does not have to be beta_t.

What does hold for large t is Var(x_t) ≈ 1, as the sample converges to a standard Gaussian with mean 0 and variance 1.
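A quick numerical check of both points, assuming a DDPM-style forward process (T=1000 and the schedule endpoints are just illustrative choices):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 1e-2, T)   # linear beta schedule: per-step noise variance
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)       # alpha_bar_t = prod_{s<=t} (1 - beta_s)

# Closed-form forward process: x_t = sqrt(alpha_bar_t)*x_0 + sqrt(1 - alpha_bar_t)*eps,
# so Var(x_t) = alpha_bar_t * Var(x_0) + (1 - alpha_bar_t), which is not beta_t.
var_x0 = 4.0                         # arbitrary data variance, deliberately not 1
var_xt = alpha_bar * var_x0 + (1.0 - alpha_bar)

print(var_xt[0])    # ~ var_x0: almost no noise added at the first step
print(var_xt[-1])   # ~ 1.0: for large t, x_t is approximately N(0, 1)
```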

1