Recent comments in /f/deeplearning
eternal-abyss-77 OP t1_iwfpfz7 wrote
Reply to comment by arhetorical in Can someone explain me the math behind this paper and tell me whether the way I have understood this paper is right or not? by eternal-abyss-77
Sir, firstly thanks for responding.
I have already implemented this as a working program. But now I am enhancing it, and I have a feeling that I am somehow missing something from the paper and not understanding it properly.
For example:
The equations [2, 4, 6, 8, 10, 15-18] on pages 3, 4 and 5
The training of the model on the generated features with Linear LSE, mentioned on pages 6-7
And Section B, "Local pixel difference descriptor", para 2 regarding directions, and its related figures, Figure 3(a, b).
If you can explain these things, I can check my understanding against your explanation and ask my questions w.r.t. my present work on this paper, with code.
arhetorical t1_iwfoxgk wrote
Reply to Can someone explain me the math behind this paper and tell me whether the way I have understood this paper is right or not? by eternal-abyss-77
Did you mean to post an explanation of your understanding of the paper?
eternal-abyss-77 OP t1_iwfn0qt wrote
Reply to comment by sEi_ in Can someone explain me the math behind this paper and tell me whether the way I have understood this paper is right or not? by eternal-abyss-77
Check Now
sEi_ t1_iwfmwgh wrote
Reply to Can someone explain me the math behind this paper and tell me whether the way I have understood this paper is right or not? by eternal-abyss-77
Bad link. FFS, ALWAYS check that the links you post anywhere work! I'm a senior webdev and know the importance of always checking posted links. I learned the hard way.
I can, with wizardry, deduce the right URL from the bad double URL, but it would be better if the link were corrected so anyone can view what you posted.
EDIT: If you cannot edit the OP, then delete it and make a new, better one.
EDIT2: I need to register to read the paper!
sckuzzle t1_iwaia67 wrote
Reply to comment by HMasterSunday in Making a model predict on the basis of a particular value by ole72444
> As per your other point though, my code does account for that already
Have you tried running it? It returns [3.0, 8.0, 12.0, 8.0]. The intended output is [False, False, True, False]. OP didn't ask for the array to be split into groups of four; they asked for every fourth value to be taken.
ole72444 OP t1_iwa5ljr wrote
Reply to comment by ContributionWild5778 in Making a model predict on the basis of a particular value by ole72444
I'm trying to see if NNs can actually generalise such functions. I'm using the preprocessing you've recommended to create the ground-truth labels.
ole72444 OP t1_iwa5adf wrote
Reply to comment by sckuzzle in Making a model predict on the basis of a particular value by ole72444
Yes, I understand it looks like a data preprocessing problem (and it actually is). But this is a toy example to demonstrate whether NNs can actually generalise functions of this sophistication.
BugSlayerJohn t1_iwa32dc wrote
Reply to comment by Thijs-vW in Update an already trained neural network on new data by Thijs-vW
First of all, you don't want an identical or nearly identical weight matrix. You won't achieve that and you don't need to. In principle a well designed model should NOT make radically different predictions when retrained, particularly with the same data, even though the weight matrices will certainly differ at least a little and possibly a lot. The same model trained two different times on the same data with the same hyperparameters will generally converge to nearly identical behaviors, right down to which types of inputs the final model struggles with. If you have the original model, original data, and original hyperparameters, definitely don't be frightened to retrain a model.
If your use case requires you to reason strongly about similarity of inference, you could filter your holdout set for the inputs that both models should accurately predict, run inference for that set against both models, and prepare a small report indicating the similarity of predictions. This should ordinarily be unnecessary, but since it sounds like achieving this similarity is a point of concern, it would let you measure it, if for no other purpose than to assuage fears. You should expect SOME drift in similarity: the different versions won't be identical. If the similarity is not as high as you'd like, consider manually reviewing a list of inputs that the two models predicted differently, to confirm how often the difference really is undesirable.
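A minimal sketch of that report (assuming sklearn-style models with a predict method returning 1-D arrays; all names here are placeholders):
import numpy as np

def prediction_similarity(model_a, model_b, holdout_inputs):
    # Run both model versions on the same filtered holdout inputs.
    preds_a = np.asarray(model_a.predict(holdout_inputs))
    preds_b = np.asarray(model_b.predict(holdout_inputs))
    # Tolerance-based match, so continuous outputs aren't compared exactly.
    agree = np.isclose(preds_a, preds_b)
    # Fraction of inputs on which the two versions agree, plus the
    # indices where they disagree, for manual review.
    agreement = float(np.mean(agree))
    disagreements = np.flatnonzero(~agree)
    return agreement, disagreements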
HMasterSunday t1_iw9qr8l wrote
Reply to comment by sckuzzle in Making a model predict on the basis of a particular value by ole72444
Interesting; I didn't do a test run to time both approaches. I'll do that more often. As per your other point though, my code does account for that already: the number of individual cuts is 1/4 of the length of the full array (len(input_array)/4), so it splits it up into arrays of length 4 anyway. That much I do know at least.
sckuzzle t1_iw9fe1i wrote
Reply to comment by HMasterSunday in Making a model predict on the basis of a particular value by ole72444
Writing "short" code isn't a always good thing. Yes your suggestion has less lines, but:
-
It takes ~6 times as long to run
-
It does not return the correct output (split does not take every nth value, but rather groups it into n groups)
I'm absolutely not claiming my code was optimized, but it did clearly show the steps required to calculate the necessary output, so it was easy to understand. Writing "short" code is much more difficult to understand what is happening, and often leads to a bug (as seen here). Also, depending on how you are doing it, it often takes longer to run (the way it was implemented requires it to do extra steps which aren't necessary).
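If you want to reproduce the timing comparison, here's a rough harness using the stdlib timeit (both functions are condensed stand-ins for the two versions posted in this thread; the ~6x figure was from my run and will vary):
import timeit
import numpy as np

def process_data_loop(a):
    # Condensed version of the explicit-loop implementation in this thread.
    every_fourth = [a[i] for i in range(len(a)) if (i + 1) % 4 == 0]
    return np.array(every_fourth) == max(every_fourth)

def process_data_split(a):
    # Condensed version of the np.split implementation in this thread.
    return [max(cut) for cut in np.split(a, len(a) // 4)]

data = np.arange(16, dtype=float)
print(timeit.timeit(lambda: process_data_loop(data), number=100000))
print(timeit.timeit(lambda: process_data_split(data), number=100000))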
[deleted] t1_iw9aoiv wrote
Reply to comment by HMasterSunday in Making a model predict on the basis of a particular value by ole72444
[deleted]
HMasterSunday t1_iw9allv wrote
Reply to comment by sckuzzle in Making a model predict on the basis of a particular value by ole72444
also: numpy.split can make several cuts in a numpy array, so it simplifies to:
import numpy as np

def process_data(input_array):
    # Split into len(input_array)/4 equal chunks (of length 4 each).
    cut_array = np.split(input_array, len(input_array) // 4)
    max_array = []
    for cut in cut_array:
        max_array.append(max(cut))
    return max_array
much shorter, used this method recently so it's on the front of my mind
edit: don't know how to format on here, sorry
ContributionWild5778 t1_iw99ccg wrote
Lots and lots of training my frand
ContributionWild5778 t1_iw98mo2 wrote
Reply to comment by Thijs-vW in Update an already trained neural network on new data by Thijs-vW
If you want to re-train the whole model on the mixed dataset, the only option I can think of is transfer learning, where you initialise all the parameters to the values learned on the old dataset and re-train from the 0th epoch.
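A minimal Keras-style sketch of that warm start (the model file and the random arrays are placeholders for the previously trained model and the combined old + new data):
import numpy as np
import tensorflow as tf

# Load the old model; its learned weights become the initialisation.
model = tf.keras.models.load_model("old_model.h5")

# Placeholder mixed dataset standing in for old + new samples combined.
mixed_inputs = np.random.rand(1000, 10)
mixed_targets = np.random.rand(1000, 1)

# Re-train from the 0th epoch on the mixed data.
model.compile(optimizer="adam", loss="mse")
model.fit(mixed_inputs, mixed_targets, epochs=20)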
ContributionWild5778 t1_iw97xid wrote
Reply to comment by RichardBJ1 in Update an already trained neural network on new data by Thijs-vW
I believe it is an iterative process when doing transfer learning. First, you will usually freeze the top (input-side) layers, because low-level feature extraction is done there (extracting lines and contours). Then unfreeze the last layers and try to train only those, where high-level features are extracted. At the same time, it also depends on how different the new dataset is from the one the model was trained on. If it has similar characteristics/features, freezing the top layers would be my choice.
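A rough Keras-style sketch of that freezing scheme (the model file and the layer split are arbitrary placeholders):
import tensorflow as tf

model = tf.keras.models.load_model("pretrained_model.h5")

# Freeze everything except the last few layers, where the
# high-level features are extracted; fine-tune only those.
for layer in model.layers[:-3]:
    layer.trainable = False
for layer in model.layers[-3:]:
    layer.trainable = True

# Re-compile so the trainable flags take effect, then fine-tune.
model.compile(optimizer="adam", loss="mse")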
ContributionWild5778 t1_iw95dyj wrote
Reply to comment by sckuzzle in Making a model predict on the basis of a particular value by ole72444
Agreed. It's a data pre-processing step; making a model do this would be a very complicated task.
FuB4R32 t1_iw8kirz wrote
Reply to comment by sckuzzle in Making a model predict on the basis of a particular value by ole72444
You could also do this at the input if it's hard to edit the training data, e.g. in tensorflow https://www.tensorflow.org/api_docs/python/tf/gather
https://www.tensorflow.org/api_docs/python/tf/math/argmax
Generally, you should look into custom operations like these to achieve what you want.
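A quick sketch of what that could look like (the values are made up):
import tensorflow as tf

x = tf.constant([1., 0., 2., 3., 5., 1., 7., 8., 9., 2., 6., 12., 4., 0., 3., 8.])

# Take every fourth value (indices 3, 7, 11, ...) with tf.gather.
idx = tf.range(3, tf.shape(x)[0], delta=4)
every_fourth = tf.gather(x, idx)

# Mark the positions equal to the max; tf.math.argmax would instead
# give the index of the (first) maximum.
is_max = tf.equal(every_fourth, tf.reduce_max(every_fourth))
print(is_max)  # [False False True False]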
sckuzzle t1_iw8jv72 wrote
Why are you using a "model" / MLPs at all for this? This is strictly a data-processing problem, with no model creation required.
Just process your data by throwing away 75% of it, then take the max, then check if each value is equal to the maximum.
Something like (python):
import numpy as np

def process_data(input_array):
    every_fourth = []
    for i in range(len(input_array)):
        if (i + 1) % 4 == 0:
            every_fourth.append(input_array[i])
    max_value = max(every_fourth)
    matching_values = (np.array(every_fourth) == max_value)
    return matching_values
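For example, with a made-up input whose every-fourth values are 3, 8, 12, 8:
data = [1, 0, 2, 3, 5, 1, 7, 8, 9, 2, 6, 12, 4, 0, 3, 8]
print(process_data(data))  # [False False  True False]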
Emotional-Fox-4285 OP t1_iw74o6z wrote
Reply to comment by crisischris96 in In my deep NN with 3 layer, . In the second iteration of GD, The activation of Layer 1 and Layer 2 output all 0 due to ReLU as all the input are smaller than 0. And L3 output some value with high floating point which is opposite to first forward_ propagation . Is this how it should work ? by Emotional-Fox-4285
Yes... but the NN still doesn't work... Can you check my code for me?
crisischris96 t1_iw740rt wrote
Reply to In my deep NN with 3 layer, . In the second iteration of GD, The activation of Layer 1 and Layer 2 output all 0 due to ReLU as all the input are smaller than 0. And L3 output some value with high floating point which is opposite to first forward_ propagation . Is this how it should work ? by Emotional-Fox-4285
Have you tried a leaky relu?
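It might help here: leaky ReLU keeps a small slope for negative inputs, so layers whose pre-activations are all negative still pass a gradient instead of dying. A minimal numpy sketch (alpha = 0.01 is just a common default):
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Negative inputs are scaled by alpha instead of clamped to 0.
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # The gradient is alpha (not 0) for negative inputs, so units can recover.
    return np.where(x > 0, 1.0, alpha)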
RichardBJ1 t1_iw733qt wrote
Reply to comment by jobeta in Update an already trained neural network on new data by Thijs-vW
Yes… obviously if a model only had two layers, freezing them both would be asinine! There is a Keras blog on it; I do not know why those particular layers (TL;DR). It doesn't say top and bottom, that's for sure. … I agree it would be nice to have a principled method for choosing which layers to freeze rather than an arbitrary one. I guess visualising layer output might help you choose in a small model, but I've never tried that. So I do have experience of trying transfer learning, but (apart from tutorials) no experience of success with transfer learning!
jobeta t1_iw7228e wrote
Reply to comment by RichardBJ1 in Update an already trained neural network on new data by Thijs-vW
It seems intuitive that, if possible, fully retraining will yield the best results, but it can be costly. I just find it surprising to arbitrarily freeze two layers. What if your model only has two layers anyway? Again, I don't have experience, so I'm just guessing.
RichardBJ1 t1_iw71rpv wrote
Reply to comment by jobeta in Update an already trained neural network on new data by Thijs-vW
Good question; I do not have a source for that, I have just heard colleagues saying it. Obviously the reason for freezing layers is that we are trying to avoid losing all the information we have already gained; it should also speed up further training by reducing the number of trainable parameters. As to WHICH layers are best preserved, I don't know. When I have read up on it, people typically say "it depends". But actually my point was that I have never found transfer learning to be terribly effective (apart from years ago when I ran a specific transfer-learning tutorial!). My models only take a few days to train from scratch, so that is what I do! Transfer learning obviously makes enormous sense if you are working with someone else's extravagantly trained model and maybe don't even have the data. But in my case I always do have all the data…
jobeta t1_iw701lx wrote
Reply to comment by RichardBJ1 in Update an already trained neural network on new data by Thijs-vW
Why freeze bottom and top layers?
eternal-abyss-77 OP t1_iwfpjtm wrote
Reply to comment by sEi_ in Can someone explain me the math behind this paper and tell me whether the way I have understood this paper is right or not? by eternal-abyss-77
Reply to EDIT2: that's why I sent you the previous link, which is from sci-hub.se