Recent comments in /f/deeplearning
Internal-Diet-514 t1_itwdhg2 wrote
Reply to Binary segmentation with imbalanced data by jantonio78
To start, I'd downsample the number of images that don't have any mass in them (or upsample the ones with mass) for the training data, while keeping an even balance in the test/validation sets. Others have said above that the loss function is better suited to seeing an even representation. This is an easy way to do it without writing a custom data loader, and you can see whether that's the problem before diving deeper.
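For example, a minimal sketch (assuming the training pairs are already in memory as (image, mask) numpy arrays, where a nonzero mask means the image contains a mass):

    import random

    # Split samples by whether the mask contains any mass (nonzero pixels).
    positives = [s for s in samples if s[1].any()]
    negatives = [s for s in samples if not s[1].any()]

    random.seed(0)
    # Downsample the empty images so the two groups are roughly balanced...
    negatives_down = random.sample(negatives, k=min(len(negatives), len(positives)))
    balanced_train = positives + negatives_down
    random.shuffle(balanced_train)

    # ...or alternatively upsample the images with a mass instead:
    # positives_up = random.choices(positives, k=len(negatives))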
Yeinstein20 t1_ituv2bx wrote
Reply to Binary segmentation with imbalanced data by jantonio78
Could you give a few more details on what kind of images you have, what you are trying to segment, your model...? Are you calculating your dice score and dice loss on foreground and background? It's usually a good idea to calculate it on the foreground only, and if you have more than one foreground class, take the mean. That should already help a lot with class imbalance. Also, I would add cross entropy or focal loss in addition to dice loss; that's something I have found to work well in general. You can also modify your data loader so that it oversamples foreground during training (say you have a batch size of 2 and force at least one image to contain foreground). It's probably also a good idea to find a good baseline to compare against so you get a better idea of how your performance stacks up.
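Something along these lines is what I mean by combining the two losses (a rough sketch for the binary case, with logits of shape (N, 1, H, W) and dice computed on the foreground only):

    import torch
    import torch.nn.functional as F

    def dice_ce_loss(logits, target, eps=1e-6):
        # Dice on the foreground channel only.
        probs = torch.sigmoid(logits)
        dims = (1, 2, 3)
        intersection = (probs * target).sum(dims)
        dice = (2 * intersection + eps) / (probs.sum(dims) + target.sum(dims) + eps)
        dice_loss = 1 - dice.mean()
        # Plain cross entropy for the binary case; swap in focal loss if you prefer.
        ce_loss = F.binary_cross_entropy_with_logits(logits, target.float())
        return dice_loss + ce_loss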
Deep_Quarter t1_ittr5p4 wrote
Reply to Binary segmentation with imbalanced data by jantonio78
Hey, what you are trying is a form of sample weighting. It basically treats data imbalance as the loss function's problem.
What you need to do is write a better data loader. Make sure the imbalance is handled at the data loader by customising it to load batches that are balanced. Easier said than done, I know, but this is where concepts like sampling and class weighting come in.
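If you're on PyTorch, the stock WeightedRandomSampler already gets you most of the way there. A rough sketch, assuming `dataset` is your existing Dataset and `has_mass` is a 0/1 flag per sample saying whether its mask has any foreground:

    import torch
    from torch.utils.data import DataLoader, WeightedRandomSampler

    has_mass = torch.as_tensor(has_mass, dtype=torch.float)
    # Give each class the same total sampling probability.
    pos_weight = 1.0 / has_mass.sum()
    neg_weight = 1.0 / (len(has_mass) - has_mass.sum())
    weights = torch.where(has_mass.bool(), pos_weight, neg_weight)

    sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)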
The second thing you can do is train at a smaller resolution. A proper data pipeline paired with a good loss function like dice, Tversky, or focal loss can help you get a benchmark to improve on. Just search for segmentation losses on GitHub.
Lastly, you can reframe the problem as something simpler like box regression or heatmap prediction. This helps if the mask region is relatively large or small compared to the input resolution.
ShadowStormDrift t1_itt8dx4 wrote
Reply to Binary segmentation with imbalanced data by jantonio78
Almost all my experience with deep learning in industry is people being given tiny datasets and expected to perform miracles on them. This feels like one of those cases.
pornthrowaway42069l t1_itsbufj wrote
Reply to Binary segmentation with imbalanced data by jantonio78
I'd try some baseline/simpler models on the same data and see how they perform. Maybe the model just can't do any better; that's always a good thing to check before panicking.
You can also try K-means or DBSCAN or something like that to get 2 clusters and see whether those algorithms can segment your data better than your network. If so, maybe the network is set up incorrectly somehow; if not, maybe something funky is happening to your data in the pipeline.
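As a quick sanity check, something like this (assuming `image` is a single 2D grayscale numpy array) is enough to see whether two clusters of pixel intensities already separate the mass from the background:

    import numpy as np
    from sklearn.cluster import KMeans

    pixels = image.reshape(-1, 1).astype(np.float32)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pixels)
    rough_mask = labels.reshape(image.shape)
    # Cluster ids are arbitrary, so compare both rough_mask and 1 - rough_mask
    # against the ground-truth mask.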
rupert_ai t1_itjtkd8 wrote
Yes, this is definitely possible. Have a look at diffusion models where you pass the model an existing image (such as a cut-out image of a person) and a text prompt for the background (Eiffel Tower). There are heaps of YouTube vids demonstrating it.
nutpeabutter t1_itjon0i wrote
nishu3210 OP t1_itirexc wrote
Reply to comment by sckuzzle in Use DALL-E or other models, GANs to generate images of a real person? by nishu3210
Deepfakes? Ah, I get it; then there is no way to verify whether I am using it on myself or on other people's faces. That's kind of sad, though I understand the reason behind it.
sckuzzle t1_itinkiq wrote
Reply to comment by nishu3210 in Use DALL-E or other models, GANs to generate images of a real person? by nishu3210
There's a reason why these image generation models don't let you generate real faces.
nishu3210 OP t1_itijco6 wrote
Reply to comment by Massless in Use DALL-E or other models, GANs to generate images of a real person? by nishu3210
Why is it sketchy? Maybe I didn't explain properly. Please ask me any questions to clarify.
Massless t1_itigb0e wrote
Don’t know what it is, but this reads sketchy as fuck to me.
beingsubmitted t1_itht9o5 wrote
I don't see how you would train them that way - you can't use the output of a discriminator as the input of a generator; that wouldn't get you what you want. You could train them in parallel, one generator and discriminator doing only b&w restoration, and the other doing only colorization.
Part of how images and the eye work (and part of the science behind why JPEG is so effective) is that we're much more sensitive to luminance information than to color information. You could take the output of the colorization network in HSL color space and replace its luminance with that of the generated restored photo. Doing it this way, you could also force the separation of the two generators while using only one discriminator: one generator only affects the hue and saturation of the final image, and the other only affects the luminance.
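A rough sketch of that luminance swap with OpenCV (assuming both generator outputs are uint8 RGB arrays of the same shape):

    import cv2
    import numpy as np

    def merge_color_and_luminance(colorized: np.ndarray, restored: np.ndarray) -> np.ndarray:
        # OpenCV stores HLS as (H, L, S); channel 1 is the luminance.
        colorized_hls = cv2.cvtColor(colorized, cv2.COLOR_RGB2HLS)
        restored_hls = cv2.cvtColor(restored, cv2.COLOR_RGB2HLS)
        colorized_hls[..., 1] = restored_hls[..., 1]
        return cv2.cvtColor(colorized_hls, cv2.COLOR_HLS2RGB)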
That said, with the more recent breakthroughs, it seems that networks are proving more successful as generalists than specialists. For example, it's believed that whisper performs better on each language because it's trained on all languages, as counter-intuitive as it may seem.
kaarrrlll t1_itfyuqj wrote
Having two GANs is not a problem; using two GANs for different purposes has been done for a long time (CycleGAN, for example). What's important is that your loss and/or other constraints are precise enough to avoid one GAN learning both tasks while the other learns the identity mapping. It also doubles the concerns about instability during training. Good luck!
manli29 OP t1_itfqo7q wrote
Reply to comment by Yeinstein20 in Two GAN's by manli29
Awesome thanks a lot
Yeinstein20 t1_itfpan2 wrote
manli29 OP t1_itflbff wrote
Reply to comment by TheRealSerdra in Two GAN's by manli29
One GAN for colorization and one GAN for restoration. I want to see if that works better than a single GAN that does both.
TheRealSerdra t1_itfiwui wrote
danielgafni t1_it97yro wrote
Reply to comment by redditnit21 in Testing Accuracy higher than Training Accuracy by redditnit21
Don’t remove it, it’s just how it works. There is nothing wrong with having a higher train loss if you are using dropout.
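The reason is simply that dropout is active while the model is in training mode and disabled in eval mode, so the training metrics come from a handicapped model. A quick sketch:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 1))
    x = torch.randn(8, 10)

    model.train()   # dropout active: half the hidden activations are zeroed out
    train_out = model(x)

    model.eval()    # dropout disabled: full capacity, hence the better test metrics
    eval_out = model(x)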
Ttttrrrroooowwww OP t1_it6n348 wrote
Reply to comment by suflaj in EMA / SWA / SAM by Ttttrrrroooowwww
Thanks a lot
I read PARS. Looks very interesting and is somewhat related to pseudo-label entropy minimization. I'm thinking of going in a similar direction; a great tip.
suflaj t1_it6fwta wrote
Reply to comment by Ttttrrrroooowwww in EMA / SWA / SAM by Ttttrrrroooowwww
That is way too old. Here are a few papers:
Ttttrrrroooowwww OP t1_it6dz0l wrote
Reply to comment by suflaj in EMA / SWA / SAM by Ttttrrrroooowwww
Can you point me to the papers you reference?
I've only come across 2019 papers about sample selection (assuming you mean data sampling).
suflaj t1_it686mk wrote
Reply to comment by Ttttrrrroooowwww in EMA / SWA / SAM by Ttttrrrroooowwww
This would depend on whether or not you believe newer noisy data is more important. I would not use it in general, because it's not something you can guarantee on all data and it would have to be theoretically confirmed beforehand, which might be impossible for a given task.
If I wanted to reduce the noisiness of pseudo-labels, I would not want to introduce additional biases on the data itself, so I'd rather do sample selection, which seems to be what the newest papers suggest. Weight averaging introduces biases akin to what weight normalization techniques did, and those were partially abandoned in favour of different approaches, e.g. larger batch sizes, because they proved more robust and performant in practice as models grew more different from the ML baselines our findings were based on.
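To be clear, by weight averaging I mean the EMA-of-parameters kind of thing, roughly like this (the decay value is just a typical choice):

    import torch

    @torch.no_grad()
    def update_ema(ema_model: torch.nn.Module, model: torch.nn.Module, decay: float = 0.999):
        # Blend the EMA copy towards the current (online) weights after each step.
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)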
Now, if I weren't aware of papers that came out this year, maybe I wouldn't be saying this. That's why I recommended you stick to newer papers: problems are never really fully solved, and newer solutions tend to make bigger strides than optimizing older ones.
suflaj t1_it66q7y wrote
Reply to comment by Lee8846 in EMA / SWA / SAM by Ttttrrrroooowwww
While it is true that the age of a method does not determine its value, the older a method is, the more likely it is that its performance gains have been surpassed by some other method or model.
Specifically, I do not see why I would use any weight averaging over a better model or training technique.
> In this case, an ensemble of models might not help.
Because you'd just use a bigger batch size.
suflaj t1_iu1yx11 wrote
Reply to Do companies actually care about their model's training/inference speed? by GPUaccelerated
I don't think upgrading is ever worth it. It's easier to just scale horizontally, i.e. buy more hardware.
The hardware you run production inference on is usually not bought anyway; it's mostly rented, so that doesn't matter. And if you are running models on an edge device, you don't have much choice.