Recent comments in /f/deeplearning

Internal-Diet-514 t1_itwdhg2 wrote

To start I’d downsample the number of images that don’t have any mass in them (or upsample the ones with mass) for the training data, while keeping an even balance in the test/validation sets. Others have said above that the loss function works better when it sees an even representation. This is an easy way to do it without writing a custom data loader, and you can see whether that’s the problem before diving deeper.
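Something like this is usually enough (a rough sketch, assuming your training split is tracked as a list of `(image_path, mask_path, has_mass)` tuples; all names here are made up):

```python
import random

# Hypothetical setup: train_samples is a list of (image_path, mask_path, has_mass) tuples.
def rebalance(train_samples, mode="down", seed=0):
    rng = random.Random(seed)
    positives = [s for s in train_samples if s[2]]      # images that contain a mass
    negatives = [s for s in train_samples if not s[2]]  # images with no mass

    if mode == "down":
        # Downsample the majority class (no-mass images) to match the minority.
        negatives = rng.sample(negatives, k=min(len(negatives), len(positives)))
    else:
        # Upsample the minority class by drawing extra samples with replacement.
        positives = positives + rng.choices(positives, k=len(negatives) - len(positives))

    balanced = positives + negatives
    rng.shuffle(balanced)
    return balanced

# Apply this only to the training split; leave validation/test untouched.
train_samples_balanced = rebalance(train_samples, mode="down")
```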

2

Yeinstein20 t1_ituv2bx wrote

Could you give a few more details on what kind of images you have, what you are trying to segment, your model...? Are you calculating your Dice score and Dice loss on foreground and background? It's usually a good idea to calculate it on the foreground only, and if you have more than one foreground class, take the mean. That should already help a lot with class imbalance. Also, I would add cross-entropy or focal loss on top of the Dice loss; that's something I have found to work well in general. You can also modify your data loader so that it oversamples foreground during training (say you have a batch size of 2 and force at least one image per batch to contain foreground). It's probably also a good idea to find a good baseline to compare against so you get a better sense of where your performance stands.
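Something like this for the combined loss (a rough PyTorch sketch, assuming multi-class logits with class 0 as background):

```python
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, smooth=1e-5):
    """Cross-entropy plus soft Dice computed on foreground classes only.

    logits: (N, C, H, W) raw scores, target: (N, H, W) integer labels,
    with class 0 assumed to be background.
    """
    ce = F.cross_entropy(logits, target)

    probs = torch.softmax(logits, dim=1)
    num_classes = logits.shape[1]
    target_onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()

    dice_per_class = []
    for c in range(1, num_classes):  # skip background (class 0)
        p, t = probs[:, c], target_onehot[:, c]
        inter = (p * t).sum(dim=(1, 2))
        union = p.sum(dim=(1, 2)) + t.sum(dim=(1, 2))
        dice_per_class.append(((2 * inter + smooth) / (union + smooth)).mean())
    dice = torch.stack(dice_per_class).mean()  # mean over foreground classes

    return ce + (1.0 - dice)
```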

3

Deep_Quarter t1_ittr5p4 wrote

Hey, what you are trying is a form of sample weighting. It basically says data imbalance is the loss function's problem.

What you need to do is write a better data loader. Make sure the imbalance is handled at the data loader by customising it to load batches that are balanced. Easier said than done, I know, but this is where concepts like sampling and class weighting come in.
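One low-effort way to get roughly balanced batches is PyTorch's `WeightedRandomSampler`. A sketch, assuming a `dataset` whose items are `(image, mask)` pairs and a one-off pass over the masks (all names here are placeholders):

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# One-off pass: record which samples contain any foreground pixels at all.
has_fg = torch.tensor([bool(dataset[i][1].any()) for i in range(len(dataset))])

# Weight each sample inversely to its group's frequency so batches come out
# roughly 50/50 foreground vs. background-only.
n_fg, n_bg = has_fg.sum().item(), (~has_fg).sum().item()
weights = has_fg.float() / max(n_fg, 1) + (~has_fg).float() / max(n_bg, 1)

sampler = WeightedRandomSampler(weights, num_samples=len(dataset), replacement=True)
loader = DataLoader(dataset, batch_size=8, sampler=sampler)
```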

The second thing you can do is train at a smaller resolution. A proper data pipeline paired with a good loss function like Dice, Tversky, or focal loss can give you a benchmark to improve on. Just search for segmentation losses on GitHub.
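For reference, a soft Tversky loss is only a few lines; a sketch for the binary case (with alpha = beta = 0.5 it reduces to 1 - Dice):

```python
import torch

def tversky_loss(probs, target, alpha=0.3, beta=0.7, smooth=1e-5):
    """Soft Tversky loss for binary segmentation.

    probs: (N, H, W) foreground probabilities, target: (N, H, W) in {0, 1}.
    alpha weights false positives, beta weights false negatives.
    """
    target = target.float()
    tp = (probs * target).sum(dim=(1, 2))
    fp = (probs * (1 - target)).sum(dim=(1, 2))
    fn = ((1 - probs) * target).sum(dim=(1, 2))
    tversky = (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)
    return (1.0 - tversky).mean()
```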

Lastly, you can reframe the problem as something simpler like box regression or heatmap regression. This helps if the mask region is very large or very small relative to the input resolution.
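If you try the heatmap route, one plausible target is a Gaussian centred on the mask; a sketch (the centroid and sigma choices here are just one option):

```python
import numpy as np

def mask_to_heatmap(mask, sigma_scale=0.5):
    """Turn a binary mask (H, W) into a Gaussian heatmap centred on the mask
    centroid, with sigma tied to the mask's equivalent radius.
    Returns all zeros if the mask is empty."""
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return np.zeros((h, w), dtype=np.float32)
    cy, cx = ys.mean(), xs.mean()
    sigma = max(1.0, sigma_scale * np.sqrt(mask.sum() / np.pi))
    yy, xx = np.mgrid[0:h, 0:w]
    heat = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
    return heat.astype(np.float32)
```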

4

pornthrowaway42069l t1_itsbufj wrote

I'd try some baseline/simpler models on the same data and see how they perform. Maybe the model just can't do any better; that's always a good thing to check before panicking.

You can also try K-means or DBSCAN or something like that to get 2 clusters of results, and see if those algorithms can segment your data better than your network. If so, maybe the network is set up incorrectly somehow; if not, maybe something funky is happening to your data in the pipeline.
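A quick way to run that sanity check with scikit-learn (a sketch; matching cluster ids to classes is left to you):

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_segment(image, n_clusters=2, seed=0):
    """Crude clustering baseline: cluster pixel values into n_clusters and
    return a (H, W) label map. `image` is a (H, W) or (H, W, C) numpy array."""
    h, w = image.shape[:2]
    pixels = image.reshape(h * w, -1).astype(np.float32)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(pixels)
    return labels.reshape(h, w)

# Compare this label map against your network's prediction; if K-means already
# beats the network, suspect the model setup rather than the data.
```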

2

beingsubmitted t1_itht9o5 wrote

Reply to Two GAN's by manli29

I don't see how you would train them that way - you can't use the output of a discriminator as the input of a generator, so that wouldn't get you what you want. You could train them in parallel, one generator/discriminator pair doing only b&w restoration and the other doing only colorization.

The way images and the eye work (part of the science behind why JPEG is so effective) is that we're much more sensitive to luminance information than to color information. You could take the output of the colorization generator in HSL color space and replace its luminance with that of the generated restored photo. Doing it this way, you could also force the separation of the two generators using only one discriminator - one generator only affecting the hue and saturation of the final image, and the other only affecting the luminance.
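A rough sketch of that luminance swap using OpenCV's HLS conversion (assuming both generator outputs are aligned uint8 arrays; the function and argument names are made up):

```python
import cv2
import numpy as np

def combine_color_and_luminance(colorized_rgb, restored_gray):
    """Keep hue/saturation from the colorization generator and take the
    lightness channel from the restored grayscale output.

    colorized_rgb: (H, W, 3) uint8 RGB, restored_gray: (H, W) uint8.
    """
    hls = cv2.cvtColor(colorized_rgb, cv2.COLOR_RGB2HLS)
    hls[:, :, 1] = restored_gray  # HLS channel order is (H, L, S); index 1 is lightness
    return cv2.cvtColor(hls, cv2.COLOR_HLS2RGB)
```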

That said, with the more recent breakthroughs, it seems that networks are proving more successful as generalists than as specialists. For example, it's believed that Whisper performs better on each language because it's trained on all languages, as counter-intuitive as that may seem.

1

kaarrrlll t1_itfyuqj wrote

Reply to comment by manli29 in Two GAN's by manli29

Having two GANs is not a problem; the idea has existed for a long time, albeit for a different purpose (CycleGAN). What's important is that your loss and/or other constraints are precise enough to avoid one GAN learning both tasks while the other learns the identity mapping. It also doubles the concerns about instability during training. Good luck!

2

Yeinstein20 t1_itfpan2 wrote

Reply to Two GAN's by manli29

I feel like I've read a paper where they do something similar to this but I'm not completely sure. I'll try finding it.

Edit: maybe remind me of that in case I forget about it

1

TheRealSerdra t1_itfiwui wrote

Reply to Two GAN's by manli29

What exactly do you want to do that requires two GANs? And are you planning on just chaining the generators?

1

suflaj t1_it686mk wrote

This would depend on whether or not you believe newer noisy data is more important. I would not use it in general, because it's not something you can guarantee on all data and it would have to be theoretically confirmed beforehand, which might be impossible for a given task.

If I wanted to reduce the noisiness of pseudo-labels, I would not want to introduce additional biases on the data itself, so I'd rather do sample selection, which seems to be what the newest papers suggest. Weight averaging introduces biases akin to what weight normalization techniques did, and those were partially abandoned in favour of different approaches, e.g. larger batch sizes, because the alternatives proved more robust and performant in practice as models grew more different from the ML baselines those findings were based on.

Now, if I wasn't aware of papers that came out this year, maybe I wouldn't be saying this. That's why I recommended you stick to newer papers, because problems are never really fully solved and newer solutions tend to make bigger strides than optimizing older ones.

1

suflaj t1_it66q7y wrote

Reply to comment by Lee8846 in EMA / SWA / SAM by Ttttrrrroooowwww

While it is true that the age of a method does not determine its value, the older a method is, the more likely it is that its performance gains have been surpassed by some other method or model.

Specifically, I do not see why I would use any form of weight averaging over a better model or training technique.

> In this case, an ensemble of models might not help.

Because you'd just use a bigger batch size.

1