Recent comments in /f/deeplearning

Nerveregenerator t1_j16yfnw wrote

Write all the equations out on paper, then do one forward and one backward pass on paper as well with a simple MLP. The bias can be easily incorporated by appending an extra 1 to the input and using an extra weight as the bias, so it's updated the same as any other weight. Also learn the basics of matrix multiplication.
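
For the "on paper" exercise, here is a minimal numpy sketch of the same thing, including the bias-as-an-extra-weight trick (one sigmoid unit with squared-error loss; all the numbers are placeholders):

```python
import numpy as np

# Bias trick: append a constant 1 to the input so the last entry of W
# acts as the bias and gets updated like any other weight.
rng = np.random.default_rng(0)
x = np.array([0.5, -1.2, 1.0])   # two real inputs plus the appended 1
y = 1.0                          # target
W = rng.normal(size=3)           # W[2] plays the role of the bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass
z = W @ x
a = sigmoid(z)
loss = 0.5 * (a - y) ** 2

# Backward pass, chain rule written out term by term
dloss_da = a - y
da_dz = a * (1 - a)
dz_dW = x
grad_W = dloss_da * da_dz * dz_dW   # gradient w.r.t. all weights, bias included

# Gradient-descent step: the "bias" W[2] updates exactly like the others
lr = 0.1
W -= lr * grad_W
```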

3

Nerveregenerator t1_j1572fl wrote

Deep learning differs from typical programming topics in that it rests on a large body of mathematical and theoretical concepts that no library can hide from you. Getting the code to run is relatively easy, and the choice of library mostly comes down to deployment goals and reusing existing implementations. When things aren't working, there's no compiler error telling you what's wrong with the model or data pipeline, and that's where deep theoretical knowledge comes into play.

1

GrumpyGeologist t1_j12yawz wrote

Train a GAN on the images of class A. The generator learns to draw samples from the distribution outlined by the class-A images, and the discriminator measures how far a given sample is from that distribution. So once training on class A is finished, the critic will tell you whether or not a given image belongs to class A.
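
A rough sketch of the scoring step, assuming `discriminator` is your already-trained critic and `threshold` is something you calibrate yourself on held-out class-A images:

```python
import torch

# `discriminator` maps an image batch to a realness score; higher means
# closer to the class-A distribution. Both it and `threshold` are
# assumptions: the critic comes from your GAN training, the threshold
# from calibration on held-out class-A images.

@torch.no_grad()
def is_class_a(discriminator, image, threshold):
    score = discriminator(image.unsqueeze(0)).item()  # add a batch dimension
    return score >= threshold, score

# One way to pick the threshold: score a held-out set of class-A images
# and take e.g. the 5th percentile, so ~95% of true A images pass.
# scores = torch.tensor([discriminator(x.unsqueeze(0)).item() for x in heldout_a])
# threshold = scores.quantile(0.05).item()
```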

An alternative approach is to do self-supervised representation learning (like BYOL) and compare the projection distance between a pair of A and B images.
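
A sketch of the comparison step, with a torchvision resnet18 standing in for whatever encoder you actually trained with BYOL:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Placeholder encoder: in practice this would be your BYOL-trained backbone.
encoder = resnet18(weights="IMAGENET1K_V1")
encoder.fc = torch.nn.Identity()   # strip the classifier head, keep embeddings
encoder.eval()

@torch.no_grad()
def embedding_distance(img_a, img_b):
    # img_a, img_b: preprocessed tensors of shape (3, H, W)
    za = encoder(img_a.unsqueeze(0))
    zb = encoder(img_b.unsqueeze(0))
    return 1.0 - F.cosine_similarity(za, zb).item()  # 0 means identical direction
```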

1

trajo123 t1_j1001dl wrote

Look into OOD (out-of-distribution) sample detection. If you go down the autoencoder route, this paper can give you some pointers: Detecting Out-of-distribution Samples via Variational Auto-encoder with Reliable Uncertainty Estimation.
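
Not the paper's method, but as a much simpler baseline in the same spirit: an autoencoder trained only on in-distribution images tends to reconstruct them well and reconstruct OOD samples poorly, so per-image reconstruction error works as an OOD score. Here `autoencoder` and `threshold` are assumed to come from your own training and calibration:

```python
import torch

@torch.no_grad()
def is_ood(autoencoder, image, threshold):
    x = image.unsqueeze(0)                        # add a batch dimension
    recon = autoencoder(x)
    error = torch.mean((recon - x) ** 2).item()   # per-image MSE
    return error > threshold, error

# `threshold` would be calibrated on held-out in-distribution images,
# e.g. the 95th percentile of their reconstruction errors.
```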

Please note that OOD sample detection is an open problem and active research topic.

9

sayoonarachu t1_j0zlw4i wrote

No, I was just using pandas (CPU) for simple, quick regex work and for removing and replacing text rows. It was just for a hobby project. The data was scraped from the Midjourney and Stable Diffusion Discords, so there were millions of rows of duplicate and poor-quality prompts, which I had pandas delete. In the end, the number of unique rows with more than 50 characters came to about 700k, which I then used to train GPT-Neo 125M.
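
Roughly the kind of pipeline I mean (the file name and the "prompt" column are made up, not the actual scraped schema):

```python
import pandas as pd

df = pd.read_csv("scraped_prompts.csv")

df["prompt"] = df["prompt"].str.strip()
df = df.drop_duplicates(subset="prompt")          # millions of dupes -> unique rows
df = df[df["prompt"].str.len() > 50]              # drop short, low-quality prompts
df["prompt"] = df["prompt"].str.replace(r"\s+", " ", regex=True)  # regex cleanup

df.to_csv("clean_prompts.csv", index=False)
```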

I didn't know about cudf. Thanks 😅

1

PredictorX1 t1_j0z7qtr wrote

This is known as one-class learning or one-class classification. You could try obtaining "background class" images (images similar to yours in resolution, overall brightness, ...) and training an ordinary classifier on the combination of the two. Obviously, the background-class images cannot contain food, but searches for things unrelated to food ("nail", "dancer", "floor", "statue", ...) followed by quick visual inspection should serve.
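
A sketch of how that could look in PyTorch, assuming the food and background images are laid out as `data/food/` and `data/background/` (folder names and hyperparameters are placeholders):

```python
import torch
from torch import nn
from torchvision import datasets, models, transforms

# ImageFolder maps the two subdirectories to labels 0/1, turning the
# one-class problem into ordinary binary classification.
tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("data", transform=tfm)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 2)   # food vs. background
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for images, labels in loader:                   # one epoch shown for brevity
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```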

11

sayoonarachu t1_j0yn75o wrote

I only started learning DL a month ago, so I've mostly been doing simple ANNs. But running inference on larger NLP models, GANs, diffusion models, etc. is fine. It's no desktop 3090 or enterprise-grade GPU, but for a laptop it's by far the best on the market. For example, the largest Parquet file I've cleaned in pandas was about 7 million rows and roughly 10 GB of just text, and it can run queries through it in a few seconds.
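
The kind of operation I mean, roughly (file and column names are made up):

```python
import pandas as pd

df = pd.read_parquet("prompts.parquet")   # ~7M rows of text in one file

# Vectorized string ops stay fast even at this scale.
long_prompts = df[df["text"].str.len() > 50]
hits = df[df["text"].str.contains("portrait", case=False, na=False)]
```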

I guess it depends on what kind of data science or DL you're looking to do. The 3080 probably won't be able to fine-tune something like BLOOM, but it can fine-tune Stable Diffusion models with enough optimization.

For modeling in Blender or procedural generation in something like Houdini, I haven't had issues. I've made procedurally generated 20 km heightmaps in Houdini to export to Unreal Engine, and it was not a problem.

1