boosandy OP t1_izht8ye wrote on December 9, 2022 at 5:09 AM

Reply to comment by Volhn in Graphics Card set up for deep learning by boosandy

Oh nice. Thank you.

Volhn t1_izhrkd8 wrote on December 9, 2022 at 4:53 AM

Reply to Graphics Card set up for deep learning by boosandy

Yes. You might have to specify which device to use though. You also can’t combine memory into one pool, but you can parallelize.

suflaj t1_izfw75x wrote on December 8, 2022 at 8:28 PM

Reply to comment by Remi_Coulom in What framework can I use to quantize a deep learning model to specific bit-widths? by MahmoudAbdAlghany

They do, but they use bigger registers, so ultimately, unless you can hand optimize it to pack operations together, you will have no benefit from it. That would at least imply writing your own CUDA kernels.

Furthermore, 8 bit is already often too small to be stable. Why go lower? If you want garbage outputs, you could always fit whatever task on a smaller model. It's easier to cut model size in half and use 8-bit or 4x and use 16-bit, than to make 4 bit or lower work.

At this point in time, TensorRT seems to be the best you'll get for as little involvement as possible. Based on benchmarks, it also seems to outperform INT4 precision by a significant margin. The only drawback is its license, which implicitly prevents commercial use.

Remi_Coulom t1_izfukyo wrote on December 8, 2022 at 8:18 PM

Reply to comment by suflaj in What framework can I use to quantize a deep learning model to specific bit-widths? by MahmoudAbdAlghany

NVIDIA's tensor cores support 4-bit, 2-bit and 1-bit operation. I am very surprised no popular library takes advantage of this possibility. Here is a 3-year-old blog post about using 4-bit inference: https://developer.nvidia.com/blog/int4-for-ai-inference/

suflaj t1_izfke61 wrote on December 8, 2022 at 7:12 PM

Reply to What framework can I use to quantize a deep learning model to specific bit-widths? by MahmoudAbdAlghany

There are none, unless you plan on emulating them, which you'd have to do yourself.

The available quantization widths correspond to what the hardware is capable of doing, and hardware generally revolves around widths that have bytes as their base length.

arhetorical t1_izdkn6q wrote on December 8, 2022 at 9:01 AM

Reply to comment by Reddituser2460155 in the biggest risk with generative AI is not its potential for misinformation but cringe. by hayAbhay

The prompt was something like "write the worst fanfic ever".

Dexamph t1_izd1dy7 wrote on December 8, 2022 at 5:03 AM

Reply to comment by computing_professor in GPU Comparisons: RTX 6000 ADA vs A100 80GB vs 2x 4090s by TheButteryNoodle

This is deadass wrong as that Puget statement was in the context of system memory, nothing to do with pooling: > How much RAM does machine learning and AI need?

>The first rule of thumb is to have at least double the amount of CPU memory as there is total GPU memory in the system. For example, a system with 2x GeForce RTX 3090 GPUs would have 48GB of total VRAM – so the system should be configured with 128GB (96GB would be double, but 128GB is usually the closest configurable amount).

Dexamph t1_izd0gyf wrote on December 8, 2022 at 4:54 AM

Reply to comment by LetMeGuessYourAlts in GPU Comparisons: RTX 6000 ADA vs A100 80GB vs 2x 4090s by TheButteryNoodle

Technically they all can because it relies on software, it's just that NVLink will reduce the performance penalty going between GPUs. There is no free lunch here so you damn well better know what you're doing to not get stung like this guy by speculative bullshit pushed by people who never actually had to make it work.

With that out of the way, it doesn't get any better than ex-mining 3090s that start at ~$600. Don't bother with anything older because if your problem requires model parallelisation, than your time and effort is probably worth more than the pittance you save in trying to get some old 2080Tis or 2070 Supers to keep up.

Reddituser2460155 t1_iz9mdw9 wrote on December 7, 2022 at 2:28 PM

Reply to comment by arhetorical in the biggest risk with generative AI is not its potential for misinformation but cringe. by hayAbhay

What is this?

hayAbhay t1_iz8xbk0 wrote on December 7, 2022 at 9:58 AM

Reply to comment by Accomplished-Bill-45 in Are currently state of art model for logical/common-sense reasoning all based on NLP(LLM)? by Accomplished-Bill-45

I'm not entirely sure what those different categorizations entail but they seem really an application of reasoning. At it's core, everything we do is based on logical reasoning. There are paradoxes but it's the best we have. Within this, there are three core categories

Deductive reasoning - This is the core of how we reason about this. If we know "If A, then B" as a "rule" and if we observe "A" then it follows that B is definitely true - Premise: A, A => B, Conclusion: B
Inductive reasoning - This is coming up with the rule itself as a means of observation i.e if you observe many different instances (you notice the grass gets wet after it rains everytime - observation) and you concur, that "if it rains, then grass gets wet" or "A => B"
Abductive reasoning - This is a sort of reverse reasoning where you observe something and "hypothesize" the cause. This is inherently uncertain and makes a lot of assumptions (closed world). So here, Premise: A=>B, B, Conclusion - A? (yes if closed world and no other rule exists that entail B, uncertain otherwise)

There are several variations of these as well. Everything you mentioned are really applications of these. Natural language is inherently uncertain and so is reality itself! The closest any natural language comes to capturing logic is legal documents (and we know the semantic games that happen there :) )

In terms of AI, logic based systems got pretty popular in the 80s but they're very brittle given our reality but they do have their place. This is the knowledge-based/logical reasoning you mentioned. Knowledge bases are simply a format in which "knowledge" or in other words some textual representation of real world concepts live and have a structure that you can apply logic based rules over.

With LLMs, they're probabilistic in a weird sort of way. Their optimization task is largely predicting the next word and essentially modeling the language underneath (which is inherently filled with ambiguities). Given large repetitions in text, it can easily do what appears to be reasoning largely from high probability occurrence. But, it won't be able to say, systematically pick concepts and trace up the reasoning like a human can. However, their biggest advantage is general utility. They can, as a single algorithm, solve a wide range of problems that would otherwise require a lot of bespoke systems. And, LLMs over the past 5-6 years have consistently hammered bespoke special purpose systems built from scratch. After all, for a human to apply crisp reasoning, they need some language :)

If you're curious, look up "Markov Logic Networks". Its from Pedro Domingos (his book "Master Algorithm" is also worth a read) and it tried to tie logic & probability too but had this intense expectation maximization over a combinatorial explosion. Also, check out yann lecunn talk at berkeley last month (he shared some of that at neurips from what i heard)

Accomplished-Bill-45 OP t1_iz8u38x wrote on December 7, 2022 at 9:08 AM

Reply to comment by sEi_ in Are currently state of art model for logical/common-sense reasoning all based on NLP(LLM)? by Accomplished-Bill-45

Yea,

I have tested it including

Social reasoning ( which does a good job)

Psychological reasoning ( bad)

Solving math question ( it’s ok, better then Minerva)

Asking LSAT logic game questions ( it gives its thought process, but failed to give correct answers)

I also wrote up a short mystery novel, ( like 200 words, with context) ask if it can tell is the victim is murdered or committed suicide. It actually did ok job on this one if the context is clearly given that everyone can deduce some conclusion using common sense.

sEi_ t1_iz8tt9a wrote on December 7, 2022 at 9:04 AM

Reply to Are currently state of art model for logical/common-sense reasoning all based on NLP(LLM)? by Accomplished-Bill-45

Have you asked ChadGPT the question? (no pun intended)

Accomplished-Bill-45 OP t1_iz8tp4b wrote on December 7, 2022 at 9:02 AM

Reply to comment by hayAbhay in Are currently state of art model for logical/common-sense reasoning all based on NLP(LLM)? by Accomplished-Bill-45

So I just found out that ppl tends to categorize the reasoning

logical reasoning

common sense reasoning

knowledge-based,

social reasoning,

psychological reasoning

qualitative reasoning ( solving some math problem)

So do you mean that If some needs to build a generalized model that can do all of above without specific fine tuning, LLM might be the most straightforward way. We can expecting them to do some simply reasoning like GPT

But to further improvement, can we use GPT as pre-trained model, and adding additional domain specific model ( mostly likely to using symbolic representation) to train.

But can symbolic AI alone perform all of above reasoning ? Can graphical model ( which my intuition tells me is some way representation of logical thought process) be incorporated into symbolic representation ?

hayAbhay t1_iz7vok7 wrote on December 7, 2022 at 2:54 AM

Reply to Are currently state of art model for logical/common-sense reasoning all based on NLP(LLM)? by Accomplished-Bill-45

It's important to note here that llms are NOT very good at reasoning but they are perhaps the best when you consider a "generic" algorithm i.e. without a lot of domain specific work.

For logical reasoning, you'll usually need to resort to symbolic representations underneath and apply the rules of logic. ChatGPT may appear to do that well especially with 1st and even 2nd order but longer chains will make it stumble.

MachinaDoctrina t1_iz6e7d1 wrote on December 6, 2022 at 8:31 PM

Reply to comment by Naive_Weird5939 in VAE USING PYTORCH by Naive_Weird5939

Literally the first one is applicable https://github.com/AntixK/PyTorch-VAE

MachinaDoctrina t1_iz6d7jv wrote on December 6, 2022 at 8:25 PM

Reply to comment by Naive_Weird5939 in VAE USING PYTORCH by Naive_Weird5939

It's an image database just use any of the convolutional variational autoencoders

Naive_Weird5939 OP t1_iz6d0xq wrote on December 6, 2022 at 8:24 PM

Reply to comment by MachinaDoctrina in VAE USING PYTORCH by Naive_Weird5939

Have gone through it the network used is different

Naive_Weird5939 OP t1_iz6cy7h wrote on December 6, 2022 at 8:23 PM

Reply to comment by MachinaDoctrina in VAE USING PYTORCH by Naive_Weird5939

It doesn’t have Vae using cifar10

MachinaDoctrina t1_iz6ckky wrote on December 6, 2022 at 8:21 PM

Reply to VAE USING PYTORCH by Naive_Weird5939

https://paperswithcode.com/paper/auto-encoding-variational-bayes

Naive_Weird5939 OP t1_iz56im7 wrote on December 6, 2022 at 3:50 PM

Reply to comment by arhetorical in VAE USING PYTORCH by Naive_Weird5939

no man it isnt working i have tried every way

gives out error

Another_mikem t1_iz56hge wrote on December 6, 2022 at 3:50 PM

Reply to the biggest risk with generative AI is not its potential for misinformation but cringe. by hayAbhay

I feel like at least 50% of that prompt was redundant. The response was pretty much on point for a standard LinkedIn article.

I would’ve been impressed if it had also told a story that didn’t really happen, but clumsily conveyed some moralizing point.

arhetorical t1_iz55woo wrote on December 6, 2022 at 3:46 PM

Reply to VAE USING PYTORCH by Naive_Weird5939

Can you not just change the dimensions to work with CIFAR10?

Final-Rush759 t1_iz4ib3v wrote on December 6, 2022 at 12:32 PM

Reply to How to choose a starting CNN architecture? by OffswitchToggle

Convnext is really good, and computation efficient.

Character-Act-9090 t1_iz3zs81 wrote on December 6, 2022 at 8:12 AM

Reply to How to choose a starting CNN architecture? by OffswitchToggle

You don't want to handcraft each layer but rather take a small-medium sized network architecture and apply it to your problem. If you then need extra performance you can easily go for a bigger architecture or fine tune restnet to your problem.

However, depending on the complexity of images and image quality a rather small network should work quite good already. I had a similar project in university and trained a really small network for the task with only a few hundred pictures with an accurracy of over 95%.

Classification tasks are usually the simplest of all and you seem to only have a few number of classes which makes it even easier. You don't need to use an architecture trained to classify poor images into thousands of different classes.

Work your way up and start with a simple example on the pytorch website and work your way up until you are satisfied with performance (Especially if you are a beginner).

kaarrrlll t1_iz3x6x4 wrote on December 6, 2022 at 7:35 AM

Reply to How to choose a starting CNN architecture? by OffswitchToggle

Most architectures are released following some painful engineering. So I also +1 others that suggest starting off with an architecture that made its name for classification task

Recent comments in /f/deeplearning