Recent comments in /f/deeplearning

suflaj t1_izfw75x wrote

They do, but they use bigger registers, so ultimately, unless you can hand-optimize it to pack operations together, you will see no benefit from it. That would at least imply writing your own CUDA kernels.
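For illustration, here's a rough sketch of the kind of 4-bit packing you'd otherwise have to do by hand inside a custom kernel (plain NumPy here is my assumption; the comment doesn't name a framework):

```python
# Illustrative only: packing two 4-bit values into one 8-bit byte, the sort of
# manual packing the hardware won't do for you without custom kernels.
import numpy as np

def pack_int4_pairs(values):
    """Pack pairs of 4-bit unsigned values (0..15) into single uint8 bytes."""
    values = np.asarray(values, dtype=np.uint8)
    hi, lo = values[0::2], values[1::2]
    return (hi << 4) | lo

def unpack_int4_pairs(packed):
    """Recover the original 4-bit values from the packed bytes."""
    packed = np.asarray(packed, dtype=np.uint8)
    return np.stack([(packed >> 4) & 0xF, packed & 0xF], axis=1).ravel()

vals = np.array([3, 12, 7, 1], dtype=np.uint8)
packed = pack_int4_pairs(vals)                    # two bytes instead of four
assert (unpack_int4_pairs(packed) == vals).all()  # round-trips losslessly
```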

Furthermore, 8-bit is already often too small to be stable, so why go lower? If you want garbage outputs, you could always fit whatever task on a smaller model. It's easier to cut the model size in half and use 8-bit, or cut it by 4x and use 16-bit, than to make 4-bit or lower work.

At this point in time, TensorRT seems to be the best you'll get with as little involvement as possible. Based on benchmarks, it also seems to outperform INT4 precision by a significant margin. The only drawback is its license, which implicitly prevents commercial use.
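As a point of reference, here's a minimal sketch of post-training dynamic INT8 quantization in PyTorch (the toy model, layer sizes, and the choice of `quantize_dynamic` are my illustrative assumptions, not something from the comment):

```python
import torch
import torch.nn as nn

# Toy float model standing in for whatever you actually want to serve.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 256))

# Quantize Linear layers to 8-bit weights; activations stay in float.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 1024)
print(quantized(x).shape)  # torch.Size([1, 256])
```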

1

Dexamph t1_izd1dy7 wrote

This is deadass wrong as that Puget statement was in the context of system memory, nothing to do with pooling:

> How much RAM does machine learning and AI need?

>The first rule of thumb is to have at least double the amount of CPU memory as there is total GPU memory in the system. For example, a system with 2x GeForce RTX 3090 GPUs would have 48GB of total VRAM – so the system should be configured with 128GB (96GB would be double, but 128GB is usually the closest configurable amount).

1

Dexamph t1_izd0gyf wrote

Technically they all can because it relies on software; it's just that NVLink reduces the performance penalty of going between GPUs. There is no free lunch here, so you damn well better know what you're doing so you don't get stung like this guy by speculative bullshit pushed by people who never actually had to make it work.

With that out of the way, it doesn't get any better than ex-mining 3090s that start at ~$600. Don't bother with anything older, because if your problem requires model parallelisation, then your time and effort is probably worth more than the pittance you save trying to get some old 2080 Tis or 2070 Supers to keep up.
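To make the "it relies on software" point concrete, here's a minimal sketch of naive model parallelism in PyTorch, splitting a toy two-stage model across two GPUs (layer sizes and device placement are assumptions for illustration; it needs a machine with at least two GPUs):

```python
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    """Naive pipeline-style model parallelism: half the layers on each GPU."""
    def __init__(self):
        super().__init__()
        self.stage0 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.stage1 = nn.Sequential(nn.Linear(4096, 1024)).to("cuda:1")

    def forward(self, x):
        x = self.stage0(x.to("cuda:0"))
        # Activations cross the GPU interconnect here; NVLink vs. PCIe only
        # changes how costly this transfer is, not whether it works at all.
        return self.stage1(x.to("cuda:1"))

model = TwoGPUModel()
out = model(torch.randn(8, 1024))
print(out.shape)  # torch.Size([8, 1024])
```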

1

hayAbhay t1_iz8xbk0 wrote

I'm not entirely sure what those different categorizations entail, but they really seem to be applications of reasoning. At its core, everything we do is based on logical reasoning. There are paradoxes, but it's the best we have. Within this, there are three core categories (there's a toy code sketch of all three right after the list):

  1. Deductive reasoning - This is the core of how we reason. If we know "If A, then B" as a "rule" and we observe "A", then it follows that B is definitely true - Premises: A, A => B; Conclusion: B
  2. Inductive reasoning - This is coming up with the rule itself by means of observation, i.e. if you observe many different instances (you notice the grass gets wet after it rains every time - observation), you conclude that "if it rains, then grass gets wet", or "A => B"
  3. Abductive reasoning - This is a sort of reverse reasoning where you observe something and "hypothesize" the cause. This is inherently uncertain and makes a lot of assumptions (closed world). So here, Premise: A => B, B; Conclusion - A? (yes if the world is closed and no other rule exists that entails B, uncertain otherwise)
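Here's that toy sketch of the three patterns in Python (the rain/wet-grass rule and all function names are made up purely for illustration):

```python
# Toy illustration of deduction, induction, and abduction over "rain => wet_grass".
rule = ("rain", "wet_grass")  # A => B

def deduce(observed, rule):
    """Deduction: from A and A => B, conclude B."""
    a, b = rule
    return b if a in observed else None

def induce(observations):
    """Induction: from many co-occurrences of rain and wet grass, propose A => B."""
    if all("rain" in obs and "wet_grass" in obs for obs in observations):
        return ("rain", "wet_grass")
    return None

def abduce(observed, rule):
    """Abduction: from B and A => B, hypothesize A (only safe in a closed world)."""
    a, b = rule
    return a if b in observed else None

print(deduce({"rain"}, rule))                # wet_grass
print(induce([{"rain", "wet_grass"}] * 10))  # ('rain', 'wet_grass')
print(abduce({"wet_grass"}, rule))           # rain (a hypothesis, not a certainty)
```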

There are several variations of these as well. Everything you mentioned is really an application of these. Natural language is inherently uncertain, and so is reality itself! The closest any natural language comes to capturing logic is legal documents (and we know the semantic games that happen there :) )

In terms of AI, logic-based systems got pretty popular in the 80s. They're very brittle given how messy reality is, but they do have their place. This is the knowledge-based/logical reasoning you mentioned. Knowledge bases are simply a format in which "knowledge", in other words some textual representation of real-world concepts, lives with enough structure that you can apply logic-based rules over it.
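For a concrete (and deliberately toy) picture of "facts plus a structure you can apply rules over", here's a made-up knowledge base and a single logical rule in Python:

```python
# A toy knowledge base: facts as (subject, relation, object) triples,
# plus one rule applied over them. Everything here is invented for illustration.
facts = {("socrates", "is_a", "human"), ("plato", "is_a", "human")}

def apply_rule(facts):
    """Rule: if X is_a human, then X is_a mortal."""
    derived = set()
    for subj, rel, obj in facts:
        if rel == "is_a" and obj == "human":
            derived.add((subj, "is_a", "mortal"))
    return facts | derived

print(apply_rule(facts))  # original facts plus the derived "mortal" triples
```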

With LLMs, they're probabilistic in a weird sort of way. Their optimization task is largely predicting the next word, essentially modeling the language underneath (which is inherently filled with ambiguities). Given the large amount of repetition in text, they can easily do what appears to be reasoning, largely from high-probability co-occurrence. But they won't be able to, say, systematically pick concepts and trace through the reasoning like a human can. However, their biggest advantage is general utility. They can, as a single algorithm, solve a wide range of problems that would otherwise require a lot of bespoke systems. And LLMs over the past 5-6 years have consistently hammered bespoke, special-purpose systems built from scratch. After all, for a human to apply crisp reasoning, they need some language :)

If you're curious, look up "Markov Logic Networks". It's from Pedro Domingos (his book "The Master Algorithm" is also worth a read) and it tried to tie logic & probability together too, but involved this intense expectation maximization over a combinatorial explosion. Also, check out Yann LeCun's talk at Berkeley last month (he shared some of that at NeurIPS from what I heard).

5

Accomplished-Bill-45 OP t1_iz8u38x wrote

Yea,

I have tested it, including:

Social reasoning (it does a good job)

Psychological reasoning (bad)

Solving math questions (it's OK, better than Minerva)

Asking LSAT logic game questions (it gives its thought process, but fails to give correct answers)

I also wrote a short mystery story (about 200 words, with context) and asked whether it could tell if the victim was murdered or committed suicide. It actually did an OK job on this one when the context clearly lets anyone deduce the conclusion using common sense.

1

Accomplished-Bill-45 OP t1_iz8tp4b wrote

So I just found out that people tend to categorize reasoning into:

Logical reasoning

Common-sense reasoning

Knowledge-based reasoning

Social reasoning

Psychological reasoning

Qualitative reasoning (solving some math problems)

So do you mean that if someone needs to build a generalized model that can do all of the above without task-specific fine-tuning, an LLM might be the most straightforward way? We can expect it to do some simple reasoning, like GPT does.

But for further improvement, can we use GPT as a pre-trained model and train an additional domain-specific model (most likely using symbolic representations) on top?

But can symbolic AI alone perform all of the above reasoning? Can graphical models (which my intuition tells me are in some way a representation of a logical thought process) be incorporated into a symbolic representation?

2

hayAbhay t1_iz7vok7 wrote

It's important to note here that LLMs are NOT very good at reasoning, but they are perhaps the best when you consider a "generic" algorithm, i.e. one without a lot of domain-specific work.

For logical reasoning, you'll usually need to resort to symbolic representations underneath and apply the rules of logic. ChatGPT may appear to do that well, especially with 1st- and even 2nd-order inference, but longer chains will make it stumble.
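As a rough illustration of that symbolic route (the rule base here is invented), chaining rules is trivial for even the simplest engine, and chain length is exactly where an LLM tends to stumble:

```python
# Minimal forward-chaining sketch: each key implies its value (a => b => c => ...).
rules = {"a": "b", "b": "c", "c": "d", "d": "e"}

def forward_chain(fact, rules):
    """Repeatedly apply rules until no new fact can be derived."""
    derived = [fact]
    while derived[-1] in rules:
        derived.append(rules[derived[-1]])
    return derived

print(forward_chain("a", rules))  # ['a', 'b', 'c', 'd', 'e']
```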

8

Character-Act-9090 t1_iz3zs81 wrote

You don't want to handcraft each layer, but rather take a small- to medium-sized network architecture and apply it to your problem. If you then need extra performance, you can easily go for a bigger architecture or fine-tune a ResNet on your problem.

However, depending on the complexity and quality of the images, a rather small network should already work quite well. I had a similar project at university and trained a really small network for the task with only a few hundred pictures, reaching an accuracy of over 95%.

Classification tasks are usually the simplest of all, and you seem to have only a few classes, which makes it even easier. You don't need an architecture trained to classify poor-quality images into thousands of different classes.

Start with a simple example from the PyTorch website and work your way up until you are satisfied with performance (especially if you are a beginner).
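If you do go the fine-tuning route, a minimal sketch with torchvision (the class count, learning rate, and dummy batch are assumptions, not the commenter's setup) might look like:

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 3  # assumed small number of classes

# Pretrained ResNet-18 with its head swapped out for the new classes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One dummy training step; swap in a real DataLoader over your images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))

model.train()
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(loss.item())
```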

1