Recent comments in /f/MachineLearning

yaosio t1_je56tet wrote

There's a limit, otherwise you would be able to ask it to self-reflect on anything and always get a correct answer eventually. Finding out why it can't get the correct answer the first time would be incredibly useful. Finding out where the limits are and why is also incredibly useful.

1

ChuckSeven t1_je55o02 wrote

The Transformer is not a universal function approximator. This follows simply from the fact that it cannot process arbitrarily long input, because its context is finite.

Your conclusion is not at all obvious or likely given your facts. It only seems so in hindsight, given the strong performance of large models.

It's hard to think of ChatGPT as a very large transformer ... because we don't know how to think about very large transformers.

1

mkffl t1_je53xf1 wrote

What impact has GPT delivered, apart from some interest in generative models among the general population (which is not insignificant)? Not much, so there's potentially a lot of work needed to turn it into something useful, and I would focus on that.

1

harharveryfunny t1_je50vw9 wrote

I've seen no indication that it maintains any internal state from one generated word to the next. Therefore the only way it can build upon its own "thoughts" is by generating "step-by-step" output, which is fed back into it. It seems its own output is its only working memory, at least for now (GPT-4), although that's an obvious area for improvement.
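
To make that concrete, here is a minimal sketch of the loop I mean. The names `generate` and `toy_next_token` are made up for illustration, and the "model" is a trivial stand-in, not anything GPT-4 actually runs. The only point is that each step is a function of the tokens produced so far and nothing else.

```python
from typing import Callable, List


def generate(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    max_new_tokens: int,
    stop: str = "<eos>",
) -> List[str]:
    """Autoregressive loop: the growing token list is the only state."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # The model sees only the sequence so far; nothing else is
        # carried over from the previous step.
        tok = next_token(tokens)
        if tok == stop:
            break
        # The output is fed back in as input for the next step.
        tokens.append(tok)
    return tokens


def toy_next_token(tokens: List[str]) -> str:
    """Hypothetical stand-in for a model: counts up and stops after 3."""
    last = int(tokens[-1])
    return "<eos>" if last >= 3 else str(last + 1)


print(generate(toy_next_token, ["0"], 10))  # ['0', '1', '2', '3']
```

If the intermediate steps were never written into `tokens`, the next call would have no way to use them, which is why "step-by-step" prompting acts as working memory in this picture.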

7

WindForce02 t1_je4zh7m wrote

I don't know if IQ is a good metric here, because LLMs merely replicate training data, so it's likely that the training data (which is very big) contains information about IQ tests. It would be an indirect comparison, because you'd be comparing sheer amount of training data with a person's ability to produce thoughts. It would be far more interesting to give GPT-4 complex situations that require advanced problem-solving skills. Say you get a message that you need to decode, it has multiple layers of encryption, and you only have a few hints on how to go about it. Since there's no way to replicate responses from previous training data, I'd be curious to see how far it gets. Or take a hacking CTF, which requires not only pure coding skill but also a creative thought process.

1

alyflex t1_je4uq2y wrote

Another solution is to use a memory-efficient neural network: https://arxiv.org/pdf/1905.10484.pdf With this type of network you can easily fit images of that size. The problem is that they are very difficult to build (you have to code up the backpropagation manually). So depending on your math proficiency and ambitions, this might just be too much.
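
For a rough idea of where the memory saving can come from, here is a sketch of a reversible block. I am assuming reversibility is the mechanism at work; this is a generic additive coupling, not the exact construction from the linked paper, and `f` and `g` are placeholder sub-networks. Because the inputs can be recomputed from the outputs, activations do not have to be stored for the backward pass, which is also why the backward pass ends up hand-written.

```python
import math


def f(x):
    """Placeholder sub-network (an assumption, not from the paper)."""
    return [math.tanh(v) for v in x]


def g(x):
    """Second placeholder sub-network."""
    return [0.5 * v for v in x]


def forward(x1, x2):
    """Additive coupling: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def inverse(y1, y2):
    """Recover the inputs from the outputs, undoing the steps in reverse."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


x1, x2 = [1.0, -2.0], [0.5, 3.0]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
# The reconstruction matches the original inputs up to float rounding.
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(x1 + x2, r1 + r2)))
```

A manual backward pass would call `inverse` layer by layer to rebuild the activations it needs, trading extra compute for memory.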

1

alyflex t1_je4u0rr wrote

It really depends what you are intending to use this for. There are many sides to machine learning, but you don't have to know all of them. To name a few very different concepts:

MLOps (Coursera has an excellent series on this)
Reinforcement learning
GANs
Graph neural networks

I would say that once you have an idea of what most of these topics involve, it is time to dive into some of them by actually coding up solutions, or by downloading well-known GitHub projects and trying to run them yourself.

1