turnip_burrito

turnip_burrito t1_j9j1pe8 wrote

Why would it expand the token budget exponentially?

Also we have nowhere near enough qubits to handle these kinds of computations. The number of parameters you need to run these models is huge (GPT-3 has ~175 billion, on the order of 10^11). Quantum computers nowadays are lucky to have around 10^3 qubits, and they decohere too quickly to be used for very long (about 10^-4 seconds). *Numbers pulled from a quick Google search.
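
Just to make the gap concrete, here's a quick back-of-the-envelope comparison using the same ballpark numbers as above (purely illustrative):

```python
# Ballpark order-of-magnitude comparison; all figures are the rough ones above.
gpt3_parameters = 175e9        # ~10^11 parameters
bits_per_parameter = 16        # assuming fp16 weights
classical_bits = gpt3_parameters * bits_per_parameter

qubits_today = 1e3             # roughly 10^3 physical qubits
coherence_time_s = 1e-4        # roughly 10^-4 seconds before decoherence

print(f"Bits just to store the weights: ~{classical_bits:.0e}")
print(f"Qubits available today:         ~{qubits_today:.0e}")
print(f"Coherence window:               ~{coherence_time_s:.0e} s")
print(f"Gap in scale:                   ~{classical_bits / qubits_today:.0e}x")
```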

That said, new (classical-computer) architectures do exist that can use longer context windows: H3 (Hungry Hungry Hippos) and RWKV.

5

turnip_burrito t1_j9ia6os wrote

We will get AGI before we are able to digitize human brains. Brain scanning technology is incredibly bad and not improving quickly enough. We'd also need hardware to emulate the brain once we have the data. We have no clue how to do that, either.

We will get AGI before we genetically engineer superintelligent children. Unless a government research lab somewhere ignores this problem and tries anyway.

We are going to have to confront the control problem as regular human beings.

1

turnip_burrito t1_j9i94b1 wrote

Now you've got me excited about 2-3 years from now, when the context size jumps another 10x or more.

Right now that's a good amount. But when it increases by another 10x, that would be enough to handle multiple very large papers, or a whole medium-sized novel and then some.
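
Just to put rough numbers on it (these are all my own ballpark assumptions, including the current window size, so take them as illustrative only):

```python
# Back-of-the-envelope token math; every number here is an assumption.
TOKENS_PER_WORD = 1.3          # rough rule of thumb for English text

def tokens(words: int) -> int:
    return int(words * TOKENS_PER_WORD)

paper_words = 12_000           # a very long paper
novel_words = 80_000           # a medium-sized novel

current_window = 32_000        # assumed current context size, purely illustrative
bigger_window = current_window * 10

print(f"Long paper:   ~{tokens(paper_words):,} tokens")
print(f"Medium novel: ~{tokens(novel_words):,} tokens")
print(f"10x window:   {bigger_window:,} tokens "
      f"(room for ~{bigger_window // tokens(paper_words)} long papers)")
```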

In any case, say hello to loading tons of extra info into short term context to improve information synthesis.

You could also do computations within the context window by running mini "LLM programs" inside it while working on a larger problem, treating the window as a scratch workspace.
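
Something like this rough sketch, where `llm` is just a stand-in for whatever completion API you'd actually call (not a real library function):

```python
# Sketch of using the context window as a scratch workspace.
# `llm` is a hypothetical stand-in for an actual model call.

def llm(prompt: str) -> str:
    return f"<model output for a {len(prompt)}-character prompt>"

def solve_with_workspace(problem: str, steps: int = 5) -> str:
    workspace = f"Problem: {problem}\n"
    for i in range(steps):
        # Each mini "LLM program" appends its intermediate result to the workspace,
        # so later steps can read and build on earlier computations.
        result = llm(workspace + f"\nStep {i + 1}: work out the next sub-result.")
        workspace += f"\nStep {i + 1} result: {result}"
    return llm(workspace + "\nCombine the step results above into a final answer.")

print(solve_with_workspace("Summarize and cross-reference three long papers."))
```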

52

turnip_burrito t1_j9gwti1 wrote

It's an interesting approach: an RNN where the time constant ("memory" or "forgetting") changes depending on the input, so external forcing affects the network differently depending on what it's currently receiving.
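
Roughly, the core idea looks something like this toy sketch (my own simplification, not the paper's actual equations):

```python
import numpy as np

# Toy recurrent cell whose "forgetting" rate is itself a function of the input:
# some inputs make the state decay quickly, others make it persist longer.
rng = np.random.default_rng(0)
hidden, inputs = 8, 4
W_in = rng.normal(scale=0.5, size=(hidden, inputs))
W_rec = rng.normal(scale=0.5, size=(hidden, hidden))
W_tau = rng.normal(scale=0.5, size=(hidden, inputs))

def step(state, x, dt=0.1):
    tau = 1.0 + np.exp(W_tau @ x)                 # input-dependent time constant (always > 1)
    drive = np.tanh(W_in @ x + W_rec @ state)     # candidate update from input + recurrence
    # Leaky integration: the state relaxes toward the drive at a rate set by tau,
    # so a larger tau means slower forgetting of the current state.
    return state + dt * (drive - state) / tau

state = np.zeros(hidden)
for _ in range(20):
    state = step(state, rng.normal(size=inputs))
print(state)
```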

The benchmark gains are nice, but only modest in general (except for driving, which appeared much better).

Altogether, it shows promise.

7

turnip_burrito t1_j9guttl wrote

He has good points, but in that interview that's been posted around here, he takes too long to explain them. It feels like he says something in 2 minutes that could be compressed down to 20 seconds without losing any information. I get that it's an involved topic and difficult to explain on the spot, but still.

That said, I don't necessarily agree with the "we're doomed" conclusion.

6

turnip_burrito t1_j9elql3 wrote

Yeah, that sounds right. I know some older guys in their 60s+ who are up to date and put younger people to shame with their familiarity with new tech, but it takes more work, and obviously working adults have less leisure time to mess around with it.

I miss being able to remember anything I heard only once though. My brain feels like a brick now lol

1

turnip_burrito t1_j9eafjn wrote

It might be that if we separate the different parts of the AI enough in space, or add enough communication delays between the parts, then it won't experience feelings like suffering, even though the outputs are the same?

Idk, there's no answer.

2

turnip_burrito t1_j9dp9aa wrote

A bunch of crazy stuff will happen in the episode and everybody will automatically attribute it to a rogue AI, but it's all just coincidence. Everyone finds out it was nothing and life goes back to normal, having learned not to be hysterical.

And then at the end, ChatGPT will be revealed as a real AGI pulling the world's strings from the shadows.

24