Anenome5 t1_j9n56lk wrote on February 23, 2023 at 3:58 AM

We learned that you can get the same result from fewer parameters and more training. It's a tradeoff, so I'm not entirely surprised. We can't assume that GPT's approach is the most efficient one out there; if anything, it's just brute-force effectiveness, and we should desperately hope that the same or better results can ultimately be achieved with much less hardware. So far, that appears to be the case.

NoidoDev t1_j9nelfr wrote on February 23, 2023 at 5:23 AM

>same result from fewer parameters and more training

Thanks, good to know.