Anenome5 t1_j9n56lk wrote
We learned that you can get the same result from fewer parameters and more training. It's a tradeoff, so I'm not entirely surprised. We can't assume that GPT's approach is the most efficient one out there; if anything it's just brute-force effectiveness, and we should desperately hope that the same or better results can ultimately be achieved with much less hardware. So far that appears to be the case.
NoidoDev t1_j9nelfr wrote
>same result from fewer parameters and more training
Thanks, good to know.