Submitted by __Maximum__ t3_11l3as6 in MachineLearning
cztomsik t1_jbgdoar wrote
Reply to comment by currentscurrents in [D] Can someone explain the discrepancy between the findings of LLaMA and Chinchilla? by __Maximum__
But this is likely going to take forever because of learning-rate (LR) decay, right?
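For anyone wondering what the concern looks like in numbers, here is a rough sketch of a cosine LR schedule. The function name `cosine_lr`, the peak LR of 3e-4, the 10% floor, and the 100,000-step budget are all illustrative assumptions, not values taken from this thread, and warmup is left out to keep it short.

```python
import math

def cosine_lr(step, total_steps, peak_lr=3e-4, final_ratio=0.1):
    """Cosine decay from peak_lr down to final_ratio * peak_lr (no warmup)."""
    # Fraction of the planned training budget already used, capped at 1.
    progress = min(step / total_steps, 1.0)
    min_lr = peak_lr * final_ratio
    # Standard cosine interpolation between peak_lr and min_lr.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Print the LR at a few points of a hypothetical 100k-step run.
total = 100_000
for frac in (0.0, 0.5, 0.9, 1.0):
    step = int(frac * total)
    print(f"{frac:.0%} of training: lr = {cosine_lr(step, total):.2e}")
```

Under these assumptions the LR sits at its floor by the end of the planned budget, so any steps past that point move the weights with a much smaller step size than early in training, which is presumably the slowdown being asked about.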