Submitted by soraki_soladead t3_zmoxp7 in MachineLearning
soraki_soladead OP t1_j0desco wrote
Reply to comment by aps692 in [D] Trying to find paper about n-grams in early transformer layers by soraki_soladead
Reading through it now. It was on my reading list, but it doesn’t look familiar.