[P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM (github.com)
Submitted by Amazing_Painter_7692 on March 12, 2023 at 7:13 PM in MachineLearning
Amazing_Painter_7692 (OP) wrote on March 12, 2023 at 11:34 PM, in reply to stefanof93:
https://github.com/qwopqwop200/GPTQ-for-LLaMa
Performance is quite good.