marcus_hk t1_jcrgwqm wrote on March 19, 2023 at 12:09 AM

Reply to [P] Web Stable Diffusion by crowwork

Just browsing on my phone and haven’t dug deep yet, but in the notebook it says that build.py targets M2 by default but can also target CUDA. What about CPU?

I’d love to see a super minimal example, like running a small nn.Linear layer, for pedagogical purposes and to abstract away the complexity of a larger model like Stable Diffusion.

simpleuserhere OP t1_jcrfjsh wrote on March 18, 2023 at 11:59 PM

Reply to comment by schorhr in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

Thanks,What error are you getting? With Vs compiler and cmake we can easily build it.

simpleuserhere OP t1_jcreufr wrote on March 18, 2023 at 11:53 PM

Reply to comment by Pale-Dentist330 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

Hi, please check this branch https://github.com/rupeshs/alpaca.cpp/tree/linux-android-build-support

gkaykck t1_jcrel1c wrote on March 18, 2023 at 11:51 PM

Reply to comment by BalorNG in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152

Not cool

kross00 t1_jcre2hi wrote on March 18, 2023 at 11:47 PM

Reply to [P] The next generation of Stanford Alpaca by [deleted]

I'm a newbie... but maybe take a look at this model: https://github.com/BlinkDL/RWKV-LM

marcus_hk t1_jcrdufd wrote on March 18, 2023 at 11:46 PM

Reply to comment by race2tb in [P] Web Stable Diffusion by crowwork

For weights, yes, and for inference. If you can decompose and distribute a model across enough nodes, then you can get meaningful compute out of CPUs too — for instance for tokenization and smaller models.

starstruckmon t1_jcrbf0m wrote on March 18, 2023 at 11:27 PM

Reply to comment by timedacorn369 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

You can see some benchmarks here

https://github.com/qwopqwop200/GPTQ-for-LLaMa

gronaninjan t1_jcr6746 wrote on March 18, 2023 at 10:47 PM

Reply to comment by CommunicationLocal78 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152

Name one

Pale-Dentist330 t1_jcr3e9m wrote on March 18, 2023 at 10:26 PM

Reply to [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

Can you add the steps here?

VelvetyPenus t1_jcr1usl wrote on March 18, 2023 at 10:14 PM

Reply to [D] LLama model 65B - pay per prompt by MBle

Wait two weeks, it will all be free.

[deleted] t1_jcr1g7n wrote on March 18, 2023 at 10:11 PM

Reply to comment by NotARedditUser3 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152

[removed]

[deleted] t1_jcr18u4 wrote on March 18, 2023 at 10:10 PM

Reply to comment by timedacorn369 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

[deleted]

NotARedditUser3 t1_jcr0vsb wrote on March 18, 2023 at 10:07 PM

Reply to comment by CommunicationLocal78 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152

You'd know exactly how you were wrong if those topics weren't forbidden and you'd actually heard about them

currentscurrents t1_jcqzjil wrote on March 18, 2023 at 9:57 PM

Reply to [D] LLama model 65B - pay per prompt by MBle

I haven't heard of anybody running LLama as a paid API service. I think doing so might violate the license terms against commercial use.

>(or any other) model

OpenAI has a ChatGPT API that costs pennies per request. Anthropic also recently announced one for their Claude language model but I have not tried it.

the320x200 t1_jcqxqrs wrote on March 18, 2023 at 9:43 PM

Reply to comment by CommunicationLocal78 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152

Please... That's ridiculous. Name one historical event people in the west are afraid to even admit to knowing about in public.

schorhr t1_jcqwzek wrote on March 18, 2023 at 9:37 PM

Reply to comment by simpleuserhere in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

That's amazing!

Thank you for that link. With my old laptop and slow internet connection I'm struggling downloading visual studio and getting everything to work. I do have weights but still figuring out why building fails. Is there any way to download a prebuilt version?

CommunicationLocal78 t1_jcqw9zq wrote on March 18, 2023 at 9:32 PM

Reply to comment by BalorNG in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152

There are a lot fewer forbidden topics in China than in the West.

theAbominablySlowMan t1_jcqlq7g wrote on March 18, 2023 at 8:15 PM

Reply to [D] Unit and Integration Testing for ML Pipelines by Fender6969

bash a big ole data set through as an integration test and call it a done job. in my experience, DS moves too fast for testing to be as effective as for SWEs (no matter how carefully I've written my tests, they've never lasted more than 12 months before becoming a nuisance that people started ignoring).

[deleted] t1_jcqj70f wrote on March 18, 2023 at 7:57 PM

Reply to comment by timedacorn369 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

[removed]

Sad-Comedian-711 t1_jcqgv1x wrote on March 18, 2023 at 7:40 PM

Reply to comment by super_deap in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap

This approach has been shown to work. Longformer even provided a script that did this for you: https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb

I think for flash attention you do not want to use Longformer's attention though, you want to use Big Bird's with specific block sizes or something like that.

BalorNG t1_jcqgc4x wrote on March 18, 2023 at 7:36 PM

Reply to [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152

I has 6b parameters, but I bet it cannot answer what has happened on Tiananmen square in 1989 :3

timedacorn369 t1_jcqg4v6 wrote on March 18, 2023 at 7:35 PM

Reply to comment by simpleuserhere in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

What is the performance hit with various levels of quantization??

spadel_ t1_jcqdxi4 wrote on March 18, 2023 at 7:20 PM

Reply to [D] To those of you who quit machine learning, what do you do now? by nopainnogain5

I went into a quant research position at a prop trading firm and am very happy about that decision. While unfortunately I have not been using any deep learning so far, it is a lot of stats and machine learning. Also there are some interesting applications of physics informed neural networks, which I want to look into at some point. It is definitely fun to work on problems now that just need to be solved in a creative way instead of continuously having to come up with new research ideas.

127-0-0-1_1 t1_jcqd8se wrote on March 18, 2023 at 7:15 PM

Reply to comment by KerfuffleV2 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap

It's not unlimited memory in a single run, which remains unchanged, but that doesn't seem super relevant to what people want (nothing wrong with multiple runs!). Think about a turing machine, or heck, yourself. A turing machine only has access to a single cell of memory at at time, and in practice, modern CPUs only have access to their registers directly. For long term storage, that goes into RAM, which is accessed on demand.

Similarly, your own memory is not large enough to contain all the information you'd need to complete most complex tasks. That's why you have to write things down and actively try to remember things.

While that uses OpenAI's embedding networks, like the autoregressive LLM itself, it's not like OpenAI has a monopoly on text embeddings by any means (far from it - embeddings have a very straightforward business use and are used in practically any major site you know for things like similarity queries).

While I think OP is overhyping the degree to which this is "infinite memory" yet, in a hypothetical turing machine formulation where the network can more proactively store and restore memory, it would allow for it to be, at least, turing complete.

light24bulbs t1_jcqco8i wrote on March 18, 2023 at 7:11 PM

Reply to comment by Prymu in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere

Old reddit does too

Recent comments in /f/MachineLearning