Art10001 t1_jdd0ihw wrote on March 23, 2023 at 2:55 PM

Reply to [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

Great. Neuromorphic technology is genius (or at least cool) and very underappreciated.

Art10001 t1_jdd0ag1 wrote on March 23, 2023 at 2:53 PM

Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

BrainChip has 1 million neurons already. Loihi and Loihi 2 are similar.

CommunismDoesntWork t1_jdcxx5u wrote on March 23, 2023 at 2:38 PM

Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

>But in principle there's nothing standing in the way of building a 100B parameter SNN.

That's awesome. In that case, I'd pivot my research if I were you. These constrained optimization problems on limited hardware are fun and I'm sure they have some legitimate uses, but LLMs have proven that scale is king. Going in the opposite direction and trying to get SNNs to scale to billions of parameters might be world changing.

Because NNs are only going to get bigger and more costly to train. If SNNs and their accelerators can speed up training and ultimately reduce costs, that would be massive. You could be the first person in the world to create a billion parameter SNN. Once you show the world that it's possible, the flood gates will open.

nokpil t1_jdcxrhc wrote on March 23, 2023 at 2:37 PM

Reply to [D] ICML 2023 Reviewer-Author Discussion by zy415

Surprisingly, one of my reviewers actually responded to the answer and raised their score from 5 to 6. Hope this will help someone.

FrereKhan OP t1_jdcvobu wrote on March 23, 2023 at 2:22 PM

Reply to comment by CommunismDoesntWork in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

Sort of yes; Xylo is a general-purpose SNN accelerator, but the scale is for smaller problems, in the order of 1000 neurons.

But in principle there's nothing standing in the way of building a 100B parameter SNN.

1azytux OP t1_jdcvjki wrote on March 23, 2023 at 2:22 PM

Reply to comment by aozorahime in Recent advances in multimodal models: What are your thoughts on chain of thoughts models? [D] by 1azytux

Yes, I have worked with multimodal models before, but I'm still at a nascent stage of discovering the field of NLP. What about you? Are you interested in multimodal models? What's your PhD on?

I was interested in CoT, and more in multimodal ones because of the recent advances of ChatGPT, as it's able to remember previous conversations. I hope this is correct.

Yes, I saw the link and wasn't able to find much about CoT in particular, so I asked you.

I can talk about what I've worked on and what I was trying and want to do in future, maybe in DMs .. ?

localhost80 t1_jdct42q wrote on March 23, 2023 at 2:05 PM

Reply to comment by Different_Prune_3529 in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho

It will have better performance relative to the knowledge in the documents. It's the comparison of GPT-4 with global knowledge vs GPT-4 with local knowledge.

big_ol_tender t1_jdct16f wrote on March 23, 2023 at 2:04 PM

Reply to [P] ChatLLaMA - A ChatGPT style chatbot for Facebook's LLaMA by imgonnarelph

I’d love to try this out but isn’t there an issue with licensing? OpenAI said you can’t use their model output to train competitors to chatgpt (which is total BS) and the alpaca dataset is all davinci output. I’m desperately trying to find some open source alternative that I can use for some experiments at work because I don’t want to give closedai any $.

localhost80 t1_jdcrfd0 wrote on March 23, 2023 at 1:53 PM

Reply to comment by _Arsenie_Boca_ in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho

GPT charges per token, so the cost depends on the length of the document.
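To make the per-token point concrete, here is a rough sketch of the arithmetic. The rate of $0.03 per 1,000 tokens and the 4-characters-per-token rule of thumb are placeholder assumptions for illustration, not quoted prices or an exact tokenizer; the function names are hypothetical.

```python
import math


def rough_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; a real tokenizer will give different counts."""
    return math.ceil(len(text) / chars_per_token)


def estimate_cost_usd(num_tokens: int, usd_per_1k_tokens: float) -> float:
    """Cost of processing num_tokens at a flat per-1k-token rate."""
    return num_tokens / 1000 * usd_per_1k_tokens


# Stand-in for the text extracted from a PDF: 10,000 characters.
document = "word " * 2000
tokens = rough_token_count(document)

# Assumed, hypothetical rate: check the provider's current price list.
cost = estimate_cost_usd(tokens, usd_per_1k_tokens=0.03)
print(f"{tokens} tokens, about ${cost:.3f}")
```

The cost scales linearly with document length, so a document ten times longer costs roughly ten times as much to process.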

Character_Internet_3 t1_jdcqsa1 wrote on March 23, 2023 at 1:48 PM

Reply to [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101

Why SIFT? It's already an established name in computer vision.

ambient_temp_xeno t1_jdcpvhv wrote on March 23, 2023 at 1:41 PM

Reply to comment by ambient_temp_xeno in [D] Running an LLM on "low" compute power machines? by Qwillbehr

*Turns out WSL2 uses half your RAM by default. **13B seems to be, weirdly, not much better (possibly worse) by some accounts anyway.

CommunismDoesntWork t1_jdcpcv9 wrote on March 23, 2023 at 1:38 PM

Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

Are those chips general purpose SNN accelerators in the same way GPUs are general purpose NN accelerators? If so, what's stopping someone from creating a 100B parameter SNN similar to LLMs?

brain_diarrhea t1_jdcovin wrote on March 23, 2023 at 1:34 PM

Reply to [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101

Someone's getting a cease and desist

dancingnightly t1_jdcnhuh wrote on March 23, 2023 at 1:24 PM

Reply to [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho

Will you add semantic chunking?

FrereKhan OP t1_jdcn3b6 wrote on March 23, 2023 at 1:21 PM

Reply to comment by CommunismDoesntWork in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

Yes, a few options. Rockpool is designed to work with SNN chips from SynSense (https://synsense.ai). Intel has Loihi, and there is also Akida from BrainChip…

Mental-Egg-2078 OP t1_jdcn0j7 wrote on March 23, 2023 at 1:20 PM

Reply to comment by breadbrix in GPT-4 For SQL Schema Generation + Unstructured Feature Extraction [D] by Mental-Egg-2078

Fair point, but at what point are these things accepted as providing reasonable assurance (for things like audits)?

I get that the idea of data not being right is scary, but once governing bodies let the tools in, there is no going back. But I agree that in situations where reasonable assurance is not acceptable, you don't want a predictive machine making critical choices.

CommunismDoesntWork t1_jdcloqz wrote on March 23, 2023 at 1:10 PM

Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

Is there specialized hardware for SNNs yet?

mmyjona t1_jdceex2 wrote on March 23, 2023 at 12:08 PM

Reply to comment by Straight-Comb-6956 in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph

No, llama-mps uses the ANE.

GaryS2000 t1_jdcd6xq wrote on March 23, 2023 at 11:56 AM

Reply to comment by fnordstar in [D] Simple Questions Thread by AutoModerator

Yeah the csv file has three columns separated into emotion, pixels, and usage. Emotion corresponds to the labels whereas usage corresponds to training/test/val, and the pixels column is made up of all of the pixel values used to make the image. It seems to produce much quicker training times than using the images, which is my main reason for wanting to use it. Training on .csv takes around 10 seconds per epoch whereas images take 10 minutes or so.

They both produce the same result, a trained model which can make predictions on facial expressions. However, it's felt weird throughout the entire process that the model trains so quickly, you know? I've been led to believe that machine learning is an extremely time-intensive process, but for me it hasn't taken long at all, so I was wondering if there's some fundamental error with using the .csv data instead of the images. Hopefully it should be fine though; I don't see the issue myself if it produces the same result.
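For anyone following along, a minimal sketch of how a CSV like this can be turned back into images. It assumes the pixels column holds space-separated integers in row-major order, that emotion is an integer label, and that the image size is known; the tiny 2x2 sample below is made up for illustration.

```python
import csv
import io


def rows_to_images(csv_text, width, height):
    """Yield (emotion, usage, image), where image is a list of pixel rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        values = [int(v) for v in row["pixels"].split()]
        if len(values) != width * height:
            raise ValueError(f"expected {width * height} pixels, got {len(values)}")
        # Slice the flat pixel list into `height` rows of `width` values each.
        image = [values[r * width:(r + 1) * width] for r in range(height)]
        yield int(row["emotion"]), row["usage"], image


# Made-up sample with the three columns described above.
sample = (
    "emotion,pixels,usage\n"
    "0,0 255 128 64,Training\n"
    "3,10 20 30 40,Test\n"
)
images = list(rows_to_images(sample, width=2, height=2))
print(images[0])
```

The length check is a cheap sanity test: if every row reshapes cleanly to the expected size, the CSV carries the same pixel data as the image files.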

Different_Prune_3529 t1_jdccrmr wrote on March 23, 2023 at 11:52 AM

Reply to [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho

Can it perform as well as OpenAI’s GPT?

andrew21w t1_jdcb0vo wrote on March 23, 2023 at 11:35 AM

Reply to [D] Simple Questions Thread by AutoModerator

Why does nobody use polynomials as activation functions?

My naive impression is that polynomials are the best, since they can approximate nearly any kind of function you like. So they're perfect....

But why aren't they used?
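One way to probe the question numerically is a small sketch that stacks an activation several times with no weights in between. It assumes a plain quadratic p(x) = x^2 as the polynomial and tanh as a bounded reference; both choices are illustrative only. Composing a degree-2 polynomial over n layers gives a polynomial of degree 2^n, so values away from |x| = 1 blow up or collapse quickly.

```python
import math


def stack(activation, x, depth):
    """Apply the same activation `depth` times, with no weights in between."""
    for _ in range(depth):
        x = activation(x)
    return x


def square(x):
    """Quadratic activation p(x) = x^2 (an assumed example polynomial)."""
    return x * x


grown = stack(square, 2.0, 3)       # 2 -> 4 -> 16 -> 256
shrunk = stack(square, 0.5, 3)      # 0.5 -> 0.25 -> 0.0625 -> 0.00390625
bounded = stack(math.tanh, 2.0, 3)  # tanh keeps every value inside (-1, 1)

print(grown, shrunk, bounded)
```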

fnordstar t1_jdcakhx wrote on March 23, 2023 at 11:30 AM

Reply to comment by GaryS2000 in [D] Simple Questions Thread by AutoModerator

Ohh ok, I wouldn't have thought someone would put pixel data in a CSV.

GaryS2000 t1_jdc74v5 wrote on March 23, 2023 at 10:52 AM

Reply to comment by fnordstar in [D] Simple Questions Thread by AutoModerator

Like I said, the .csv data. It's the same data as the image dataset, with one of the columns containing the pixel values of the images, meaning it can reconstruct the image from the file.

[deleted] t1_jdc71bq wrote on March 23, 2023 at 10:51 AM

Reply to comment by fnordstar in [D] Simple Questions Thread by AutoModerator

[deleted]

FrereKhan OP t1_jdc5qes wrote on March 23, 2023 at 10:36 AM

Reply to [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan

This manuscript describes the details for neuron and synapse simulations for a mixed-signal neuromorphic spiking NN device, as well as the training, quantisation and deployment pipeline.

The idea is to build SNN applications, trained using gradient methods, that are robust against the mismatch exhibited by mixed-signal devices in general. By including a detailed trainable simulation of the neuron and synapse models, as well as trainable hardware-verified parameter mismatch models, you can perform backprop training of SNNs that are still functional when deployed to hardware, without per-chip calibration or tweaking.
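As a toy illustration of the general idea of training with sampled mismatch in the loop (not the Rockpool API, and not the manuscript's hardware-verified mismatch model), the sketch below assumes Gaussian multiplicative mismatch on a single weight of a linear model rather than an SNN. Each training step sees a freshly perturbed copy of the parameter, so the nominal value is pushed towards one that works across the perturbations. All names and constants are hypothetical.

```python
import random


def train(sigma, steps=500, lr=0.05, seed=0):
    """Fit y = w * x by SGD while sampling a mismatched weight at every step."""
    rng = random.Random(seed)
    w = 0.0           # nominal (stored) parameter
    x, y = 1.0, 2.0   # single training example, so the ideal w is 2.0
    for _ in range(steps):
        # Assumed mismatch model: the deployed weight is w * (1 + sigma * eps).
        gain = 1.0 + sigma * rng.gauss(0.0, 1.0)
        y_hat = w * gain * x
        # Gradient of (y_hat - y)^2 with respect to the nominal weight w.
        grad = 2.0 * (y_hat - y) * gain * x
        w -= lr * grad
    return w


w_nominal = train(sigma=0.0)  # no mismatch: converges to 2.0
w_robust = train(sigma=0.1)   # assumed 10% mismatch: settles near 2.0
print(w_nominal, w_robust)
```

In this toy setting the expected loss under mismatch is (w - 2)^2 + sigma^2 * w^2, so the noise acts like a mild regulariser on the nominal weight.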

Previously it was very difficult to build functioning SNNs for these devices, requiring lots of hand-tweaking and/or device calibration. With these new tools the aim is to train once, then deploy at scale to many chips, with some guarantees about performance degradation.

We've integrated these tools into the open-source deep SNN library Rockpool (https://rockpool.ai).

Manuscript abstract:

Mixed-signal neuromorphic processors provide extremely low-power operation for edge inference workloads, taking advantage of sparse asynchronous computation within Spiking Neural Networks (SNNs). However, deploying robust applications to these devices is complicated by limited controllability over analog hardware parameters and by unintended parameter and dynamics variations of analog circuits due to fabrication non-idealities. Here we demonstrate a novel methodology for offline training and deployment of spiking neural networks (SNNs) to the mixed-signal neuromorphic processor Dynap-SE2. The methodology utilizes an unsupervised weight quantization method to optimize the network's parameters, coupled with adversarial parameter noise injection during training. The optimized network is shown to be robust to the effects of quantization and device mismatch, making the method a promising candidate for real-world applications with hardware constraints. This work extends Rockpool, an open-source deep-learning library for SNNs, with support for accurate simulation of mixed-signal SNN dynamics. Our approach simplifies the development and deployment process for the neuromorphic community, making mixed-signal neuromorphic processors more accessible to researchers and developers.
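To give a feel for what weight quantization does to a network's parameters, here is a minimal sketch using plain symmetric uniform rounding. This is a stand-in technique chosen for simplicity, not the unsupervised quantization method from the manuscript, and the 4-bit width and example weights are arbitrary assumptions.

```python
def quantize(weights, bits):
    """Map float weights to signed integer codes plus a shared scale factor."""
    levels = 2 ** (bits - 1) - 1           # e.g. 7 positive levels for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0 for _ in weights], 1.0   # all-zero weights need no scaling
    scale = max_abs / levels
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover the approximate float weights from integer codes."""
    return [c * scale for c in codes]


weights = [0.7, -0.3, 0.12, 0.0]           # made-up example weights
codes, scale = quantize(weights, bits=4)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The rounding error per weight is at most half a quantization step, which is the kind of perturbation a network has to tolerate after deployment to low-precision hardware.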

Recent comments in /f/MachineLearning