Recent comments in /f/MachineLearning
Art10001 t1_jdd0ag1 wrote
Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
BrainChip's Akida already has 1 million neurons. Loihi and Loihi 2 are similar.
CommunismDoesntWork t1_jdcxx5u wrote
Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
>But in principle there's nothing standing in the way of building a 100B parameter SNN.
That's awesome. In that case, I'd pivot my research if I were you. These constrained optimization problems on limited hardware are fun, and I'm sure they have some legitimate uses, but LLMs have proven that scale is king. Going in the opposite direction and trying to get SNNs to scale to billions of parameters might be world-changing.
NNs are only going to get bigger and more costly to train. If SNNs and their accelerators can speed up training and ultimately reduce costs, that would be massive. You could be the first person in the world to create a billion-parameter SNN. Once you show the world that it's possible, the floodgates will open.
nokpil t1_jdcxrhc wrote
Reply to [D] ICML 2023 Reviewer-Author Discussion by zy415
Surprisingly, one of my reviewers actually responded to my answer and raised their score from 5 to 6. Hope this helps someone.
FrereKhan OP t1_jdcvobu wrote
Reply to comment by CommunismDoesntWork in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
Sort of, yes; Xylo is a general-purpose SNN accelerator, but it's scaled for smaller problems, on the order of 1000 neurons.
But in principle there's nothing standing in the way of building a 100B parameter SNN.
1azytux OP t1_jdcvjki wrote
Reply to comment by aozorahime in Recent advances in multimodal models: What are your thoughts on chain of thoughts models? [D] by 1azytux
Yes, I have worked with multimodal models before, but I'm still at a nascent stage of exploring the field of NLP. What about you? Are you interested in multimodal models? What's your PhD on?
I was interested in CoT, and more so in multimodal models, because of the recent advances in ChatGPT, as it's able to remember previous conversations. I hope that's correct.
Yes, I saw the link, but wasn't able to find much about CoT in particular, so I asked you.
I can talk about what I've worked on and what I've been trying and want to do in the future, maybe in DMs .. ?
localhost80 t1_jdct42q wrote
Reply to comment by Different_Prune_3529 in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho
It will perform better on questions about the knowledge in the documents. It's the comparison of GPT-4 with global knowledge vs. GPT-4 with local knowledge.
big_ol_tender t1_jdct16f wrote
I'd love to try this out, but isn't there an issue with licensing? OpenAI said you can't use their model output to train competitors to ChatGPT (which is total BS), and the Alpaca dataset is all davinci output. I'm desperately trying to find an open-source alternative that I can use for some experiments at work, because I don't want to give closedai any $.
localhost80 t1_jdcrfd0 wrote
Reply to comment by _Arsenie_Boca_ in [P] Open-source GPT4 & LangChain Chatbot for large PDF docs by radi-cho
GPT charges per token, so it depends on the length of the document.
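As a rough back-of-the-envelope (a minimal sketch; both numbers below are placeholders, not actual OpenAI pricing):

```python
# Illustrative cost estimate only; substitute current per-token pricing.
doc_tokens = 50_000            # hypothetical token count for a large PDF
price_per_1k_tokens = 0.03     # hypothetical $ per 1K prompt tokens
cost = doc_tokens / 1000 * price_per_1k_tokens
print(f"~${cost:.2f} to push the whole document through once")
```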
Character_Internet_3 t1_jdcqsa1 wrote
Reply to [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Why "SIFT"? That's already a reserved name in computer vision.
ambient_temp_xeno t1_jdcpvhv wrote
Reply to comment by ambient_temp_xeno in [D] Running an LLM on "low" compute power machines? by Qwillbehr
*Turns out WSL2 uses half your RAM by default. **13B seems to be weirdly not much better, and possibly worse, by some accounts anyway.
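For reference, that default can be raised with a .wslconfig file in your Windows user profile folder (the value below is just an example, not a recommendation):

```
[wsl2]
memory=24GB
```

WSL2 needs a restart (e.g. `wsl --shutdown`) for the change to take effect.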
CommunismDoesntWork t1_jdcpcv9 wrote
Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
Are those chips general purpose SNN accelerators in the same way GPUs are general purpose NN accelerators? If so, what's stopping someone from creating a 100B parameter SNN similar to LLMs?
brain_diarrhea t1_jdcovin wrote
Reply to [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Someone's getting a cease and desist
dancingnightly t1_jdcnhuh wrote
Will you add semantic chunking?
FrereKhan OP t1_jdcn3b6 wrote
Reply to comment by CommunismDoesntWork in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
Yes, there are a few options. Rockpool is designed to work with SNN chips from SynSense (https://synsense.ai). Intel has Loihi, and there's also Akida from BrainChip…
Mental-Egg-2078 OP t1_jdcn0j7 wrote
Reply to comment by breadbrix in GPT-4 For SQL Schema Generation + Unstructured Feature Extraction [D] by Mental-Egg-2078
Fair point, but at what point are these things accepted as providing reasonable assurance (for things like audits)?
I get that the idea of the data not being right is scary, but once governing bodies let the tools in, there's no going back. That said, I agree that in situations where reasonable assurance isn't acceptable, you don't want a predictive machine making critical choices.
CommunismDoesntWork t1_jdcloqz wrote
Reply to comment by FrereKhan in [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
Is there specialized hardware for SNNs yet?
mmyjona t1_jdceex2 wrote
Reply to comment by Straight-Comb-6956 in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
No, llama-mps uses the ANE (Apple Neural Engine).
GaryS2000 t1_jdcd6xq wrote
Reply to comment by fnordstar in [D] Simple Questions Thread by AutoModerator
Yeah, the csv file has three columns: emotion, pixels, and usage. Emotion corresponds to the labels, usage corresponds to training/test/val, and the pixels column is made up of all of the pixel values used to make the image. It seems to produce much quicker training times than using the images, which is my main reason for wanting to use it. Training on the .csv takes around 10 seconds per epoch, whereas the images take 10 minutes or so.
They both produce the same result, a trained model which can make predictions on facial expressions. However, it's felt weird throughout the entire process that the model trains so quickly, you know? I've been led to believe that machine learning is an extremely time-intensive process, but for me it hasn't taken long at all, so I was wondering if there's some fundamental error with using the .csv data instead of the images. Hopefully it's fine, though; I don't see the issue myself if it produces the same result.
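For what it's worth, here's a minimal loading sketch, assuming a FER2013-style CSV where the pixels column is a space-separated string of 48x48 grayscale values (the column names, image size, and split labels are assumptions; adjust to your file):

```python
import numpy as np
import pandas as pd

# Assumed layout: columns "emotion", "pixels", "usage", where "pixels"
# holds 48*48 space-separated grayscale values per row.
df = pd.read_csv("fer.csv")

labels = df["emotion"].to_numpy()
images = np.stack([
    np.array(p.split(), dtype=np.uint8).reshape(48, 48)
    for p in df["pixels"]
])

# Split by the "usage" column.
train_mask = (df["usage"] == "Training").to_numpy()
x_train, y_train = images[train_mask], labels[train_mask]
```

If the decoded pixels match the image files, training on them is equivalent; the speed difference most likely comes from skipping per-image file I/O and decoding, not from anything fundamentally wrong.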
Different_Prune_3529 t1_jdccrmr wrote
Can it perform as well as OpenAI's GPT?
andrew21w t1_jdcb0vo wrote
Reply to [D] Simple Questions Thread by AutoModerator
Why does nobody use polynomials as activation functions?
My impression is that polynomials should be ideal, since they can approximate nearly any kind of function you like. So they seem perfect...
But why aren't they used?
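For concreteness, a learnable polynomial activation is easy to write down; here's a minimal PyTorch sketch (my own illustration, with hypothetical names):

```python
import torch
import torch.nn as nn

class PolyActivation(nn.Module):
    """Learnable cubic activation: c0 + c1*x + c2*x^2 + c3*x^3."""
    def __init__(self):
        super().__init__()
        # Initialise to the identity function, x.
        self.coeffs = nn.Parameter(torch.tensor([0.0, 1.0, 0.0, 0.0]))

    def forward(self, x):
        return sum(c * x**i for i, c in enumerate(self.coeffs))
```

One caveat worth noting: unlike bounded activations such as tanh, x**3 grows quickly outside [-1, 1], so activations and gradients can blow up in deep stacks.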
fnordstar t1_jdcakhx wrote
Reply to comment by GaryS2000 in [D] Simple Questions Thread by AutoModerator
Ohh ok, wouldn't have thought someone would put pixel data in a CSV.
GaryS2000 t1_jdc74v5 wrote
Reply to comment by fnordstar in [D] Simple Questions Thread by AutoModerator
Like I said, the .csv data. It's the same data as the image dataset, with one of the columns containing the pixel values of the images, meaning the images can be reconstructed from the file.
[deleted] t1_jdc71bq wrote
Reply to comment by fnordstar in [D] Simple Questions Thread by AutoModerator
[deleted]
FrereKhan OP t1_jdc5qes wrote
Reply to [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
This manuscript describes the neuron and synapse simulations for a mixed-signal neuromorphic spiking NN device, as well as the training, quantisation and deployment pipeline.
The idea is to build SNN applications, trained using gradient methods, that are robust against the mismatch exhibited by mixed-signal devices in general. By including a detailed trainable simulation of the neuron and synapse models, as well as trainable hardware-verified parameter mismatch models, you can perform backprop training of SNNs that are still functional when deployed to hardware, without per-chip calibration or tweaking.
Previously it was very difficult to build functioning SNNs for these devices, requiring lots of hand-tweaking and/or device calibration. With these new tools the aim is to train once, then deploy at scale to many chips, with some guarantees about performance degradation.
We've integrated these tools into the open-source deep SNN library Rockpool (https://rockpool.ai).
Manuscript abstract:
Mixed-signal neuromorphic processors provide extremely low-power operation for edge inference workloads, taking advantage of sparse asynchronous computation within Spiking Neural Networks (SNNs). However, deploying robust applications to these devices is complicated by limited controllability over analog hardware parameters, as well as unintended parameter and dynamics variations of analog circuits due to fabrication non-idealities. Here we demonstrate a novel methodology for offline training and deployment of spiking neural networks (SNNs) to the mixed-signal neuromorphic processor Dynap-SE2. The methodology utilizes an unsupervised weight quantization method to optimize the network's parameters, coupled with adversarial parameter noise injection during training. The optimized network is shown to be robust to the effects of quantization and device mismatch, making the method a promising candidate for real-world applications with hardware constraints. This work extends Rockpool, an open-source deep-learning library for SNNs, with support for accurate simulation of mixed-signal SNN dynamics. Our approach simplifies the development and deployment process for the neuromorphic community, making mixed-signal neuromorphic processors more accessible to researchers and developers.
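As an illustration of the noise-injection idea, here's a minimal PyTorch-style sketch under my own assumptions; it uses random rather than adversarial parameter noise for brevity, and is not the actual Rockpool pipeline:

```python
import torch

def train_step_with_mismatch(model, x, y, loss_fn, opt, rel_sigma=0.2):
    """One mismatch-aware training step: evaluate gradients at randomly
    perturbed weights (a stand-in for device mismatch), then apply them
    to the clean weights. rel_sigma is a hypothetical mismatch level."""
    clean = {n: p.detach().clone() for n, p in model.named_parameters()}
    with torch.no_grad():
        for p in model.parameters():
            p.mul_(1.0 + rel_sigma * torch.randn_like(p))  # inject mismatch
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                      # gradients at the noisy parameters
    with torch.no_grad():
        for n, p in model.named_parameters():
            p.copy_(clean[n])            # restore the clean parameters
    opt.step()                           # update clean weights, noisy grads
    return loss.item()
```

An adversarial variant, as in the manuscript, would choose the perturbation direction that maximally increases the loss instead of sampling it at random.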
Art10001 t1_jdd0ihw wrote
Reply to [P] New toolchain to train robust spiking NNs for mixed-signal Neuromorphic chips by FrereKhan
Great. Neuromorphic technology is genius (or at least cool) and very underappreciated.