Recent comments in /f/MachineLearning

CommunismDoesntWork t1_jdcxx5u wrote

>But in principle there's nothing standing in the way of building a 100B parameter SNN.

That's awesome. In that case, I'd pivot my research if I were you. These constrained optimization problems on limited hardware are fun and I'm sure they have some legitimate uses, but LLMs have proven that scale is king. Going in the opposite direction and trying to get SNNs to scale to billions of parameters might be world-changing.

NNs are only going to get bigger and more costly to train. If SNNs and their accelerators can speed up training and ultimately reduce costs, that would be massive. You could be the first person in the world to create a billion-parameter SNN. Once you show the world that it's possible, the floodgates will open.

0

1azytux OP t1_jdcvjki wrote

Yes, I have worked with multimodal models before, but I'm still at a nascent stage of discovering the field of NLP. What about you? Are you interested in multimodal models? What's your PhD on?

I was interested in CoT, and even more in the multimodal ones, because of the recent advances in ChatGPT, as it's able to remember previous conversations. I hope this is correct.

Yes, I saw the link but wasn't able to find much about CoT in particular, so I asked you.

I can talk about what I've worked on, what I was trying, and what I want to do in the future, maybe in DMs?

1

big_ol_tender t1_jdct16f wrote

I’d love to try this out but isn’t there an issue with licensing? OpenAI said you can’t use their model output to train competitors to ChatGPT (which is total BS) and the Alpaca dataset is all davinci output. I’m desperately trying to find some open source alternative that I can use for some experiments at work because I don’t want to give closedai any $.

1

Mental-Egg-2078 OP t1_jdcn0j7 wrote

Fair point, but at what point are these things accepted as providing reasonable assurance (for things like audits)?

I get that the idea of data not being right is scary, but once governing bodies let the tools in, there is no going back. I do agree that in situations where reasonable assurance is not acceptable, you don't want a predictive machine making critical choices.

1

GaryS2000 t1_jdcd6xq wrote

Yeah, the csv file has three columns: emotion, pixels, and usage. Emotion corresponds to the labels, usage corresponds to training/test/val, and the pixels column is made up of all of the pixel values used to make the image. It seems to produce much quicker training times than using the images, which is my main reason for wanting to use it. Training on .csv takes around 10 seconds per epoch whereas images take 10 minutes or so.

They both produce the same result, a trained model which can make predictions on facial expressions. However, it's felt weird throughout the entire process that the model trains so quickly, you know? I've been led to believe that machine learning is an extremely time-intensive process, but for me it hasn't taken long at all, so I was wondering if there's some fundamental error with using the .csv data instead of the images. Hopefully it should be fine though; I don't see the issue myself if it produces the same result.
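For reference, reading a file laid out like that might look roughly like the sketch below. It assumes the pixels column holds space-separated grayscale values and that each image is square; the sample data, the `load_split` helper, and the `side` parameter are made up for illustration. Loading this way skips decoding individual image files, which may be part of why it runs faster.

```python
import csv
import io

# Tiny synthetic stand-in for the file described above (hypothetical data).
# Assumption: the pixels column holds space-separated grayscale values.
SAMPLE_CSV = """emotion,pixels,usage
0,0 255 128 64,Training
3,10 20 30 40,Test
"""


def load_split(handle, usage, side):
    """Return (images, labels) for rows whose usage column equals `usage`.

    Each image is a list of `side` rows, each a list of `side` floats in [0, 1].
    """
    images, labels = [], []
    for row in csv.DictReader(handle):
        if row["usage"] != usage:
            continue
        values = [int(v) / 255.0 for v in row["pixels"].split()]
        if len(values) != side * side:
            raise ValueError("unexpected pixel count: %d" % len(values))
        # Cut the flat pixel list into `side` rows of `side` values each.
        images.append([values[r * side:(r + 1) * side] for r in range(side)])
        labels.append(int(row["emotion"]))
    return images, labels


train_images, train_labels = load_split(io.StringIO(SAMPLE_CSV), "Training", side=2)
print(len(train_images), train_labels)
```

With a real file you would pass an open file handle instead of the `io.StringIO` wrapper and set `side` to the actual image width.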

1

andrew21w t1_jdcb0vo wrote

Why does nobody use polynomials as activation functions?

My naive impression is that polynomials are the best since they can approximate nearly any kind of function you like. So they're perfect...

But why aren't they used?
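One thing that can be checked numerically is how a polynomial behaves when it is applied over and over, the way an activation is across stacked layers. The toy sketch below ignores weights entirely and uses `x * x` as an arbitrary example polynomial, so it is only an illustration of unbounded growth, not a full answer to the question.

```python
import math


def compose(activation, x, depth):
    """Apply `activation` to x repeatedly, as a stand-in for stacked layers."""
    for _ in range(depth):
        x = activation(x)
    return x


def square(x):
    # Degree-2 polynomial activation (illustrative choice, not from the thread).
    return x * x


# Mathematically this is 1.5 ** (2 ** 10): the degree doubles with each layer.
deep_poly = compose(square, 1.5, depth=10)
# tanh is bounded, so the repeated application stays inside (-1, 1).
deep_tanh = compose(math.tanh, 1.5, depth=10)
print(deep_poly, deep_tanh)
```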

2

FrereKhan OP t1_jdc5qes wrote

This manuscript describes the details for neuron and synapse simulations for a mixed-signal neuromorphic spiking NN device, as well as the training, quantisation and deployment pipeline.

The idea is to build SNN applications, trained using gradient methods, that are robust against the mismatch exhibited by mixed-signal devices in general. By including a detailed trainable simulation of the neuron and synapse models, as well as trainable hardware-verified parameter mismatch models, you can perform backprop training of SNNs that are still functional when deployed to hardware, without per-chip calibration or tweaking.

Previously it was very difficult to build functioning SNNs for these devices, requiring lots of hand-tweaking and/or device calibration. With these new tools the aim is to train once, then deploy at scale to many chips, with some guarantees about performance degradation.
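To give a rough feel for what "robust against mismatch" means, here is a toy sketch that perturbs a fixed set of parameters and measures how much the output moves across many simulated chips. It is not Rockpool's API and not the hardware-verified mismatch model from the manuscript: the multiplicative Gaussian noise, the `perturb` and `output` helpers, and all the numbers are assumptions chosen for illustration. In training, the same kind of perturbation would be applied in the forward pass so that the optimiser sees mismatched parameters.

```python
import random


def perturb(params, rel_std, rng):
    """Return a copy of params with multiplicative Gaussian noise applied.

    Assumption: mismatch is modelled as independent relative deviations;
    the manuscript's hardware-verified mismatch model is not reproduced here.
    """
    return [p * (1.0 + rng.gauss(0.0, rel_std)) for p in params]


def output(params, inputs):
    # Hypothetical stand-in for a network: a plain weighted sum.
    return sum(p * x for p, x in zip(params, inputs))


rng = random.Random(0)
nominal = [0.5, -1.0, 2.0]
inputs = [1.0, 1.0, 1.0]
target = output(nominal, inputs)

# Simulate deploying the same parameters to many mismatched "chips".
errors = [abs(output(perturb(nominal, 0.1, rng), inputs) - target)
          for _ in range(1000)]
mean_error = sum(errors) / len(errors)
print("mean absolute output deviation:", mean_error)
```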

We've integrated these tools into the open-source deep SNN library Rockpool (https://rockpool.ai).


Manuscript abstract:

Mixed-signal neuromorphic processors provide extremely low-power operation for edge inference workloads, taking advantage of sparse asynchronous computation within Spiking Neural Networks (SNNs). However, deploying robust applications to these devices is complicated by limited controllability over analog hardware parameters and by unintended parameter and dynamics variations of analog circuits due to fabrication non-idealities. Here we demonstrate a novel methodology for offline training and deployment of SNNs to the mixed-signal neuromorphic processor Dynap-SE2. The methodology utilizes an unsupervised weight quantization method to optimize the network's parameters, coupled with adversarial parameter noise injection during training. The optimized network is shown to be robust to the effects of quantization and device mismatch, making the method a promising candidate for real-world applications with hardware constraints. This work extends Rockpool, an open-source deep-learning library for SNNs, with support for accurate simulation of mixed-signal SNN dynamics. Our approach simplifies the development and deployment process for the neuromorphic community, making mixed-signal neuromorphic processors more accessible to researchers and developers.

2