Recent comments in /f/deeplearning
suflaj t1_j731s6u wrote
Reply to comment by Open-Dragonfly6825 in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
I mean kernels in the sense of functions.
> Why wouldn't GPU parallelization make inference faster?
Because most DL models are deep, not particularly wide. As I explained already, deep means a long serial chain. That isn't parallelizable, outside of data parallelism (which doesn't speed up inference) and model parallelism (generally not implemented, and it has heavy IO costs).
Wide models, and how to make them equivalent to deep ones, are largely unexplored, although they are theoretically just as expressive.
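To make the serial-chain point concrete, here is a toy sketch (PyTorch, with an arbitrary layer count and width, just for illustration): each step consumes the previous step's output, so at batch size 1 there is nothing left for extra GPU hardware to work on in parallel.

```python
import torch
import torch.nn as nn

# Toy "deep" model: 50 layers chained one after another.
layers = nn.ModuleList([nn.Linear(256, 256) for _ in range(50)])

x = torch.randn(1, 256)  # single-sample inference, batch size 1
with torch.no_grad():
    for layer in layers:
        # Step i needs step i-1's output, so the chain runs serially;
        # data parallelism only helps when there are more samples to spread out.
        x = torch.relu(layer(x))
```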
Open-Dragonfly6825 OP t1_j72yyst wrote
Reply to comment by AzureNostalgia in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
Could you elaborate on some of the points you make? I have read the opposite of what you say regarding the following points:
- Many scientific works claim that FPGAs have similar or better power (energy) efficiency than GPUs in almost all applications.
- FPGAs are considered a good AI technology for embedded devices where low energy consumption is key. Deep Learning models can be trained somewhere else, using GPUs, and, theoretically, inference can be done on the embedded devices using the FPGAs, for good speed and energy efficiency. (Thus, FPGAs are supposedly well-suited for inference.)
- Modern high-end (data center) FPGAs target 300 MHz as a baseline clock speed. It is not unusual for designs to achieve frequencies higher than 300 MHz. Not much higher, though, unless you heavily optimize the design and use some complex tricks to boost the clock speed.
The comparison you make about the largest FPGA being comparable only to small embedded GPUs is interesting. I might look more into that.
Open-Dragonfly6825 OP t1_j72s5ov wrote
Reply to comment by suflaj in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
One question: what do you mean by "kernels" here? Is it the CNN operation you apply to the layers? (As I said, I am not familiar with Deep Learning, and "kernel" means something else when talking about GPU and FPGA programming.)
I know about TPUs and I understand they are the "best solution" for deep learning. However, I did not mention them since I won't be working with them.
Why wouldn't GPU parallelization make inference faster? Isn't inference composed mainly of matrix multiplications as well? Maybe I don't understand very well how GPU training is performed and how it differs from inference.
Open-Dragonfly6825 OP t1_j72qtao wrote
Reply to comment by yannbouteiller in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
That actually makes sense. FPGAs are very complex to program, even though the gap between software and hardware programming has been narrowed with High Level Synthesis (e.g. OpenCL). I can see how it is just easier to use a GPU that is simpler to program, or a TPU that already has compatible libraries that abstract away the low-level details.
However, FPGAs have been increasing in area and available resources in recent years. Is that still not enough circuitry?
Open-Dragonfly6825 OP t1_j72pzlc wrote
Reply to comment by BellyDancerUrgot in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
FPGAs are reconfigurable hardware accelerators. That is, you could theoretically "synthesize" (implement) any digital circuit on an FPGA, provided the FPGA has a high enough amount of "resources".
This would let the user deploy custom hardware solutions for virtually any application, which could be far more optimized than software solutions (including those using GPUs).
You could implement tensor cores or a TPU using an FPGA. But, obviously, an ASIC is faster and more energy efficient than its equivalent FPGA implementation.
Linking to what you say: besides all the "this is just theory, in practice things are different" caveats of FPGAs, programming GPUs with CUDA is way, way easier than programming FPGAs as of today.
Open-Dragonfly6825 OP t1_j72ow1i wrote
Reply to comment by TheDailySpank in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
It is definitely hard to get started with FPGAs. High Level Synthesis (e.g. OpenCL) has eased the effort in recent years, but it is still particularly... different from regular programming. It requires more thoughtfulness, I would say.
Open-Dragonfly6825 OP t1_j72om7m wrote
Reply to comment by alex_bababu in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
Maybe I missed it, but the posts I read don't specify that. Some scientific works claim that FPGAs are better than GPUs both for training and inference.
Why would you say they are better only for inference? Wouldn't a GPU be faster for inference too? Or is it just that inference doesn't require high speeds, and FPGAs are preferred for their energy efficiency?
zaphodakaphil t1_j71qup9 wrote
This post reminded me of an old article: https://www.damninteresting.com/on-the-origin-of-circuits/
AzureNostalgia t1_j7199dd wrote
Don't listen to anyone saying FPGAs are better than GPUs in AI. They don't know the platforms well enough.
FPGAs are obsolete for AI (training AND inference), and there are many reasons for that: less parallelism, less power efficiency, no scaling, they run at like 300 MHz at best, and they don't have the ecosystem and support GPUs have (i.e. support for models and layers). Even the reduced-precision "advantage" they had is long gone; GPUs can do 8-bit and even FP8 now. Maybe the largest FPGA (for example, a Xilinx Alveo card) can be compared with a small embedded Jetson Xavier in AI (you can compare the performance results from each company to see for yourself).
Wonder why there are no FPGAs in MLPerf (an AI benchmark which has become the standard)? Yeah, you guessed it right. Even Xilinx realized how bad FPGAs are for AI and stopped their production for this reason. They created the new Versal series, which are not even FPGAs; they are more like GPUs (specifically, they work like Nvidia Tensor Cores for AI).
To sum up, FPGAs are worse in everything when compared with GPUs. Throughput, latency, power efficiency, performance/cost, you name it. Simple as that.
alex_bababu t1_j716j6g wrote
Do they say FPGAs are better for training or for running inference on already trained models?
For training, I don't know. For inference, I would say FPGAs are better.
TheDailySpank t1_j70szn3 wrote
I tried to get into FPGAs back in the day but found the hardware/software/über-nerd level of knowledge to be way out of my league. Dove into the whole IP thing/level of logic and found it way above my level of autism.
yannbouteiller t1_j70o6y3 wrote
FPGAs are theoretically better than GPUs for deploying Deep Learning models simply because they are theoretically better than anything at doing anything. In practice, though, you never have enough circuitry on an FPGA to efficiently deploy a large model, and they are not targeted by the main Deep Learning libraries, so you have to do the whole thing by hand, including quantizing your model, extracting its weights, coding each layer in embedded C/VHDL/etc., and doing most of the hardware optimization yourself. It is tedious enough that you end up preferring plug-and-play solutions like GPUs/TPUs in most cases, including embedded systems.
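To give an idea of the "by hand" part, here is a minimal sketch of the quantize-and-extract-weights step (the symmetric int8 scheme, the layer, and the file name are placeholders; a real flow also handles biases, per-channel scales, calibration data, and so on):

```python
import numpy as np
import torch.nn as nn

layer = nn.Linear(128, 64)          # stand-in for one trained layer
w = layer.weight.detach().numpy()

scale = np.abs(w).max() / 127.0     # symmetric int8 quantization
w_q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)

# Dump as a C array that the embedded C / VHDL side can include.
with open("fc1_weights.h", "w") as f:
    f.write(f"static const float FC1_SCALE = {scale}f;\n")
    f.write(f"static const signed char FC1_W[{w_q.size}] = {{")
    f.write(",".join(str(v) for v in w_q.flatten()))
    f.write("};\n")
```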
BellyDancerUrgot t1_j6zyiqm wrote
I'll be honest, I don't really know what FPGAs do or how they do it (I reckon they are an ASIC for matrix operations?), but tensor cores already provide optimization for matrix/tensor operations, and FP16 and mixed precision have been available for quite a few years now. Ada and Hopper even enable insane performance improvements for FP8 operations. Is there any real, verifiable benchmark that compares training and inference time of the two?
On top of that, there's the obvious CUDA monopoly that Nvidia keeps a tight leash on. Without software, even the best hardware is useless, and almost everything is optimized to run on the CUDA backend.
suflaj t1_j6zq1k9 wrote
Well, one reason I can think of is custom kernels. To really get the most out of your model's performance, you will likely be optimizing the kernels you use for your layers, sometimes fusing them. A GPU can't adapt to that as well. The best you can do is use TensorRT to optimize for a specific model of GPU, but why do that when you can create, e.g., the optimal CNN kernel in hardware on an FPGA? On a GPU you can only work with the hardware that came with the GPU.
That being said, this is in regard to processing, not necessarily scaling it up. And maybe it makes sense for inference, where it would be nice to build a processor made specifically to run some architecture, one which doesn't necessarily process things in large batches.
But for training, obviously nothing is going to beat a GPU/TPU cluster because of pricing and seemingly infinite scaling of GPUs. If money is not a problem you can always just buy more GPUs and your training will be faster. But parallelization will probably not make your inference faster, since the "deep" in DL refers to the long serial chain of processing, and that's where a hardware implementation of the optimized model makes sense.
Ideally, though, you'd want a TPU, not FPGA processors. TPUs are cheaper and you can use them for research as well.
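For reference, "optimizing for a specific model of GPU" with TensorRT usually looks something like this (rough sketch; the toy CNN and the exact flags are only illustrative):

```python
import torch
import torch.nn as nn

# Hypothetical small CNN, just to have something to export.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
).eval()

dummy = torch.randn(1, 3, 224, 224)

# Export to ONNX; TensorRT then builds a GPU-specific engine from it,
# fusing ops like Conv+BN+ReLU into single kernels for that exact GPU.
torch.onnx.export(model, dummy, "model.onnx", opset_version=13)

# Afterwards, on the command line (illustrative flags):
#   trtexec --onnx=model.onnx --saveEngine=model.plan --fp16
```

That is about as close as a GPU flow gets to per-model hardware; on an FPGA you would instead bake the kernel itself into the fabric.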
Vegetable-Skill-9700 OP t1_j6zpscd wrote
Reply to comment by grigorij-dataplicity in Launching my first-ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Firstly, by measuring data drift and analyzing user behavior, UpTrain identifies which prompts/questions were unseen by the model or the cases where the user was unsatisfied with the model output. It automatically collects those cases for the model to retrain upon.
Secondly, you can use the package to define a custom rule and filter out relevant data sets to retrain ChatGPT for your use case.
Say you want to use LLM to write product descriptions for Nike shoes and have a database of Nike customer chats:
a) Rachel - I don't like these shoes. I want to return them. How do I do that?
b) Ross - These shoes are great! I love them. I wear them every day while practicing unagi.
c) Chandler - Are there any better shoes than Nike? 👟 😍
You probably want to filter out the cases with positive sentiment or with lots of emojis. With UpTrain, you can easily define such rules as a Python function and collect those cases.
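For illustration, such a rule can be as simple as a plain Python predicate (this is only a sketch, not the exact UpTrain API; the emoji ranges and keyword list are placeholders):

```python
import re

# Rough emoji / symbol ranges; good enough for a filtering heuristic.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def keep_for_retraining(chat: dict) -> bool:
    """Collect chats with positive sentiment or lots of emojis."""
    text = chat["message"]
    has_emoji = bool(EMOJI_PATTERN.search(text))
    sounds_positive = any(w in text.lower() for w in ("love", "great", "best"))
    return has_emoji or sounds_positive

chats = [
    {"user": "Rachel", "message": "I don't like these shoes. I want to return them."},
    {"user": "Ross", "message": "These shoes are great! I love them."},
    {"user": "Chandler", "message": "Are there any better shoes than Nike? 👟 😍"},
]
filtered = [c for c in chats if keep_for_retraining(c)]  # keeps Ross and Chandler
```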
I am working on an example highlighting how all the above can be done. It should be done in a week. Stay tuned!
Vegetable-Skill-9700 OP t1_j6zpmiy wrote
Reply to comment by uwu-dotcom in Launching my first-ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
Hey, so this typically happens when there is a change in vocabulary. Just sharing my experience with this issue: we built a chatbot to answer product onboarding queries, and with a new marketing campaign we got a big influx of younger users. Their questions were generally accompanied by a lot of urban slang and emojis, which our NLP model wasn't equipped to handle, causing the performance to deteriorate.
earthsworld t1_j6z4mrg wrote
Reply to Any of you know a local and Open Source equivalent to Eleven Labs text to speech AI ? by lordnyrox
isn't that state of the art? didn't it just come out like a day or three ago?
what are you hoping for here?
RemindMeBot t1_j6y1i9c wrote
Reply to comment by fasdfasfsd in Any of you know a local and Open Source equivalent to Eleven Labs text to speech AI ? by lordnyrox
I will be messaging you in 2 days on 2023-02-04 18:22:00 UTC to remind you of this link
fasdfasfsd t1_j6y1fvy wrote
Reply to Any of you know a local and Open Source equivalent to Eleven Labs text to speech AI ? by lordnyrox
remind me! 2 days
uwu-dotcom t1_j6xvy86 wrote
Reply to Launching my first-ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700
I've never heard of NLP model accuracy deteriorating over time, and a Google search hasn't yielded anything relevant. Is there a source about this you could point me to?
TaoTeCha t1_j6xutiw wrote
Reply to Any of you know a local and Open Source equivalent to Eleven Labs text to speech AI ? by lordnyrox
Closest you're gonna get is TorToiSe TTS, but it's really not even close. Open source solutions are far behind Eleven Labs.
scottyLogJobs t1_j6xfp9h wrote
Reply to Any of you know a local and Open Source equivalent to Eleven Labs text to speech AI ? by lordnyrox
No, but I had the same thought. Boy, would I love to replace my Alexa with a fully offline, downloaded version of ChatGPT (wondering how big a fully trained model is; I don't think it would require any of the training data to be downloaded) with Paul Bettany's voice à la Jarvis from Iron Man.
BlacksmithNo4415 t1_j6x2xia wrote
Reply to comment by International_Deer27 in Loss function fluctuating by International_Deer27
I've checked for papers that do exactly what you want.
So, as I assumed, this data is time-sensitive and you therefore need an additional temporal dimension.
The model needs to be more complex in order to solve this problem.
I suggest reading this:
https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01736-y
BTW: have you tried grid search for finding the right hyperparameters?
Oh, and your model does improve.
Have you increased the dataset size?
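Regarding grid search, a minimal sketch of what I mean (train_and_eval is a stand-in for your actual training loop and should return a validation loss):

```python
from itertools import product

def train_and_eval(lr, batch_size):
    # Placeholder: plug in your real training + validation here.
    return abs(lr - 1e-3) + abs(batch_size - 32) / 1000  # dummy loss

learning_rates = [1e-2, 1e-3, 1e-4]
batch_sizes = [16, 32, 64]

best = None
for lr, bs in product(learning_rates, batch_sizes):
    val_loss = train_and_eval(lr, bs)
    if best is None or val_loss < best[0]:
        best = (val_loss, lr, bs)

print("best (val_loss, lr, batch_size):", best)
```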
BlacksmithNo4415 t1_j6x1ugb wrote
Reply to comment by International_Deer27 in Loss function fluctuating by International_Deer27
Open-Dragonfly6825 OP t1_j73258d wrote
Reply to comment by suflaj in Why are FPGAs better than GPUs for deep learning? by Open-Dragonfly6825
Ok, that makes sense. Just wanted to confirm I understood it well.
Thank you.