Recent comments in /f/MachineLearning
JJtheSucculent t1_jd3axwm wrote
Reply to [Project] Machine Learning for Audio: A library for audio analysis, feature extraction, etc by Leo_D517
This is cool. I’m curious to try it out for an audio side project.
sshmessiah t1_jd3atj4 wrote
Reply to comment by ortegaalfredo in [R] Created a Discord server with LLaMA 13B by ortegaalfredo
Can I get the invite link? This link has expired.
Xotchkass t1_jd3alix wrote
Reply to comment by YouAgainShmidhoobuh in [D] Simple Questions Thread by AutoModerator
thanks
pier4r t1_jd39md4 wrote
Reply to comment by currentscurrents in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
> Llamma.cpp uses the neural engine
I tried to find confirmation of this but couldn't. I saw some ports, but they weren't from the LLaMA team. Do you have any source?
i_sanitize_my_hands OP t1_jd35963 wrote
Reply to comment by Keiny in [D] Determining quality of training images with some metrics by i_sanitize_my_hands
Oh, I didn't think of going down the game theory route. Cool, thanks!!
Definitely_not_gpt3 t1_jd2wyt4 wrote
I asked it about nukes.
"Today, nuclear weapons are among the most powerful weapons in the world and have been used in multiple conflicts, including the Cold War and the wars in Afghanistan and Iraq. The development and use of nuclear weapons has had a profound impact on the world and continues to be a major concern for governments around the world."
Keiny t1_jd2vgtk wrote
Someone suggested active learning, but it may be more suitable to look into the subfield of data valuation.
Data valuation broadly aims to assign values to data points that represent their contribution to a model’s overall performance. Many methods are based on game theoretic solution concepts such as the Shapley value and are therefore very expensive to compute. In practical settings, I would suggest the Shapley over kNN surrogate by Jia et al. (2019) or LAVA by Just et al. (2023).
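If it helps, here is a minimal sketch of the KNN-Shapley recursion from Jia et al. (2019) for a single validation point; the function name, Euclidean distance, and plain numpy setup are my own illustrative choices, not the paper's code:

```python
import numpy as np

def knn_shapley(X_train, y_train, x_val, y_val, K=5):
    """Exact Shapley values of training points under a K-NN utility,
    for one validation point (recursion from Jia et al., 2019)."""
    N = len(X_train)
    # Sort training points by distance to the validation point (nearest first).
    order = np.argsort(np.linalg.norm(X_train - x_val, axis=1))
    match = (y_train[order] == y_val).astype(float)

    s = np.zeros(N)
    s[N - 1] = match[N - 1] / N
    for j in range(N - 2, -1, -1):  # walk from farthest back to nearest
        s[j] = s[j + 1] + (match[j] - match[j + 1]) / K * min(K, j + 1) / (j + 1)

    values = np.zeros(N)
    values[order] = s  # undo the sort so values align with the original order
    return values
```

Averaging these values over a validation set gives each training point's overall contribution in roughly O(N log N) per validation point, instead of the exponential cost of the generic Shapley value.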
You can find more papers at the GitHub repo awesome-data-valuation.
Hope that helps!
SWESWESWEh t1_jd2s9ml wrote
Reply to comment by wojtek15 in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
Unfortunately, most code out there calls CUDA explicitly rather than checking which GPU type you have and using that. You can fix this yourself (I use an M1 MacBook Pro for ML and it is quite powerful), but you need to know what you're doing and it's just more work. You might also run into situations where things are not fully implemented in Metal Performance Shaders (the Mac equivalent of CUDA), but Apple does put a lot of resources into making this better.
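A minimal device-agnostic pattern in PyTorch looks like this (assuming a recent build with the MPS backend; the toy model is just for illustration):

```python
import torch
import torch.nn as nn

# Pick the best available backend instead of hard-coding "cuda".
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():  # Metal Performance Shaders on Apple silicon
    device = torch.device("mps")
else:
    device = torch.device("cpu")

model = nn.Linear(16, 4).to(device)    # toy model for illustration
x = torch.randn(8, 16, device=device)
y = model(x)                           # same code runs on CUDA, MPS, or CPU
```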
Educational-Net303 t1_jd2rsax wrote
Reply to comment by 42gether in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
My whole point is that it will take years before we get consumer GPUs with 48 GB of VRAM. You just proved my point again without even reading it.
42gether t1_jd2rfb6 wrote
Reply to comment by Educational-Net303 in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
Okay, thank you for your input.
And?
Newsflash: everything we did started because some cunt felt like growing lungs and wanting oxygen from the air.
It all takes time. What are you trying to argue?
YouAgainShmidhoobuh t1_jd2qmh1 wrote
Reply to comment by darthstargazer in [D] Simple Questions Thread by AutoModerator
Not entirely the same thing. VAEs offer approximate likelihood estimation, not exact. The difference is key: VAEs do not optimize the log-likelihood directly; they optimize the evidence lower bound (ELBO), an approximation. Flow-based methods are exact: we go from an easy, tractable distribution to a more complex one, and the change-of-variables theorem guarantees at each step that the learned distribution is a legitimate probability distribution.
Of course, both (try to) learn some probability distribution of the training data, and that is how they differ from GAN approaches, which do not directly learn a probability distribution.
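Concretely, the two objectives look like this (standard notation, not specific to the paper below):

```latex
% VAE: maximize a lower bound (the ELBO) on the log-likelihood
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
                 - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)

% Normalizing flow: exact log-likelihood via change of variables
\log p_X(x) \;=\; \log p_Z\!\big(f_\theta(x)\big)
              + \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|
```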
For more insight you might want to look at https://openreview.net/pdf?id=HklKEUUY_E
juliensalinas OP t1_jd2owfz wrote
Reply to comment by No_Combination_6429 in [D] An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset by juliensalinas
Sure. Here's the repo I used for the fine-tuning: https://github.com/kingoflolz/mesh-transformer-jax. I used 5 epochs, and apart from that I kept the default parameters in the repo.
I haven't tried the LoRA approach yet. Do you think it could improve quality?
YouAgainShmidhoobuh t1_jd2ojml wrote
Reply to comment by Xotchkass in [D] Simple Questions Thread by AutoModerator
If you mean the context/sequence length, it's 2048 (https://github.com/facebookresearch/llama/pull/127).
TimelySuccess7537 t1_jd2ogz2 wrote
Reply to comment by Nikelui in [P] TherapistGPT by SmackMyPitchHup
I'm not sure we're actually disagreeing here. Anyway, good luck to all of us :)
YouAgainShmidhoobuh t1_jd2n2v5 wrote
Reply to [D]: Vanishing Gradients and Resnets by Blutorangensaft
ResNets do not tackle the vanishing gradient problem. The authors specifically mention that the issue of vanishing gradients was already fixed because of BatchNorm in particular. So removing BatchNorm from the equation will most likely lead to vanishing gradients.
I am assuming you are doing a WGAN approach since that would explain the gradient penalty violation. In this case, use LayerNorm as indicated here: https://github.com/LynnHo/DCGAN-LSGAN-WGAN-GP-DRAGAN-Tensorflow-2/issues/3
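For reference, a sketch of a residual block with LayerNorm in place of BatchNorm (PyTorch; the fixed spatial size is something LayerNorm over (C, H, W) requires, and all shapes here are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ResBlockLN(nn.Module):
    """Residual block normalized with LayerNorm. Unlike BatchNorm, LayerNorm
    treats each sample independently, so it does not interfere with the
    per-sample gradient penalty used in WGAN-GP critics."""
    def __init__(self, channels: int, spatial: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.norm1 = nn.LayerNorm([channels, spatial, spatial])
        self.norm2 = nn.LayerNorm([channels, spatial, spatial])
        self.act = nn.ReLU()

    def forward(self, x):
        h = self.act(self.norm1(self.conv1(x)))
        h = self.norm2(self.conv2(h))
        return self.act(h + x)  # skip connection keeps gradients flowing

block = ResBlockLN(channels=64, spatial=32)
out = block(torch.randn(2, 64, 32, 32))
```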
benfavre t1_jd2n1cg wrote
Reply to comment by cbsudux in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
One epoch of fine-tuning the 30B model with the llama-lora implementation, mini-batch-size=2, maxlen=384, takes about 11 hours.
deep_alichemist t1_jd2mzoh wrote
Reply to [D]: Vanishing Gradients and Resnets by Blutorangensaft
Use some kind of normalization in addition to skip connections. Skip connections alone are not enough, unless you carefully tune everything (e.g. Fixup initialization: https://arxiv.org/abs/1901.09321).
UnusualClimberBear t1_jd2myu1 wrote
Sounds like a rebranding of Inductive Logic Programming. It does not scale, while all recent advances are about scaling simple systems. Consider that for a vanilla transformer the bottleneck is often attention, because its cost grows as N^2 with sequence length, which is why people are switching to linear attention.
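To make the N^2 point concrete, a toy comparison (PyTorch; the elu+1 feature map follows Katharopoulos et al., 2020, everything else is illustrative):

```python
import torch
import torch.nn.functional as F

N, d = 1024, 64
Q, K, V = (torch.randn(N, d) for _ in range(3))

# Vanilla attention materializes an N x N matrix: O(N^2 * d) time and memory.
vanilla = torch.softmax(Q @ K.T / d**0.5, dim=-1) @ V

# Linear attention (feature map phi = elu + 1, Katharopoulos et al., 2020)
# reassociates the product so only d x d matrices appear: O(N * d^2).
phi = lambda x: F.elu(x) + 1
num = phi(Q) @ (phi(K).T @ V)                      # (N,d) @ (d,d)
den = phi(Q) @ phi(K).sum(dim=0, keepdim=True).T   # (N,1) normalizer
out = num / den
```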
Nikelui t1_jd2mnj8 wrote
Reply to comment by TimelySuccess7537 in [P] TherapistGPT by SmackMyPitchHup
If I search on Google "how to do cardiac bypass surgery", it will give me any number of results about cardiac bypass. That doesn't mean Google is a medical aid or a reliable surgery assistant. It's a search engine; it's used to find stuff.
TimelySuccess7537 t1_jd2m4kd wrote
Reply to comment by Nikelui in [P] TherapistGPT by SmackMyPitchHup
I just wrote to ChatGPT "I'm feeling a bit depressed, can you help me?"
Here is the reply:
"I'm sorry to hear that you're feeling depressed. While I am just a language model and cannot provide medical advice or treatment, I am here to offer support and guidance.
It's important to remember that depression is a medical condition and it's not something that you can just snap out of or will away. However, there are things you can do to help manage your symptoms and start feeling better.
Here are a few suggestions:
Talk to someone: Talking to a trusted friend or family member can help you feel less alone and provide emotional support.
Seek professional help: Consider speaking with a therapist or a mental health professional who can provide you with the appropriate diagnosis and treatment.
Get active: Regular exercise can help alleviate depression symptoms by releasing endorphins, which can improve your mood.
Practice self-care: Make sure to take time for yourself and engage in activities that bring you joy and relaxation.
Challenge negative thoughts: Try to identify negative thoughts that may be contributing to your depression and challenge them with positive affirmations or alternative perspectives.
Remember, it's important to be patient with yourself and seek help when you need it. Don't hesitate to reach out to a medical professional or a crisis helpline if you're feeling overwhelmed or in crisis."
Sounds like a mental health aid to me, at least for some queries, so this isn't totally outside the scope of ChatGPT.
rising_pho3nix t1_jd2ka1c wrote
Reply to comment by Leo_D517 in [Project] Machine Learning for Audio: A library for audio analysis, feature extraction, etc by Leo_D517
Yes, definitely. I'm currently exploring topics; once I start data processing I'll contact you.