Recent comments in /f/MachineLearning
dankwartrustow t1_jcj7bcp wrote
Ethics will eventually be the new Compliance. Eventually.
[deleted] t1_jcj3m0c wrote
Reply to comment by VelveteenAmbush in In your experience, are AI Ethics teams valuable/effective? [D] by namey-name-name
[removed]
yehiaserag t1_jcj305q wrote
Reply to [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM) by bo_peng
How does that version compare to "RWKV-4-Pile-14B-20230228-ctx4096-test663"?
rainnz t1_jcj2q3v wrote
Reply to comment by Odibbla in [D] Simple Questions Thread by AutoModerator
Thank you kind Redditor!
Odibbla t1_jcj24kc wrote
Reply to comment by rainnz in [D] Simple Questions Thread by AutoModerator
I did this when I was in the Robomaster AI challenge. My solution was to use YOLOv3, which should be enough for the task you're asking about. The flow is: you label the symbol yourself, train YOLO step by step (any version should work, actually; v3 is just my choice), then take in the video stream, and YOLO will output the exact location of that sign in the frames. I did it on a Jetson Nano and it was smooth. Since you have a degree, you should be fully capable of doing this. Good luck!
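The last step of that flow, going from YOLO's output to a location in the frame, can be sketched roughly like this. This is an illustrative helper only: it assumes the detector emits YOLO-style normalized `(cx, cy, w, h)` boxes, and the function name is made up.

```python
def yolo_to_pixels(box, frame_w, frame_h):
    """Convert a YOLO-style normalized box (cx, cy, w, h, each in [0, 1])
    into pixel corner coordinates (x1, y1, x2, y2) for the frame.
    round() is used instead of int() to avoid off-by-one errors from
    floating-point representation."""
    cx, cy, w, h = box
    x1 = round((cx - w / 2) * frame_w)
    y1 = round((cy - h / 2) * frame_h)
    x2 = round((cx + w / 2) * frame_w)
    y2 = round((cy + h / 2) * frame_h)
    return x1, y1, x2, y2

# A box centered in a 640x480 frame, 20% of the frame in each dimension:
print(yolo_to_pixels((0.5, 0.5, 0.2, 0.2), 640, 480))  # (256, 192, 384, 288)
```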
Oswald_Hydrabot t1_jcizf9y wrote
Reply to comment by crazymonezyy in [P] nanoT5 - Inspired by Jonas Geiping's Cramming and Andrej Karpathy's nanoGPT, we fill the gap of a repository for pre-training T5-style "LLMs" under a limited budget in PyTorch by korec1234
"We will use it only when nothing else can solve the problem", I believe is your answer.
There are solutions that cost less than GPT-4, and they don't require integrating a black box gatekept by a single provider. There is a significant amount of risk in taking a product like GPT-4 on as a dependency.
[deleted] OP t1_jciywm5 wrote
ilrazziatore t1_jciyif4 wrote
Reply to comment by LeN3rd in [D] Simple Questions Thread by AutoModerator
Eh, data are scarce. I have only this dataset (it's composed of astrophysical measurements; I cannot ask them to produce more data).
crazymonezyy t1_jcixab6 wrote
Reply to comment by Oswald_Hydrabot in [P] nanoT5 - Inspired by Jonas Geiping's Cramming and Andrej Karpathy's nanoGPT, we fill the gap of a repository for pre-training T5-style "LLMs" under a limited budget in PyTorch by korec1234
That's an argument much more easily put forth philosophically than to a business head.
Because there's no valid answer to the follow-up questions "what if we do?" and "but what if our competition offers it?".
RobbinDeBank t1_jciwzek wrote
Reply to comment by currentscurrents in [P] nanoT5 - Inspired by Jonas Geiping's Cramming and Andrej Karpathy's nanoGPT, we fill the gap of a repository for pre-training T5-style "LLMs" under a limited budget in PyTorch by korec1234
New way of measuring age
LeN3rd t1_jcitswg wrote
Reply to comment by Batteredcode in [D] Simple Questions Thread by AutoModerator
The problem with your VAE idea is that you cannot apply the usual loss function (the difference between the input and the output), and thus a lot of nice theoretical guarantees go out the window, AFAIK.
https://jaan.io/what-is-variational-autoencoder-vae-tutorial/
I would start with a cycleGAN:
https://machinelearningmastery.com/what-is-cyclegan/
It's a little older, but I personally know it a bit better than diffusion methods.
With the free-to-use Stable Diffusion model you could conditionally inpaint on your image, though you would have to describe what is in that image in text. You could also train your own diffusion model, though that needs a lot of training time. Not necessarily more than a GAN, but still.
It works by adding noise to an image and then denoising it again and again. For inpainting you just do that for the regions you want to inpaint (your R and G channels), and for the regions you want to stay the same as your original image, you just take the noise that you already know.
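That inpainting loop can be sketched in a toy form. This is purely an illustration of the idea, not a real diffusion model: the "denoiser" here is a made-up stand-in, and the function name is hypothetical. The key trick is that on each step the known region is reset from the (re-noised) original, while only the masked region evolves freely.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_inpaint(image, mask, steps=50):
    """Toy RePaint-style inpainting loop. `mask` is 1 where pixels are
    unknown (to be inpainted), 0 where the original image is kept."""
    x = rng.normal(size=image.shape)  # start from pure noise
    for t in range(steps, 0, -1):
        noise_level = (t - 1) / steps  # reaches 0 on the final step
        # Stand-in for a learned denoiser acting on the unknown region:
        denoised = 0.9 * x + 0.1 * image.mean()
        # Re-noise the ORIGINAL image's known region to the current noise
        # level -- this is "taking the noise you already know":
        known = image + noise_level * rng.normal(size=image.shape)
        # Stitch: unknown region from the denoiser, known region from the
        # re-noised original.
        x = mask * denoised + (1 - mask) * known
    return x
```

On the final step the noise level is zero, so the known region ends up exactly equal to the original image, while the masked region has been "denoised" into place.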
LeN3rd t1_jcislrk wrote
Reply to comment by DreamMidnight in [D] Simple Questions Thread by AutoModerator
I have not heard this before. Where is it from? I know that you should have more datapoints than parameters in classical models.
[deleted] OP t1_jcio9zw wrote
Reply to comment by Exarctus in [N] PyTorch 2.0: Our next generation release that is faster, more Pythonic and Dynamic as ever by [deleted]
[removed]
nat_friedman OP t1_jcijnjx wrote
Reply to comment by supreme_harmony in [N] A $250k contest to read ancient Roman papyrus scrolls with ML by nat_friedman
Well you definitely won't solve it with that attitude!
keepthepace t1_jcijjq2 wrote
Reply to comment by Hydreigon92 in In your experience, are AI Ethics teams valuable/effective? [D] by namey-name-name
> fairness metrics
Do you produce some that are differentiable ? It could be interesting to add them to a loss function
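One common candidate is a demographic-parity gap on soft predictions, which is differentiable because it is just a difference of means. A minimal sketch, written in NumPy for illustration (the function name is made up; on framework tensors, e.g. in torch, the same expression is autograd-friendly and can be added to the loss as `loss + lam * gap`):

```python
import numpy as np

def demographic_parity_gap(probs, group):
    """Absolute difference in mean predicted positive probability between
    two groups (group labels 0 and 1). Computed on soft probabilities, so
    the same expression is differentiable in an autograd framework."""
    probs = np.asarray(probs, dtype=float)
    group = np.asarray(group)
    return abs(probs[group == 0].mean() - probs[group == 1].mean())

# Example: a model that favors group 0 heavily has a large gap.
print(demographic_parity_gap([0.9, 0.9, 0.1, 0.1], [0, 0, 1, 1]))  # 0.8
```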
keepthepace t1_jcij20g wrote
> Do they actually add useful insight, or do they serve more as a PR thing?
The ones we hear about most are pure PR.
> Is there anything you think AI ethics as a field can do to be more useful and to get more change?
Yes. Work on AI alignment. It is a broader problem than just ethics: it is also about having models generate truthful and grounded answers. I am extremely doubtful of the current trend of using RLHF for it; we need other approaches. But this is real ML development work, not just PR production. That would be an extremely useful way to steer ethical-AI efforts.
Simusid OP t1_jciguq5 wrote
Reply to comment by deliciously_methodic in [Discussion] Compare OpenAI and SentenceTransformer Sentence Embeddings by Simusid
"words to numbers" is the secret sauce of all the models including the new GPT-4. Individual words are tokenized (sometimes into "word pieces") and a mapping from the tokens to numbers via a vocabulary is made. Then the model is trained on pairs of sentences A and B. Sometimes the model is shown a pair where B correctly follows A, and sometimes not. Eventually the model learns to predict what is most likely to come next.
"he went to the bank", "he made a deposit"
B probably follows A
"he went to the bank", "he bought a duck"
Does not.
That is one type of training to learn valid/invalid text. Another is "leave one out" training. In this case the input is a full sentence minus one word (typically).
"he went to the convenience store and bought a gallon of _____"
and the model should learn that the most common answer will probably be "milk"
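The "words to numbers" step can be sketched in a toy form. This is a deliberately simplified illustration: real models use subword tokenizers (BPE / WordPiece) rather than whole-word splitting, and the variable names here are made up.

```python
# Build a vocabulary that maps each token to an integer id, in order of
# first appearance, then encode sentences as lists of ids.
sentences = ["he went to the bank", "he made a deposit"]

vocab = {}
for s in sentences:
    for w in s.split():
        vocab.setdefault(w, len(vocab))  # assign the next free id

def encode(sentence):
    """Map a sentence to its list of vocabulary ids."""
    return [vocab[w] for w in sentence.split()]

print(encode("he went to the bank"))  # [0, 1, 2, 3, 4]
print(encode("he made a deposit"))    # [0, 5, 6, 7]
```

These id sequences are what the model actually consumes during both kinds of training described above.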
Back to your first question. In 3D, your first two embeddings should be closer together because they are similar, and they should both be "far" from the third embedding.
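Concretely, that closeness is usually measured with cosine similarity. Here is a sketch with made-up 3D vectors, purely to illustrate the geometry (a real model's embeddings have hundreds of dimensions and are learned, not hand-picked):

```python
import numpy as np

# Hypothetical 3D embeddings -- the numbers are invented for illustration.
emb = {
    "Bank deposit":    np.array([0.9, 0.1, 0.0]),
    "Bank withdrawal": np.array([0.8, 0.2, 0.1]),
    "River bank":      np.array([0.1, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, near 0 for
    unrelated ones."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The two financial sentences sit close together; "River bank" is far off.
print(cosine(emb["Bank deposit"], emb["Bank withdrawal"]))
print(cosine(emb["Bank deposit"], emb["River bank"]))
```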
deliciously_methodic t1_jcifdxa wrote
Reply to comment by Simusid in [Discussion] Compare OpenAI and SentenceTransformer Sentence Embeddings by Simusid
Thanks, very informative. Can we dumb this down further? What would a 3-dimensional embedding table look like for the following sentences? And how do we go from words to numbers; what is the algorithm?
- Bank deposit.
- Bank withdrawal.
- River bank.
supreme_harmony t1_jcieyt3 wrote
There was a recent attempt at reading hieroglyphs from temple walls in Egypt using ML, but that failed spectacularly.
Despite having tons of high-quality training data available, and being announced with much fanfare and ample funding in 2018, it has since been completely pulled and even its website has been erased.
I am struggling to find any results apart from some of the initial marketing material:
https://www.psycle.com/casestudy/hieroglyphics-initiative
https://www.youtube.com/watch?v=TfdWNY7priQ
I have briefly interacted with some of the people involved, and the consensus was that it's not realistically doable.
Therefore, although I do not doubt the good intention behind this prize, I am quite sceptical that any results will come of it, as a seemingly simpler project with more resources failed to deliver.
TooManyDangPeople t1_jcibdsy wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
There are so many high paying jobs for ML researchers, just don't work for them. Don't support their company in any way and support the competition.
I_draw_boxes t1_jcia41b wrote
Reply to comment by ggf31416 in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
A fix for the Nvidia driver is forthcoming for the P2P-related issue with PyTorch DDP training. The 3090 didn't support P2P either, and the bug fix won't enable P2P on the 4090, but it will correct the issue, and training should be much faster once it's fixed.
dataclinician t1_jcia118 wrote
Reply to comment by NamerNotLiteral in [N] A $250k contest to read ancient Roman papyrus scrolls with ML by nat_friedman
Lmao
AmandaBines t1_jci9bne wrote
bro just open the scroll bam two fiddy grand plz
(lol jk this is really cool i think i remember being in high school watching a doc about this and at the time they had like hardly any data)
londons_explorer t1_jcj8p9y wrote
Reply to [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM) by bo_peng
Can we run things like this through github.com/OpenAI/evals?
They now have a few hundred tests, which are a good way to gauge performance.