Recent comments in /f/MachineLearning
[deleted] t1_je28hhn wrote
[deleted] t1_je26wrk wrote
Reply to comment by andreichiffa in [D] Small language model suitable for personal-scale pre-training research? by kkimdev
[removed]
[deleted] t1_je23vry wrote
geekfolk t1_je23p7c wrote
How is it better than GANs though? or in other words, what's so bad about adversarial training? modern GANs (with zero centered gradient penalties) are pretty easy to train.
shayanrc t1_je23b8s wrote
Reply to [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun
The best uses I've come up with for the api:
- annotating large text data sets
- checking the work of human labelers
- using it to generate the rest API that hasn't been built yet
LanchestersLaw t1_je22xzv wrote
Reply to [D] FOMO on the rapid pace of LLMs by 00001746
Something ive seen a lot of on reddit which you can get a slice of is now that GPT is out, “let me build an app that has GPT do this thing automatically” with varying degrees of success from dating bot to medical diagnosis tools
TeH_Venom t1_je21d7u wrote
Not quite cross model architecture, but it's not impossible to merge different fine tunes of a model into one.
I personally have a few scripts for a few strategies such as
- Average merge;
- Diff merge;
- Block merging. (link)
I haven't tested diff merging or block merges too much (me and a friend finished adapting SD's block merge to LMs last week) but weighted average merges are a pretty safe way of mixing models.
[deleted] t1_je1zkkw wrote
Reply to comment by JigglyWiener in [D] With ML tools progressing so fast, what are some ways you've taken advantage of them personally? by RedditLovingSun
[removed]
Thorusss t1_je1z0ib wrote
Reply to comment by jrkirby in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
>Then they spent 20K$+ compute on training.
Your estimate is a few magnitudes too low
VarietyElderberry t1_je1wlnn wrote
Reply to comment by currentscurrents in "[D]" Is wandb.ai worth using? by frodo_mavinchotil
Or MLFlow.
[deleted] t1_je1wani wrote
shitasspetfuckers t1_je1v7pf wrote
Reply to comment by reditum in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Can you please clarify what specifically about their approach wasn't great?
suflaj t1_je1uvo8 wrote
They probably redid the experiments themselves. Also, ResNets had some changes shortly after release I believe, and they could have used different pretraining weights. AFAIK He et al. never released their weights.
Furthermore, Wolfram and PyTorch pretrained weights are also around 22% top-1 error rate, so that is probably the correct error rate. Since PyTorch provides weights that reach 18% top-1 error rate with some small adjustments to the training procedure, it is possible the authors got lucky with the hyperparameters, or employed some techniques they didn't describe in the paper.
gunbladezero t1_je1u5ev wrote
Reply to comment by sebzim4500 in [P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823). by evanthebouncy
Very interesting, thank you! I hadn't thought of that- it has to translate it for every token, you say, not just every answer? I wonder if it would work better or worse asking it to encode it in arabic, or chinese etc. Of course, it would be simple to script something to hide the answer from the player without revealing it. I do know that if it doesn't store the answer, it will completely invent one every with each question...
edit: It does work better with plaintext. Not sure if I would have guessed her but it answered the questions correctly this time.
ninjawick t1_je1tz72 wrote
Reply to comment by Beautiful-Gur-9456 in [P] Consistency: Diffusion in a Single Forward Pass 🚀 by Beautiful-Gur-9456
That's great. Really like to see speed and prompt difference at standard 20 steps
thelastpizzaslice t1_je1pphc wrote
Reply to comment by nixed9 in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I asked it to write another one from Darth Maul's perspective after that and it did a ducking amazing job.
WarmSignificance1 t1_je1pdz9 wrote
Reply to comment by cegras in [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
I think that ChatGPT has shown how bad so many people are at Googling. And granted, sometimes ChatGPT is just far superior.
But when people say things like "I can ask it how to use a library and it's made me 10x faster over using Google", it just blows my mind. I can usually find the official docs and figure out how to use a library in about the same time as ChatGPT can tell me, without the risk of errors.
mlresearchoor t1_je1mvf7 wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
OpenAI blatantly ignored the norm to not train on the ~200 tasks collaboratively prepared by the community for BIG-bench. GPT-4 knows the BIG-bench canary ID afaik, which removes the validity of GPT-4 eval on BIG-bench.
OpenAI is cool, but they genuinely don't care about academic research standards or benchmarks carefully created over years by other folks.
HonkyTonkPolicyWonk t1_je1mqdp wrote
Reply to [N] OpenAI may have benchmarked GPT-4’s coding ability on it’s own training data by Balance-
Well, yeah, ChatGTP is auto-suggest on steroids. It can’t create anything de novo. It reframes and regurgitates what others have done.
No surprises here
[deleted] t1_je28myp wrote
Reply to My ChatGPT Chrome Extension that saves conversations in .md files is finally approved by the Chrome Web Store. It's still and so will continue to be Open Source. [P] by ThePogromist
[removed]