Recent comments in /f/MachineLearning
sleeplessinseattle00 t1_jdi6hot wrote
Reply to comment by paulgavrikov in [D] ICML 2023 Reviewer-Author Discussion by zy415
We've written a gentle note, but got no reply to it. We literally ran 50 new experiments as suggested
to4life4 OP t1_jdi6cqx wrote
Reply to comment by Latter-Personality-6 in [D] What is the best open source chatbot AI to do transfer learning on? by to4life4
How does performance compare to SOTA?
Disastrous_Elk_6375 t1_jdi6779 wrote
Reply to comment by dankaiv in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
nanoSingularity goes brrrrr
light24bulbs t1_jdi5qau wrote
Reply to comment by TFenrir in [N] ChatGPT plugins by Singularian2501
That's what I'm doing: using 3.5 to take big documents and search them for answers, and then 4 to do the overall reasoning.
It's very possible. You can have GPT-4 writing prompts for GPT-3.5, telling it what to do.
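A minimal sketch of that orchestration pattern, assuming the openai Python package's ChatCompletion interface; the prompts and helper names here are made up for illustration:

```python
# Hypothetical sketch of GPT-4 orchestrating GPT-3.5, as described above.
# Assumes the openai package; model names and prompts are illustrative.
import openai

def chat(model: str, prompt: str) -> str:
    """Send a single-turn prompt to the given model and return its reply."""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]

def answer_from_documents(question: str, documents: list[str]) -> str:
    # GPT-4 acts as the planner: it writes the search prompt for GPT-3.5.
    search_prompt = chat(
        "gpt-4",
        f"Write a prompt instructing a weaker model to extract everything "
        f"relevant to the question '{question}' from a document.",
    )
    # GPT-3.5 does the cheap, per-document extraction work.
    extracts = [chat("gpt-3.5-turbo", f"{search_prompt}\n\nDocument:\n{doc}")
                for doc in documents]
    # GPT-4 does the overall reasoning over the extracted snippets.
    return chat(
        "gpt-4",
        f"Question: {question}\n\nEvidence:\n" + "\n---\n".join(extracts),
    )
```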
liyanjia92 OP t1_jdi3dvw wrote
Reply to comment by KingsmanVince in [P] ChatGPT with GPT-2: A minimum example of aligning language models with RLHF similar to ChatGPT by liyanjia92
Thanks!
mycall t1_jdi3cko wrote
Reply to [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Can it detect objects in the photo? Maybe drive an RC car with it? :)
[deleted] t1_jdi2a3y wrote
entitledypipo t1_jdi1p2x wrote
JimiSlew3 t1_jdi0izc wrote
Reply to comment by LeN3rd in [D] Simple Questions Thread by AutoModerator
Thanks. I'm curious about what happens once we get it to do things, like telling it to analyze a giant dataset and produce a visual of the interesting stuff. Some tools I use will offer suggestions, and I'm thinking the link between asking a question and getting information will be significantly shortened. I wanted to know if anyone had done that yet.
pmirallesr t1_jdi0eqg wrote
Reply to comment by anothererrta in [D] "Sparks of Artificial General Intelligence: Early experiments with GPT-4" contained unredacted comments by QQII
With these people, it's interesting to ask: how do we know human intellect is not emergent behaviour of a simple task? That would correspond to a radical view of predictive coding. I'm no expert in neuroscience, but to me, the idea that AGI cannot arise from a single simple task makes less and less sense as time goes by.
StellaAthena t1_jdi094w wrote
Reply to comment by ILOVETOCONBANDITS in [D] ICML 2023 Reviewer-Author Discussion by zy415
I just posted in response to each reviewer:
> Thank you for taking the time to review our work. We have carefully considered your comments and have provided a thorough rebuttal addressing your concerns. If you feel that your comments have been adequately addressed, we would greatly appreciate it if you could update your score to reflect that. We are also more than happy to continue this conversation over the next few days until the March 26th deadline.
I submitted several papers, all of which got borderline scores (average between 4.3 and 5.3), though one got 7 / 7 / 2 (yikes!). I had been hopeful that a strong rebuttal could nudge one of them over the line, but the longer it goes without any response or updates the more discouraged I get.
zy415 OP t1_jdhzcrf wrote
Reply to comment by ILOVETOCONBANDITS in [D] ICML 2023 Reviewer-Author Discussion by zy415
Definitely worth reminding the reviewers to respond.
nicku_a OP t1_jdhyy1d wrote
Reply to comment by boyetosekuji in [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up by nicku_a
hahaha I love it
nicku_a OP t1_jdhyqr6 wrote
Reply to comment by Peantoo in [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up by nicku_a
You can help! Please join the discord and get involved, we’d love to have you
boyetosekuji t1_jdhyeok wrote
Reply to comment by nicku_a in [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up by nicku_a
ChatGPT: Okay, let me try to explain this using gaming terminology!
Imagine you're playing a game where you have to learn how to do something new, like defeat a tough boss. You have different settings or options (hyperparameters) to choose from, like which weapons or abilities to use, how aggressive or defensive to play, etc.
Now, imagine that this boss is really tough to beat and you don't have many chances to practice. So, you want to find the best combination of options as quickly as possible, without wasting too much time on trial and error. This is where hyperparameter optimization (HPO) comes in.
HPO is like trying out different settings or options until you find the best ones for your playstyle and the boss's behavior. However, in some games (like Dark Souls), it's harder to do this because you don't have many chances to try out different combinations before you die and have to start over. This is similar to reinforcement learning (RL), which is a type of machine learning that learns by trial and error, but it's not very sample efficient.
AgileRL is like having a bunch of other players (agents) who are also trying to defeat the same boss as you. After a while, the best players (agents) are chosen to continue playing, and their "offspring" (new combinations of settings or options) are mutated and tested to see if they work better. This keeps going until the best possible combination of settings or options is found to beat the boss in the fewest possible attempts. Because it's like having a lot of other players helping you find the best strategy for defeating the boss, AgileRL is much faster than other ways of doing HPO for RL.
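For the curious, here's a toy sketch of the population-based loop ChatGPT is describing (not the actual AgileRL code; `evaluate()` is a stand-in for a full RL training run, and the hyperparameters and mutation rules are made up):

```python
# Toy sketch of evolutionary HPO; not the AgileRL implementation.
# evaluate() stands in for training an RL agent with the given
# hyperparameters and returning its score.
import random

def evaluate(config: dict) -> float:
    # Placeholder fitness: in practice this would be an RL training run.
    return -(config["lr"] - 0.001) ** 2 - (config["gamma"] - 0.99) ** 2

def mutate(config: dict) -> dict:
    child = dict(config)
    child["lr"] *= random.uniform(0.5, 2.0)  # jitter the learning rate
    child["gamma"] = min(0.9999, child["gamma"] * random.uniform(0.98, 1.02))
    return child

# Start with a random population of "players" (hyperparameter configs).
population = [{"lr": random.uniform(1e-4, 1e-2),
               "gamma": random.uniform(0.9, 0.999)}
              for _ in range(8)]

for generation in range(20):
    # The best players survive...
    population.sort(key=evaluate, reverse=True)
    survivors = population[: len(population) // 2]
    # ...and the population is refilled with their mutated offspring.
    population = survivors + [mutate(random.choice(survivors)) for _ in survivors]

print("best config:", max(population, key=evaluate))
```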
bernaferrari t1_jdhy0qz wrote
Reply to comment by RedditLovingSun in [N] ChatGPT plugins by Singularian2501
Good news is, the deep learning APIs are decoupled from Android, so Google can just update them via the Play Store (as long as the device's GPU supports it).
__ingeniare__ t1_jdhxcds wrote
Reply to comment by BinarySplit in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
I would think image segmentation for UI to identify clickable elements and the like is a very solvable task
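Even a classical non-ML baseline gets you part of the way. A rough sketch with OpenCV, where the file name and size filters are purely illustrative:

```python
# Rough sketch of a classical baseline for finding clickable UI elements
# in a screenshot; buttons tend to be compact rectangles with clean edges.
import cv2

screenshot = cv2.imread("screenshot.png")
gray = cv2.cvtColor(screenshot, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)  # edge map of the UI
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

candidates = []
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    # Keep roughly button-sized rectangles; a real system would use a
    # trained detector/segmenter instead of these hand-tuned filters.
    if 20 < w < 400 and 10 < h < 100:
        candidates.append((x, y, w, h))

print(f"{len(candidates)} candidate clickable regions")
```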
trnka t1_jdhvzy3 wrote
Reply to comment by lightyagami03 in [D] Simple Questions Thread by AutoModerator
Eh, we've gone through a lot of hype cycles before and the field still exists. For example, deep learning was hyped to replace all feature engineering for all problems and then NLP would be trivialized. In practice, that was overhyped and you still need to understand NLP to get value out of deep learning for NLP. And in practice, there's still quite a bit of feature engineering (and practices like it).
I think LLMs will turn out to be similar. They'll change the way we approach many problems, but you'll still need to understand both LLMs and more problem-specific aspects of ML.
Back to your question, if you enjoy AI/ML and you're worried about jobs in a few years, I think it's still worth pursuing your interests.
If anything, the bigger challenge in jobs in the next year or two is the current job market.
ThirdMover t1_jdhvx8i wrote
Reply to comment by BinarySplit in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
> GPT-4 is potentially missing a vital feature to take this one step further: Visual Grounding - the ability to say where inside an image a specific element is, e.g. if the model wants to click a button, what X,Y position on the screen does that translate to?
You could just ask it to move a cursor around until it's on the specified element. I'd be shocked if GPT-4 couldn't do that.
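Something like this hypothetical feedback loop, where `ask_model()` is just a placeholder since there's no public image API to call yet:

```python
# Hypothetical closed-loop cursor positioning: repeatedly show the model
# a screenshot and ask which way to nudge the cursor.
import pyautogui

def ask_model(screenshot, target: str) -> str:
    """Assumed to return 'left', 'right', 'up', 'down', or 'done'."""
    raise NotImplementedError  # placeholder for a future image-capable API

STEP = 25  # pixels per nudge
MOVES = {"left": (-STEP, 0), "right": (STEP, 0),
         "up": (0, -STEP), "down": (0, STEP)}

target = "the Submit button"
for _ in range(100):  # bail out after 100 nudges
    screenshot = pyautogui.screenshot()
    direction = ask_model(screenshot, target)
    if direction == "done":
        pyautogui.click()  # cursor is on the element
        break
    dx, dy = MOVES[direction]
    pyautogui.moveRel(dx, dy)
```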
Peantoo t1_jdhuqsq wrote
Reply to comment by nicku_a in [P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up by nicku_a
Love it. I tried to come up with something like this myself but never found the time or extra help I'd need to implement it. Glad to see someone has done all the hard work!
loopuleasa t1_jdhuit0 wrote
Reply to comment by farmingvillein in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Talking about consumer access to the image API is tricky, as the system is already swamped with text requests.
They mentioned an image takes 30 seconds for the model to "comprehend"...
farmingvillein t1_jdhua51 wrote
Reply to comment by loopuleasa in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Hmm, what do you mean by "publicly"? OpenAI has publicly stated that GPT-4 is multi-modal, and that they simply haven't exposed the image API yet.
The image API isn't publicly available yet, but it is clearly coming.
race2tb t1_jdhtzzm wrote
Reply to comment by WarmSignificance1 in [N] ChatGPT plugins by Singularian2501
In the version of the future I'm talking about, information and functionality can cross-talk between services. Keeping things moated inside separate apps is going to seem unappealing. I'd rather have all my digital-life stuff in one place and be able to run whatever function I want on it, with microservices handling that in the background. I could even create a function that doesn't exist yet just by describing it in natural language.
There is nothing special about most of these interfaces either; I could just show it a picture of an interface and it would match it. I could draw it on a napkin if I wanted =).
Soc13In t1_jdhtpvf wrote
Reply to comment by godaspeg in [N] ChatGPT plugins by Singularian2501
thank you.
to4life4 OP t1_jdi6ntl wrote
Reply to comment by NoBoysenberry9711 in [D] What is the best open source chatbot AI to do transfer learning on? by to4life4
Yeah, that's sort of the idea: to focus attention (heh) on a specific input, after starting from a general pretrained chatbot.