Recent comments in /f/MachineLearning

WarmSignificance1 t1_jdhrkof wrote

Reply to comment by race2tb in [N] ChatGPT plugins by Singularian2501

Well now you’re conflating two different things. A unified experience is always good. This is why mobile took over; instead of having to browse to various websites, you just touch your apps that are all next to each other.

Natural language seems highly inefficient for lots of things. I don’t want to type to my bank. I want to open up an app/website and click a button to make a transfer.

3

endless_sea_of_stars t1_jdhrar6 wrote

Reply to comment by Izzhov in [N] ChatGPT plugins by Singularian2501

Sort of. The default retrieval plug-in is more of a database lookup. It converts a question into an embedding vector (via the Ada API) and uses that to query a self-hosted vector database. The base version is more for question/answer scenarios.
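Roughly, the pattern looks something like this (just an illustrative sketch, not the actual plug-in code; the in-memory "database" here stands in for a real vector store, and it uses the pre-1.0 openai-python interface):

```python
import numpy as np
import openai  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    # Ada embedding model, as mentioned above
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

# Pretend these chunks were ingested into the "database" ahead of time.
docs = ["Paris is the capital of France.",
        "The Eiffel Tower was completed in 1889."]
doc_vecs = np.stack([embed(d) for d in docs])

def query(question: str, k: int = 1) -> list[str]:
    q = embed(question)
    # Cosine similarity between the question and every stored chunk.
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(-sims)[:k]]

print(query("When was the Eiffel Tower finished?"))
```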

That being said, I'm sure someone is already working on a novel-generation plug-in that would be more tailored to your use case.

1

nicku_a OP t1_jdhra2m wrote

Sure! Traditionally, hyperparameter optimization (HPO) for reinforcement learning (RL) is particularly difficult compared to other types of machine learning. This is for several reasons, including the relative sample inefficiency of RL and its sensitivity to hyperparameters.

AgileRL is initially focused on improving HPO for RL in order to allow faster development with robust training. Evolutionary algorithms have been shown to converge to optimal hyperparameters faster and more automatically than other HPO methods by taking advantage of shared memory between a population of agents acting in identical environments.

At regular intervals, after learning from shared experiences, the population of agents can be evaluated in an environment. Through tournament selection, the best agents are selected to survive until the next generation, and their offspring are mutated to further explore the hyperparameter space. Eventually, the optimal hyperparameters for learning in a given environment can be reached in significantly fewer steps than other HPO methods require.
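Conceptually, the loop looks something like this (a toy sketch of the idea, not the AgileRL API; the evaluation function and mutation choices are made up purely for illustration):

```python
import copy
import random

# One "agent" = a set of hyperparameters plus its latest evaluation score.
population = [{"lr": 10 ** random.uniform(-5, -2), "batch_size": 64, "score": 0.0}
              for _ in range(8)]

def evaluate(agent):
    # Stand-in for evaluating the agent in the real environment after it has
    # learned from the shared experiences; pretend lr=3e-4 is optimal.
    return -abs(agent["lr"] - 3e-4)

def tournament(pop, k=3):
    # Tournament selection: the best of k randomly drawn agents wins.
    return max(random.sample(pop, k), key=lambda a: a["score"])

def mutate(agent):
    child = copy.deepcopy(agent)
    child["lr"] *= random.choice([0.8, 1.25])  # perturb learning rate
    child["batch_size"] = max(16, int(child["batch_size"] * random.choice([0.5, 1, 2])))
    return child

for generation in range(20):
    for agent in population:
        agent["score"] = evaluate(agent)
    elite = max(population, key=lambda a: a["score"])
    # The elite survives unchanged; the rest are mutated offspring of tournament winners.
    population = [elite] + [mutate(tournament(population)) for _ in range(len(population) - 1)]

print("best hyperparameters found:", elite)
```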

26

TFenrir t1_jdhqnb3 wrote

Are you working with the GPT-4 API yet? I'm still working with 3.5-turbo so it isn't toooo crazy during dev, but I'm about to write a new custom agent that will be my first attempt at a few different improvements to my previous implementations - one of them being to use different models for different parts of the chain, conditionally. E.g., I want to experiment with using 3.5 for some mundane internal scratchpad work, but switch to 4 if the agent's confidence in success is low - that sort of thing.
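Something like this is what I have in mind (toy sketch only; the self-reported confidence score and threshold are my own made-up heuristic, and it uses the pre-1.0 openai-python interface):

```python
import json
import openai  # assumes OPENAI_API_KEY is set in the environment

def run_step(task: str, model: str) -> dict:
    resp = openai.ChatCompletion.create(
        model=model,
        messages=[
            {"role": "system",
             "content": 'Do the task. Reply ONLY as JSON: {"answer": "...", "confidence": 0.0-1.0}'},
            {"role": "user", "content": task},
        ],
    )
    # Assumes the model actually returns valid JSON, which is not guaranteed.
    return json.loads(resp["choices"][0]["message"]["content"])

def routed_step(task: str, threshold: float = 0.7) -> dict:
    draft = run_step(task, "gpt-3.5-turbo")   # cheap pass for mundane scratchpad work
    if draft.get("confidence", 0.0) >= threshold:
        return draft
    return run_step(task, "gpt-4")            # escalate the low-confidence steps

print(routed_step("List the intermediate steps needed to plan a 3-leg trip."))
```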

I'm hoping I can have some success, but at the very least the pain will be educational.

1

race2tb t1_jdhq01x wrote

It's not up to the business, it's up to the user. Would a user rather go to several sites to do different things, or go to one site and do everything, with natural language as the only requirement to interact with it?

3

MrEloi t1_jdhovux wrote

Well, I have asked it to design/invent three DIY tools that I needed.

One was stupid: it needed a microprocessor etc.

Another was unique - but too close to existing products to offer benefit.

However, one was novel and probably marketable - and took only 20 minutes in a chat with GPT-4 to finalize the design.

Just think how life will be when this catches on: everyone with an imagination and access to a 3D printer will be building all sorts of weird things.

5

agent_zoso t1_jdhovl9 wrote

Furthermore, if we are to assume that an LLM can be boiled down to nothing more than a statistical word-probability engine because that's what its goal is (which is dubious for the same reason we don't think of people with jobs as being defined only as pay-raise probability engines - what if a client asks a salesman important questions unrelated to the salesman's goal, etc.), this point of view is self-destructive and completely incoherent when you factor in that ChatGPT in particular is also trained using RLHF (Reinforcement Learning from Human Feedback).

Every time you leave a Like/Dislike (or take the time to write out longer feedback) on one of ChatGPT's messages, that feedback gets used directly to train the model through a never-ending process of (simulated) evolution through model competition with permutations of itself. So there are two things to note here: A. Its goals include not only maximizing log-likelihoods of word sequences but also inferring new goals from whatever vague feedback you've provided it, and B. How can anyone be so sure that such a system couldn't develop sophisticated complexity like sentience or consciousness, as humans did through evolution (especially when such a system is capable of creating its own goals/heuristics and we aren't sure how many layers of abstraction it's recursively doing so with)?
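To make the first point concrete, the feedback loop conceptually amounts to training a reward model on (liked, disliked) response pairs and then using it to steer RL fine-tuning. Here's a toy sketch of just the reward-model half (purely illustrative, nothing like OpenAI's actual pipeline; the embeddings are fake placeholders):

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, dim: int = 768):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)   # in reality this sits on top of a full transformer

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        return self.scorer(response_embedding).squeeze(-1)

rm = RewardModel()
opt = torch.optim.Adam(rm.parameters(), lr=1e-4)

# Fake embeddings standing in for (liked, disliked) response pairs from users.
chosen, rejected = torch.randn(256, 768), torch.randn(256, 768)

for step in range(100):
    # Bradley-Terry style loss: push the reward of liked responses above disliked ones.
    loss = -nn.functional.logsigmoid(rm(chosen) - rm(rejected)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# The trained reward model would then score the policy's outputs during RL (PPO) fine-tuning.
```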

On that second point in particular, we just don't currently have the philosophical tools to make any sort of statement about it, but people are sticking to the hard-and-fast, black-and-white claims of the kind we made even about other humans until recent history. We as humans love to have hard answers about other minds, so I see the motivation for wanting to tamp down the tendency to infer emotion from ChatGPT's responses, but this camp has swung fully in the other direction with unscientific and self-inconsistent arguments because they've read a BuzzFeed or Verge article produced by people with skin in the game (long/short MSFT - it's in everyone's retirement account, too).

I think the best reply in general to someone taking the paperclip-maximizer stance while claiming to know better than everyone else the intricacies of an LLM's latent representations of concepts - encoded through the linear-algebraic matrix multiplications in the V space, the eigenvector (Q, K) embeddings from PCA or BERT-like systems, or embedded in its separate neuromorphic structure ("it's just autocorrect, bro") - is to draw the same analogy back at them: they're just a human meat-puppet designed to maximize dopamine and therefore merely a mechanical automaton enslaved to biological impulses. Obviously this reductionism is in general a fallacious way of rationalizing things (something we "forget" time and again throughout history because this time it's different), but you also can't counter by outright stating that ChatGPT is sentient/conscious/whatever; we don't know for sure whether that's even possible (cf. the Chinese Room - against; David Chalmers' Brain of Theseus - for; Penrose's contentious Gödelian construction demonstrating human supremacy as Turing-machine halt checkers - against).

8

Izzhov t1_jdhnapr wrote

> You can have it query your data without paying for fine-tuning.

Total noob here, so forgive me if this question is dumb or naive. I'm interested in pursuing collaborative fiction writing with AIs. Does what you're saying here imply that, in principle, I can sort of artificially increase ChatGPT's memory of whatever world I'm working with it to write about, by developing a plug-in that queries info about the story I've written, including character info, setting details, and previous chapters? If true, this would help the whole process immensely...

1

stimulatedecho t1_jdhlv4w wrote

>> nobody with a basic understanding of how transformers work should give room to this

I find this take to be incredibly naive. We know that incredible (and very likely fundamentally unpredictable) complexity can arise from simple computational rules. We have no idea how the gap is bridged from a neuron to the human mind, but here we are.

>> There is no element of critique and no element of creativity. There is no theory of mind, there is just a reproduction of what people said, when prompted regarding how other people feel.

Neither you, nor anybody else has any idea what is going on, and all the statements of certainty leave me shaking my head.

The only thing we know for certain is that the behavioral complexity of these models is starting to increase almost exponentially. We have no idea what the associated internal states may or may not represent.

60