Recent comments in /f/MachineLearning

MjrK t1_jdjqz9h wrote

> We emphasize that Alpaca is intended only for academic research and any commercial use is prohibited. There are three factors in this decision: First, Alpaca is based on LLaMA, which has a non-commercial license, so we necessarily inherit this decision. Second, the instruction data is based on OpenAI’s text-davinci-003, whose terms of use prohibit developing models that compete with OpenAI. Finally, we have not designed adequate safety measures, so Alpaca is not ready to be deployed for general use.

https://crfm.stanford.edu/2023/03/13/alpaca.html

22

simmol t1_jdjq815 wrote

I think for this to be truly effective, the LLM would need to take in a huge number of computer screen images in its training set, and I am not sure that was done for the pre-trained GPT-4 model. But once this is done for all of the computer screen layouts one can think of, it would probably be akin to a self-driving-car algorithm, where you navigate based on the images.

But this type of multi-modality would only be useful if you have a person actually sitting in front of the computer, working side-by-side with the AI, right? Because if you want to eliminate the human from the loop, I am not sure this is an efficient way of training the LLM: these kinds of computer screen images are what help a human navigate the computer, and they are not necessarily optimal for the LLM.

1

Colecoman1982 t1_jdjkgjy wrote

When you asked, did you clarify that you were asking about the training data versus the whole project? The final Alpaca project was built, in part, on top of Meta's LLaMA. Since LLaMA has a strictly non-commercial license, there is no way that Stanford can ever release their final project for commercial use (as they've already stated in their initial release of the project). On the other hand, any training data they've created on their own (without needing any code from LLaMA) should be within their power to re-license. If they think you are asking for the whole project to be re-licensed, they are likely to just ignore your request.

23

TikiTDO t1_jdjibnv wrote

My point was that you could pass all the information contained in an embedding as a text prompt into a model, rather than using it directly as an input vector, and an LLM could probably figure out how to use it even if the way you chose to deliver those embeddings was doing a numpy.savetxt and then sending the resulting string in as a prompt. I also pointed out that you could, if you really wanted to, write a network to convert an embedding into some sort of semantically meaningful word soup that stores the same amount of information. It's basically a pointless bit of trivia that illustrates a fun idea.
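To make that concrete, here's a rough sketch of the round trip I'm describing; the embedding here is just random numbers standing in for a real one, and the prompt wording is made up:

```python
import io
import numpy as np

# A made-up 8-dimensional embedding standing in for a real one
embedding = np.random.rand(8).astype(np.float32)

# Serialize the vector to plain text with numpy.savetxt
buf = io.StringIO()
np.savetxt(buf, embedding[np.newaxis, :], fmt="%.6f")
embedding_as_text = buf.getvalue().strip()

# The text-encoded vector now goes into an ordinary prompt
prompt = (
    "The following numbers are an embedding of some document:\n"
    f"{embedding_as_text}\n"
    "Use them however you see fit."
)
print(prompt)
```

Obviously no model is trained to expect exactly that format, but the point stands: the information survives the trip through text.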

I'm not particularly interested in arguing whatever you think I want to argue. I made a pedantic aside that technically you can represent the same information in different formats, including representing an embedding as text, and that a transformer-based architecture would be able to find patterns in it all the same. I don't see anything to argue here; it's just a "you could also do it this way, isn't that neat." It's sort of the nature of a public forum: you made a post that made me think something, so I hit reply and wrote down my thoughts, nothing more.

2

Maleficent_Refuse_11 t1_jdji46u wrote

How do you quantify that? Is it the downdoots? Or is it the degree to which I'm willing to waste my time discussing shallow inputs? Or do you go by gut feeling?

On a serious note: have you heard of Brandolini's law? The asymmetry described there has been shifted by several orders of magnitude with generative "AI". Unless we are going to start using the same models to argue with people (e.g. chatbot output) on the net, we will have to choose much more carefully which discussions we involve ourselves in, don't you think?

−5

Prometheushunter2 t1_jdjhln9 wrote

Here’s an oddly specific question: a few years ago I read about a neural network that could both classify an image and, if run in reverse, generate synthetic examples of the classes it had learned. The problem is I’ve forgotten the name and it’s been haunting me lately, so I ask: does anyone know what kind of neural network this might be?

1