Recent comments in /f/technology

poo2thegeek t1_j6i5xxj wrote

AI models take the art, and add it to their training inputs.

It doesn't have perfect memory of the inputs - this can be demonstrated by the fact that model sizes are significantly smaller than the size of the data used to train them. Similarly, 'own perception' is an interesting idea. What does it actually mean? I'd argue that in an ML model, it could mean utilising some random input when training, to allow for different outputs for the same input (e.g., how ChatGPT can reply differently even if you ask it the exact same thing on two different occasions).
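
A minimal sketch of how the same input can yield different outputs, assuming the randomness enters when the output is sampled (one common setup, not necessarily how any particular chatbot works). The vocabulary, the scores, and the `sample_token` helper are all made up for illustration.

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=random):
    """Sample an index from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical scores for three candidate replies to one fixed input.
vocab = ["yes", "no", "maybe"]
logits = [2.0, 1.5, 0.5]

# Same input, twenty independent draws: the reply can vary between draws.
rng = random.Random(0)
replies = [vocab[sample_token(logits, rng=rng)] for _ in range(20)]
print(replies)
```

Lowering `temperature` towards zero makes the highest-scoring reply win almost every time, which is why such a model can be made more or less repeatable.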

I'm not saying we should treat AI models as if they're human beings - I don't think an AI model should be able to hold a copyright, for example, but the company that's trained that model should be able to.

Similarly, if the AI model were to output something VERY similar to some existing work, then I think that the company that owns said AI model should be taken to court.

2

0ogaBooga t1_j6i5hne wrote

>Not sure what they can do since they're just the provider of telephony and messaging services,

Do their due diligence with KYC regulations in the US?

It's not hard to spot when a customer is making illegal calls. You have logs of every number they've dialed for billing purposes; cross-check those against the national DNC list, and if there is anything that overlaps, it's the customer's job to provide proof that the person they dialed agreed.
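
A rough sketch of that cross-check, assuming the billing logs are available as (customer, dialed number) pairs and the DNC list as a set of numbers. The function name, data layout, and sample numbers are hypothetical; a real provider's records would look different.

```python
def find_dnc_hits(call_log, dnc_numbers, consent_records):
    """Return (customer, number) pairs that hit the DNC list without recorded consent."""
    hits = []
    for customer, dialed in call_log:
        on_dnc = dialed in dnc_numbers
        has_consent = (customer, dialed) in consent_records
        if on_dnc and not has_consent:
            hits.append((customer, dialed))
    return hits

# All data below is made up for illustration.
call_log = [
    ("acme", "+15550100"),
    ("acme", "+15550101"),
    ("bolt", "+15550102"),
]
dnc_numbers = {"+15550100", "+15550102"}
consent_records = {("bolt", "+15550102")}  # proof the customer has already supplied

print(find_dnc_hits(call_log, dnc_numbers, consent_records))
# prints [('acme', '+15550100')]
```

Anything left in the result is what the customer would be asked to justify.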

See? Easy.

1

poo2thegeek t1_j6i59po wrote

So, while this is certainly true, for something to come under copyright it has to be pretty similar to whatever it's copying.

For example, if I want to write a book about wizards in the UK fighting some big bad guy, that doesn't mean I'm infringing on the copyright of Harry Potter.

Similarly, I can write a pop song that discusses, idk, how much I like girls with big asses, and that doesn't infringe on the copyright of the (hundreds) of songs on the same topic.

Now, I do think that if an AI model output something that was too similar to some of its training material, and the company that owned said AI went ahead and published it, then yeah, the company should be sued for copyright infringement.

But it is certainly possible for AI to output completely new things. Just look at the AI art that has been generated in recent months - it's certainly making new images based off what it's learnt a good image should look like.

Also, on top of all this, it's perfectly possible to ensure (or at least, massively decrease the probability of) the model not outputting something similar to its inputs, by 'punishing' the model if it ever outputs something too similar to training inputs.
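
One way such a 'punishment' could look, as a sketch only: add a penalty to the training loss whenever the output is too close to any training example. Cosine similarity on plain vectors, the `threshold` and `weight` values, and the `penalized_loss` helper are all assumptions made for illustration, not a description of how any existing model is trained.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def penalized_loss(task_loss, output, training_examples, threshold=0.95, weight=10.0):
    """Add a penalty when the output is closer than `threshold` to any training example."""
    closest = max(cosine_similarity(output, ex) for ex in training_examples)
    penalty = weight * max(0.0, closest - threshold)
    return task_loss + penalty

# Toy stand-ins for training images, represented as short feature vectors.
training_examples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

novel = penalized_loss(1.0, [1.0, 1.0, 1.0], training_examples)
copied = penalized_loss(1.0, [1.0, 0.0, 0.0], training_examples)
print(novel, copied)  # the exact copy ends up with the higher loss
```

Because the penalty is zero below the threshold, outputs that merely share a style or subject with the training data are left alone; only near-copies are pushed away.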

All this means that I don't think this issue is anywhere near as clear cut as a lot of the internet makes it out to be.

3

imnotknow t1_j6i52gz wrote

Wow, this is really triggering people. You would think we were talking about student loan forgiveness. There is a parallel there. Like suddenly, your expensive education is not so important or exclusive or special. Your fancy title is meaningless. The years of your life spent in college? Wasted.

"But it's not really AI it's machine language!" So what? The end result is the same.

"But it doesn't really know anything!" Again, so what?

"But it makes stuff up! It lies, it's wrong a lot!" SO WHAT? So is my doctor. It doesn't have to be perfect, just better and more consistent than a human.

1

Ronny_Jotten t1_j6i3uog wrote

I don't know what paper you're referring to, but there's this one:

Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

It clearly shows, at the top of the first page, the full Stable Diffusion model, trained on billions of LAION images, replicating images that are clearly "substantially similar" copyright violations of its training data. The paper cites several other papers regarding the ability of large models to memorize their inputs.

It may be possible to tweak the generation algorithm to no longer output such similar images, but it's clear that they are still present in the trained model network.

3

oscarhocklee t1_j6i3ohl wrote

See, that's the thing. When humans copy work, we have laws that step in and allow the owner of the work to say "No, you can't do that". Humans could copy anything they see, but there are legal consequences if they copy the wrong thing - especially if they gain financially by doing so. This is very much an argument about whether what these tools are doing is sufficiently like what a human could do for the laws that apply to humans to apply.

If Copilot, for instance, generates code that (were a human to write it) would be legally considered (likely after a long and damaging lawsuit) to be a derived work of something licensed under the GPL, then that derived work must also legally be licensed under the GPL.

What's more, there is no clear authorial provenance. Say you find a GitHub repo that contains what looks like a near-perfect copy of some code you own and which you released under a license of your choice. If a human wrote it, that's a legal issue.

Fundamentally, we're arguing here if it's okay in a situation like this to say "Oh, no, it's legal because software did it for me". And remember, there's no way to prove how much of a text file was written by a human and how much by software once it's saved.

2

Hmm_would_bang t1_j6i10j8 wrote

Humans get inspired by their own perception and imperfect memories of other artists and experiences in their life; AI models literally take the art and add it to their model.

Regardless, you seem to be proposing we treat AI models as if they are human beings and not products. We aren’t going to do that. It’s a nice philosophical game maybe, but if you just look at the facts of the matter you’re dealing with a case of a company taking unlicensed artwork and adding it into their product.

3