Recent comments in /f/technology

WTFRhino t1_j6rjhcl wrote

People are focusing on the 26% being a low catch rate. But this is deliberate in order to lower the number of false positives on human work.

The big debate is in academia, and most students will not risk ruining their degree to cheat this way. A 1-in-4 chance is huge when there are multiple papers that go towards your degree. It just isn't worth the risk.

3

WTFRhino t1_j6rj0qk wrote

Per the article, the 26% is the chance an AI piece is labeled "very likely AI". So they can catch out over 1 in 4 pieces generated by AI. The majority of AI writing doesn't get caught, but this also means the vast majority (>99%) of non-AI work doesn't get labeled AI.

In the context of academic work, universities are at very little risk of accusing a non-cheater of cheating. The 1-in-4 catch rate, while low, is a huge deterrent for potential cheaters. If I knew that I had a 1/4 chance of getting caught and punished, I would not cheat, especially as I had to submit dozens of papers as part of my degree.
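To put rough numbers on the "dozens of papers" point, here is a minimal sketch. It assumes each paper is flagged independently with the same 26% chance, which is a simplification and not something the article states; the function name is made up for illustration.

```python
def p_caught_at_least_once(p_flag: float, n_papers: int) -> float:
    """Chance that at least one of n_papers is flagged.

    Assumes every paper is flagged independently with probability p_flag
    (an illustrative assumption, not a claim from the article).
    """
    # P(at least one flag) = 1 - P(no paper is flagged)
    return 1.0 - (1.0 - p_flag) ** n_papers


# 0.26 is the "very likely AI" rate quoted above; the paper counts are arbitrary.
for n in (1, 4, 12, 24):
    print(f"{n:>2} papers: {p_caught_at_least_once(0.26, n):.1%}")
```

Under those assumptions, the chance of being flagged at least once climbs quickly with each extra paper, which is why a low per-paper catch rate can still work as a deterrent.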

3

IKetoth t1_j6rfyq3 wrote

No need, I don't see a point to this. Given 3-5 years of adversarial training, and if left unregulated, I expect AI and human writing will be impossible to tell apart with enough confidence for detection to be worth anything. We need to adapt to the fact that AI writing is poised to replace human writing in anything that doesn't require logical reasoning.

Edit: I'd add that we need to start thinking as a species about the fact that we've reached the point where human labour need not apply. There are now automatic ways to do nearly everything; the only things stopping us are the will to use them and the fact that resources are concentrated rather than distributed. Assuming plentiful resources, nearly everything CAN be done without human intervention.

3

PhoneAcc2 t1_j6recky wrote

The article suggests there is a single "success" metric in OpenAI's publication, which there is not, and deliberately so.

Labeling text as AI-generated will always be fuzzy (active watermarking aside) and will become even harder as models improve and get bigger. There is simply an area where human- and AI-written texts overlap.

Have a look at the FAQ on their page if you're interested in the details: https://platform.openai.com/ai-text-classifier

4

IKetoth t1_j6rc6i2 wrote

Which is a 26% success rate, so how is that being misrepresented? The fact that it's 'somewhat confident' on other samples means nothing. If this were to be used for validating articles anywhere like writing competitions or academia, you'd want that "very confident" number to be in the high 90s, or at the very least the false positive rate to be incredibly low.

3

drawkbox t1_j6r24c5 wrote

This needs to come from third parties; otherwise, those who control the neural nets and datasets will be able to pass off information as "not generated" when it is clearly astroturfing or manipulated. Then they can throw their hands up and say "must be the algorithm or a bad dataset" for plausible deniability.

The new game is coming, or is already here: misdirecting blame to the "algorithm" when the data is actually editorialized or filtered for certain aims.

Almost all algorithms and datasets are biased or editorialized in some way, and laws need to be adjusted for that. You can't blame the "algorithm" for enragement, because enragement is engagement.

1