Recent comments in /f/deeplearning

Oreoed t1_j8yfckw wrote

Depends on what you're trying to do with this project.
If you already have a news source, implementing a pre-trained model from Hugging Face should be relatively easy.
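A minimal sketch of that approach, assuming the `transformers` library and a financial-sentiment model such as `ProsusAI/finbert` (the model name is just an example; check what's current on the Hub):

```python
def load_classifier(model_name="ProsusAI/finbert"):
    """Lazily build a Hugging Face text-classification pipeline (requires `transformers`)."""
    from transformers import pipeline  # imported here so the helper below stays dependency-free
    return pipeline("text-classification", model=model_name)

def to_signal(label):
    """Map a sentiment label to a crude trade signal: +1 buy, -1 sell, 0 hold."""
    return {"positive": 1, "negative": -1}.get(label.lower(), 0)

# Example usage (downloads the model on first run):
# clf = load_classifier()
# for result in clf(["Fed signals rate cut", "Tech giant misses earnings"]):
#     print(result["label"], to_signal(result["label"]))
```

The label-to-signal mapping is the part you'd actually tune; the pipeline call itself really is a few lines.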
If you want to fine-tune that model, you will need a dataset of news headlines.
Check out Kaggle; there are some small but publicly available datasets.
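For fine-tuning you mostly just need (headline, label) pairs; a sketch of the loading step, where the column names are hypothetical and will differ per dataset:

```python
import csv
import io

def load_headlines(csv_text, text_col="headline", label_col="sentiment"):
    """Parse a headlines CSV into (text, label) pairs ready for fine-tuning."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[text_col], row[label_col]) for row in reader]

# Tiny inline sample standing in for a real Kaggle download:
sample = "headline,sentiment\nStocks rally on jobs data,positive\nOil slides 4%,negative\n"
pairs = load_headlines(sample)  # [('Stocks rally on jobs data', 'positive'), ...]
```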
I know you can also find data on some obscure GitHub repo, but good luck with that.
If your goal is to implement a fully operational pipeline, you will need not only all of the above but also a way to acquire news in real time. That may mean a scraper for the news outlets that interest you. Once again, GitHub is your friend.
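A scraper can be as simple as polling RSS feeds; the feed URL below is a placeholder for whatever outlet you care about, and only the parsing step is shown live:

```python
import xml.etree.ElementTree as ET

def parse_rss_titles(rss_xml):
    """Extract item titles from an RSS 2.0 feed document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title") for item in root.iter("item")]

# In a live pipeline you would fetch on a schedule, e.g.:
# import urllib.request
# rss_xml = urllib.request.urlopen("https://example.com/feed.xml").read()

sample_feed = """<rss version="2.0"><channel>
<item><title>Markets open higher</title></item>
<item><title>Bank reports record profit</title></item>
</channel></rss>"""
titles = parse_rss_titles(sample_feed)
```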
That said, don't expect to profit off this alone. Using news data alongside some trading indicators will *maybe* work on paper (i.e. in a backtest) with the right features and optimization, but it is unlikely to produce live results.
Then again, for a college project that might not be relevant.
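To make the backtest point concrete, here's a toy rule (all numbers synthetic) that only acts on positive news when price sits above its moving average; this is roughly the kind of thing that can look fine on paper and still fail live:

```python
def sma(prices, window):
    """Simple moving average of the trailing `window` prices."""
    return sum(prices[-window:]) / window

def decide(prices, sentiment, window=3):
    """Toy rule: buy on positive news in an uptrend, sell on negative news in a downtrend."""
    if len(prices) < window:
        return "hold"
    above_trend = prices[-1] > sma(prices, window)
    if sentiment == "positive" and above_trend:
        return "buy"
    if sentiment == "negative" and not above_trend:
        return "sell"
    return "hold"

# Uptrending price plus a good headline -> buy
print(decide([100, 102, 105, 108], "positive"))  # buy
```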

1

suflaj t1_j8xv8md wrote

Maybe that's just a necessary step to cull those who cannot think critically and to force society to stop taking things for granted.

At the end of the day, misinformation like this was already being generated and shared before ChatGPT was released, yet the government's response seems to have been to allow domestic sources and hunt down foreign ones. So if the government doesn't care, why should you?

People generally don't seem to care even if the misinformation is human-generated, e.g. the Hunter Biden laptop story. I wouldn't lose sleep over this either way; it is still human-operated.

2

zcwang0702 OP t1_j8xufwt wrote

>if it isn't of roughly the same size or larger. Either ChatGPT is REALLY sparse, or such detection models won't be available to mortals. So far, it doesn't seem to be sparse, since similarly sized detectors can't reliably differentiate between it and human text.

Yeah, I have to say this is scary, because if we cannot build a robust detector now, it will only get harder in the future. LLMs will make it increasingly difficult to tell what on the Internet is human-written.

1

suflaj t1_j8xor46 wrote

It doesn't matter if it isn't of roughly the same size or larger. Either ChatGPT is REALLY sparse, or such detection models won't be available to mortals. So far, it doesn't seem to be sparse, since similarly sized detectors can't reliably differentiate between it and human text.

1

[deleted] OP t1_j8x413i wrote

I did it with tiny YOLOv7, using Google Colab. My point is that there are barely any projects that are usable, unless you found some?

Yes, the results were great. I am thinking of writing a little blog post for others; it is actually quite simple, because I found a Roboflow tutorial this time around.

Thanks for your support!

2

Aynit t1_j8wytok wrote

I'd recommend scooting on over to hf.co and checking out some of their open-source models. Realistically you can call one, or an ensemble, from there in a few lines of code. Not sure what's driving this piece of work (school? work? hobby?), but that's a decent place to get something set up quickly. They might even have models fine-tuned on stock/economic vocabulary.
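The ensemble part really can stay small; a hedged sketch of a majority vote over per-model labels (model outputs are faked inline, since the actual hf.co calls depend on which models you pick):

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label across ensemble members (ties -> first seen)."""
    return Counter(labels).most_common(1)[0][0]

# Pretend three Hub models scored the same headline:
votes = ["positive", "negative", "positive"]
consensus = majority_vote(votes)  # "positive"
```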

1