Recent comments in /f/MachineLearning

Odibbla t1_jcj24kc wrote

I did this when I was in the Robomaster AI challenge. My solution was to use YOLOv3, which should be enough for the task you are asking about. The flow is: you annotate the symbol yourself, train YOLO step by step (any version should work actually, v3 is just my choice). Take in the video stream, and YOLO will output the exact location of that sign in each frame. I did it on a Jetson Nano and it was smooth. Since you have a degree, you should be fully capable of doing this. Good luck!
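In case a concrete starting point helps: once the model is trained, the detection loop is only a few lines. A rough sketch using the ultralytics package as one convenient option (the original run was YOLOv3, but the loop looks the same; "symbol.pt" is a placeholder for your trained weights):

```python
# Rough sketch of the video-stream detection loop once a YOLO model is trained.
# "symbol.pt" is a placeholder for whatever weights you end up training.
import cv2
from ultralytics import YOLO

model = YOLO("symbol.pt")

cap = cv2.VideoCapture(0)  # or a video file / RTSP stream
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)
    for box in results[0].boxes:
        # Pixel coordinates of the detected sign in this frame
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("detections", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```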

2

Oswald_Hydrabot t1_jcizf9y wrote

"We will use it only when nothing else can solve the problem", I believe is your answer.

There are solutions that cost less than GPT-4, and they don't require integrating a black box that is gatekept by a single provider. Taking on a product like GPT-4 as a dependency carries a significant amount of risk.

2

crazymonezyy t1_jcixab6 wrote

That's an argument much more easily made philosophically than to a business head.

Because there's no valid answer to the follow-up questions "what if we do?" and "but what if our competition offers it?".

0

LeN3rd t1_jcitswg wrote

The problem with your VAE idea is that you cannot apply the usual loss function based on the difference between the input and the output, and thus a lot of nice theoretical constraints go out of the window, afaik.

https://jaan.io/what-is-variational-autoencoder-vae-tutorial/


I would start with a cycleGAN:

https://machinelearningmastery.com/what-is-cyclegan/

It's a little older, but I personally know it a bit better than diffusion methods.


With the freely available StableDiffusion model you could conditionally inpaint your image, though you would have to describe what is in that image in text. You could also train your own diffusion model, though that needs a lot of training time. Not necessarily more than a GAN, but still.

It works by adding noise to an image and then denoising it again and again. For inpainting you just do that for the regions you want to inpaint (your R and G channels), and for the regions you want to stay the same as your original image, you keep injecting the noised version of the original values that you already know.
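If you try the StableDiffusion route, the diffusers inpainting pipeline is probably the quickest way to experiment. A minimal sketch, assuming a standard inpainting checkpoint and a binary mask marking the regions you want regenerated:

```python
# Minimal inpainting sketch with Hugging Face diffusers (illustrative setup,
# not tailored to the R/G-channel case). White pixels in the mask are
# regenerated; black pixels are kept from the original image.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # assumed checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("input.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("L").resize((512, 512))

result = pipe(
    prompt="a text description of what should appear in the masked region",
    image=image,
    mask_image=mask,
).images[0]
result.save("inpainted.png")
```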

1

keepthepace t1_jcij20g wrote

> Do they actually add useful insight, or do they serve more as a PR thing?

The ones we hear about most are pure PR.

> Is there anything you think AI ethics as a field can do to be more useful and to get more change?

Yes. Work on AI alignment. It is a broader problem than just ethics; it is also about having models generate truthful and grounded answers. I am extremely doubtful of the current trend of using RLHF for it; we need other approaches. But this is real ML development work, not just PR production. That would be an extremely useful way to steer ethical-AI efforts.

3

Simusid OP t1_jciguq5 wrote

"words to numbers" is the secret sauce of all the models including the new GPT-4. Individual words are tokenized (sometimes into "word pieces") and a mapping from the tokens to numbers via a vocabulary is made. Then the model is trained on pairs of sentences A and B. Sometimes the model is shown a pair where B correctly follows A, and sometimes not. Eventually the model learns to predict what is most likely to come next.

"he went to the bank", "he made a deposit"

B probably follows A

"he went to the bank", "he bought a duck"

Does not.

That is one type of training to learn valid/invalid text. Another is "leave one out" training. In this case the input is a full sentence minus one word (typically).

"he went to the convenience store and bought a gallon of _____"

and the model should learn that the most common answer will probably be "milk".
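If you want to try this yourself, the Hugging Face fill-mask pipeline is a quick way to see the "leave one out" idea in action. A small sketch (the model choice is just a common default, and its actual top guesses may differ):

```python
# Sketch: masked-word ("leave one out") prediction with a pretrained BERT.
# The model name and the expected completion are illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

results = fill_mask(
    "he went to the convenience store and bought a gallon of [MASK]."
)
for r in results:
    print(f"{r['token_str']:>10}  score={r['score']:.3f}")
```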


Back to your first question. In 3D, your first two embeddings should be closer together because they are similar, and they should both be "far" from the third embedding.
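A quick way to check this numerically is to encode the three sentences and compare cosine similarities. A sketch using sentence-transformers as one convenient encoder (not necessarily the model behind your embeddings):

```python
# Sketch: compare sentence embeddings with cosine similarity.
# The model name is just a common default; any sentence encoder works the same way.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "he went to the bank",
    "he made a deposit",
    "he bought a duck",
]
embeddings = model.encode(sentences)

# Pairwise cosine similarities; the first two sentences should score higher
# with each other than either does with the third.
print(util.cos_sim(embeddings, embeddings))
```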

1

supreme_harmony t1_jcieyt3 wrote

There was a recent attempt at reading hieroglyphs from temple walls in Egypt using ML, but that failed spectacularly.

Despite having tons of high-quality training data available and being announced with much fanfare and ample funding in 2018, it has since been completely pulled and even its website has been taken down.

I am struggling to find any results apart from some of the initial marketing material:

https://www.psycle.com/casestudy/hieroglyphics-initiative

https://www.youtube.com/watch?v=TfdWNY7priQ

I have briefly interacted with some of the people involved, and the consensus was that it's not realistically doable.

Therefore, although I do not doubt the good intentions behind this prize, I am quite sceptical that any results will come of it, as a seemingly simpler project with more resources failed to deliver.

3