Recent comments in /f/MachineLearning

Dartagnjan t1_jdzo44q wrote

Is anyone in need of a machine learning protégé? I am looking for a doctoral position in the German- or English-speaking world.

My experience is in deep learning, specifically GNNs applied to science problems. I would like to stay in deep learning broadly, but would not mind changing topic to another application, or to a more theoretical research project.

I am also interested in theoretical questions, e.g. given a well-defined problem (say, approximating the solution of a PDE), what can we say about its "training difficulty"? Is optimization possible at all (cf. neural tangent kernel analysis)? How do architectures help facilitate optimization? And, more broadly, what solid mathematical foundations can be given for deep learning theory?

I have a strong mathematical background, with knowledge of functional analysis and differential geometry, and I also hold a BSc in Physics alongside my main mathematical track.

Last week I also started getting into QML with PennyLane and find that area quite interesting as well.

Please get in touch if you think I could be a good fit for your research group or know an open position that might fit my profile.

2

bjj_starter t1_jdzo3zq wrote

But that's pure speculation. They showed that a problem existed with the training data, and OpenAI had already dealt with that problem and wasn't hiding it at all - GPT-4 wasn't tested on any of that data. Moreover, it's perfectly fine for problems like the ones it will be tested on (such as past problems) to be in the training data. What matters is that the problems it is actually tested on are not in the training data, and there is no evidence that it was tested on training data at this point.

Moreover, the Microsoft Research team was able to reproduce some impressive results in a similar domain on tests that didn't exist before the training-data cut-off. There isn't any evidence that this is a problem with a widespread effect on performance. It's also worth noting that the guy behind this paper seems to be taking it pretty personally, judging by the way he wrote his tweet.

3

AuspiciousApple t1_jdznf40 wrote

Sorry, that was not very clearly explained on my part.

Do you understand that these models have weights/parameters - numbers that define their behaviour? The standard sense of "learning" in ML is to update these weights to fit some training data better.

And are you aware that large language models get a sequence of text (the "context") and predict the next bit of text from that? These models can use examples in the text they are given to do things they otherwise couldn't; this is called in-context learning. However, the parameters of the model don't change, and if the examples aren't in the context, the model doesn't remember anything about them.
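To make that concrete, here is a minimal sketch of the difference (the Hugging Face `transformers` pipeline and the `gpt2` model are just illustrative choices, nothing specific to the models discussed here):

```python
# In-context learning vs. weight updates, illustrated with a small model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# In-context learning: the examples live only in the prompt (the "context").
# No weights change; drop the examples from the prompt and the behaviour
# is gone, because nothing was stored in the parameters.
prompt = (
    "Translate English to French.\n"
    "sea -> mer\n"
    "dog -> chien\n"
    "cat ->"
)
print(generator(prompt, max_new_tokens=5)[0]["generated_text"])

# "Learning" in the usual ML sense would instead update the parameters,
# e.g. by fine-tuning on (input, target) pairs with gradient descent, so
# the new behaviour persists even when the examples are no longer in the
# prompt.
```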

1

sebzim4500 t1_jdzmpee wrote

Wordle is kind of unfair though, because the LLM takes input in the form of tokens rather than letters, so doing anything that requires reasoning at the level of letters is difficult. Incidentally, this might also be affecting its ability to do arithmetic; LLaMA, by comparison, uses one token per digit to avoid that issue (but of course still has the same problem with breaking words into characters).
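For a concrete look at what the model actually sees, here is a small sketch using the `tiktoken` library (the `cl100k_base` encoding is just one example; other models use different tokenizers):

```python
# Show how words and numbers break into tokens rather than letters/digits.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["crane", "slate", "12345"]:
    token_ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in token_ids]
    print(f"{text!r} -> {pieces}")

# A five-letter word typically comes out as one or two tokens, not five
# characters, so letter-level games like Wordle demand reasoning over
# units the model never sees directly; multi-digit numbers are likewise
# often a single token, which relates to the arithmetic point above.
```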

3

Suspicious-Box- t1_jdzj7wr wrote

It just needs training for that. It's amazing, but what could it do with camera vision of the world and a robot body? Would it need specific training, or could it brute-force its way to moving a limb? The model would need to be able to improve itself in real time, though.

0