Recent comments in /f/MachineLearning

polygon_primitive t1_je0x04y wrote

For finding answers it's about the same as Google, sometimes better if you then verify the result with external sources, but that's mainly because Google has so badly corrupted their core search product while chasing profit. It's been pretty useful for me for doing the grunt work of writing boilerplate code and refactoring stuff, tho.

3

meister2983 t1_je0s90f wrote

GPT-4 is an extremely good pattern matcher - probably one of the best ever made. Most exams seem to be solvable with straightforward pattern matching (with no backtracking). The same applies to basic coding questions - it performs reasonably at the level of a human gluing Stack Overflow solutions together (with the obvious variable renaming/moving lines around/removing dead code/etc.).

It struggles at logical reasoning (when it can't "pattern match" the logical reasoning to something it's trained on).

Coding example:

  • Had no problem writing a tax calculator for ordinary income with progressive tax brackets
  • It struggled to write a program to calculate tax on long-term capital gains (US tax code), which is very similar to the above, except that it has an offset (you start bracket indexing at ordinary income). I'd think this is actually pretty easy for a CS student, especially if they saw the solution above. GPT-4 struggled, though, as it doesn't really "reason" about code the way a human would, and it generated solutions that were obviously wrong to a human.
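
To make the "offset" concrete, here is a minimal sketch of both calculators. It is a simplification of the real rules, and the bracket thresholds and rates below are made-up round numbers, not actual IRS figures; the names `ordinary_tax`, `ltcg_tax`, `ORDINARY_BRACKETS` and `LTCG_BRACKETS` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical brackets as (upper_limit, rate), sorted ascending.
# These numbers are illustrative assumptions, not real tax tables.
ORDINARY_BRACKETS = [(10_000, 0.10), (40_000, 0.20), (math.inf, 0.30)]
LTCG_BRACKETS = [(40_000, 0.00), (400_000, 0.15), (math.inf, 0.20)]


def ordinary_tax(income, brackets):
    """Progressive tax: each slice of income is taxed at its own bracket rate."""
    tax = 0.0
    lower = 0.0
    for upper, rate in brackets:
        if income <= lower:
            break
        # Tax only the part of income that falls inside [lower, upper).
        tax += (min(income, upper) - lower) * rate
        lower = upper
    return tax


def ltcg_tax(ordinary_income, gains, brackets):
    """Same idea, but the gains are stacked on top of ordinary income.

    The gains occupy the interval [ordinary_income, ordinary_income + gains],
    so bracket filling starts at ordinary_income instead of at zero.
    """
    start = ordinary_income
    end = ordinary_income + gains
    tax = 0.0
    lower = 0.0
    for upper, rate in brackets:
        # Portion of the gains interval that overlaps this bracket.
        overlap = min(end, upper) - max(start, lower)
        if overlap > 0:
            tax += overlap * rate
        lower = upper
    return tax


if __name__ == "__main__":
    print(ordinary_tax(50_000, ORDINARY_BRACKETS))
    print(ltcg_tax(30_000, 20_000, LTCG_BRACKETS))
```

With these assumed brackets, 30,000 of ordinary income and 20,000 of gains puts the first 10,000 of gains in the 0% band and the remaining 10,000 in the 15% band. The only structural difference from the ordinary calculator is the `max(start, lower)` term, which is the offset described above.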

14

antonivs t1_je0pb85 wrote

Our product involves a domain-specific language, which customers typically interface with via a web UI, to control the behavior of execution. The first model this guy trained involved generating that DSL so customers could enter a natural language request and avoid having to go through a multi-step GUI flow.

They've tried using it for docs too, and that worked well.

2

dancingnightly t1_je0o082 wrote

The benefit of finetuning or training your own text model (in the olden days on BERT, now through the OpenAI API) over just using contextual semantic search is shrinking day by day... especially with the extended context window of GPT-4.

If you want something in house, finetuning GPT-J or so could be the way to go, but it's definitely not the career direction I'd take.

2

ianitic t1_je0mjqx wrote

Oh, I haven't tested this on textbooks, but I have asked ChatGPT to give me pages of a novel and it did, word for word. I suspect it had to have trained on PDFs? I'm highly surprised I haven't seen any news of authors/publishers suing yet tbh.

Based on the above test, though, it is obvious whether or not a book is part of its training set.
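
One rough way to put a number on "word for word" is to measure the longest run of consecutive words shared by the model's output and the real page. This is only a sketch of that idea, assuming you already have both texts as strings; `longest_verbatim_run` is a hypothetical helper name, and a long run suggests memorization without proving anything about how the text got into the training data.

```python
from difflib import SequenceMatcher


def longest_verbatim_run(model_output: str, reference: str) -> int:
    """Return the length, in words, of the longest word-for-word shared run."""
    a = model_output.split()
    b = reference.split()
    # autojunk=False so frequent words are not ignored as "junk" on long texts.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size


if __name__ == "__main__":
    generated = "it was the best of times indeed"
    original = "oh it was the best of times it was the worst"
    print(longest_verbatim_run(generated, original))
```

Splitting on whitespace keeps punctuation attached to words, so this is a strict comparison; normalizing case and punctuation first would make it more forgiving.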

10