Recent comments in /f/MachineLearning

2blazen t1_je6axjp wrote

>- Creative brainstorming for professional work

I struggle with this. I was trying to get it to help me come up with interesting thesis research questions in a very specific audioML field, but it failed to come up with anything original, and I don't know whether there's a certain way I should have phrased my questions or whether it's just a creative limitation.

1

LetGoAndBeReal t1_je65ffo wrote

The comments here so far have addressed three possible approaches to this. Two of those approaches - i.e. training your own model and fine-tuning an existing model - are not currently viable. Training your own model would require a ridiculous amount of human and compute power, and would not result in something where data could easily be added. Fine-tuning a model does not make the model absorb new data - it only conditions the model's output patterns using the data/knowledge the model gained during initial training.

The only viable approach is to use retrieval-augmented generation (RAG), where data relating to the user's question is retrieved from outside the model and fed to the model as part of the prompt. Tools like LangChain can help you build a RAG solution on your own. There are also many services coming out that provide this sort of capability, such as humata.ai.
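For a concrete picture, here is a minimal sketch of the RAG flow in plain Python with the (pre-1.0) openai client - the documents, model names, and the brute-force similarity search are placeholders for whatever vector store or LangChain setup you would actually use:

```python
import numpy as np
import openai  # older (pre-1.0) client interface

# Hypothetical company documents -- in practice these come from your own data.
docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am-5pm CET, Monday through Friday.",
]

def embed(texts):
    """Embed a list of strings with OpenAI's embedding endpoint."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([d["embedding"] for d in resp["data"]])

doc_vectors = embed(docs)

def answer(question, k=1):
    # Retrieve: rank documents by cosine similarity to the question.
    q_vec = embed([question])[0]
    scores = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n".join(docs[i] for i in np.argsort(scores)[::-1][:k])

    # Augment + generate: feed the retrieved text to the model as part of the prompt.
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp["choices"][0]["message"]["content"]

print(answer("How long do customers have to return an item?"))
```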

86

currentscurrents OP t1_je631oa wrote

TL;DR:

  • This is a survey paper. The authors summarize a variety of arguments about whether or not LLMs truly "understand" what they're learning.

  • The major argument in favor of understanding is that LLMs are able to complete many real and useful tasks that seem to require understanding.

  • The major argument against understanding is that LLMs are brittle in non-human ways, especially to small changes in their inputs. They also don't have real-world experience to ground their knowledge in (although multimodal LLMs may change this).

  • A key issue is that no one has a solid definition of "understanding" in the first place. It's not clear how you would test for it. Tests intended for humans don't necessarily test understanding in LLMs.

I tend to agree with their closing summary. LLMs likely have a type of understanding, and humans have a different type of understanding.

>It could thus be argued that in recent years the field of AI has created machines with new modes of understanding, most likely new species in a larger zoo of related concepts, that will continue to be enriched as we make progress in our pursuit of the elusive nature of intelligence.

81

EverythingGoodWas t1_je612lg wrote

You aren’t going to train an LLM on company data. You could fine-tune an existing one with company data, but creating an LLM from scratch is an absolutely massive compute task. If you are trying to make a closed-domain question-answering system that uses your company’s data, you basically need to create a full pipeline: parsing, searching, and finally pushing the context and question to a language model.
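Roughly, that pipeline could look like the sketch below - the rank_bm25 keyword index, the docs folder, and the model name are just stand-ins for whatever parser, search index, and LLM you actually use:

```python
from pathlib import Path
from rank_bm25 import BM25Okapi
import openai  # pre-1.0 client

# 1. Parse: load company documents into plain-text chunks (stand-in loader).
chunks = [p.read_text() for p in Path("company_docs").glob("*.txt")]

# 2. Search: build a simple keyword index over the chunks.
bm25 = BM25Okapi([c.lower().split() for c in chunks])

def ask(question, k=3):
    # Pull the most relevant chunks for this question.
    hits = bm25.get_top_n(question.lower().split(), chunks, n=k)
    context = "\n---\n".join(hits)

    # 3. Push the context and question to the language model.
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": f"Answer only from the context below.\n\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp["choices"][0]["message"]["content"]
```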

15

abnormal_human t1_je60s31 wrote

Yes, it's totally possible to train an LLM to understand tabular data. It's a very general purpose architecture. With enough resources it is well suited to a wide range of problems, and yes, Azure/Snowflake can do everything you need (at some price, assuming you know what to do with them).

You need to make a decision about whether you want to bake the info into the LLM, or whether you want to teach the LLM to find the answers and then format them for humans.

This will depend on your use case, budget, team size, competencies, data set size, and time-to-market requirements. Baking the info into the LLM is a lot harder than the retrieval approach, potentially 100x-1000x harder and more expensive, and without people who have experience doing it, you will waste a lot of time/energy getting there.
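As a rough sketch of the "teach the LLM to find the answers" route (question → generated SQL → run it against the data → let the model phrase the result) - the sales table schema and model choice here are made-up assumptions, not a recommendation:

```python
import sqlite3
import openai  # pre-1.0 client

conn = sqlite3.connect("company.db")  # assumed schema: sales(region TEXT, revenue REAL)

def ask_table(question):
    # Step 1: have the model translate the question into SQL for the known schema.
    sql = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Table: sales(region TEXT, revenue REAL). "
                       f"Write one SQLite query answering: {question}. SQL only.",
        }],
    )["choices"][0]["message"]["content"].strip().strip("`")

    # Step 2: run the query against the actual data
    # (the model never has to memorize the full table).
    rows = conn.execute(sql).fetchall()

    # Step 3: have the model format the raw rows as a human-readable answer.
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"Question: {question}\nQuery result: {rows}\n"
                       "Answer in one sentence.",
        }],
    )["choices"][0]["message"]["content"]
```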

3

alyflex OP t1_je5yia4 wrote

That is certainly an option I was considering, but then I would have to make my own job planner / multirunner (which I have actually already done for my current project; this whole refactoring was an attempt to move away from my own custom functions and use some more standardized methods).

1

alyflex OP t1_je5y5gh wrote

> https://github.com/fidelity/spock

This looks quite promising, and I like the Post hooks you linked below, but I don't see any way of running a series of experiments in a non-combinatoric way. There is the Optuna API (though I can't tell whether early pruning is supported there), but I don't see any way of grouping parameters for a set of experiments.

2

patniemeyer t1_je5wxiv wrote

The pricing page lists GPT-4. I think it was just added in the past day or two. (I have not confirmed that you can actually access it though)

EDIT: When I query the list of models through their API, I still do not see GPT-4, so maybe it's not actually available yet... or maybe I'm querying the wrong thing.
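For anyone who wants to check for themselves, this is roughly what that query looks like with the (pre-1.0) openai Python client - the REST equivalent is a GET on /v1/models:

```python
import openai

openai.api_key = "sk-..."  # your API key

# List the model IDs your account can currently access.
models = openai.Model.list()
print(sorted(m["id"] for m in models["data"]))

# GPT-4 only shows up here once your account has been granted access.
print("gpt-4 available:", any(m["id"] == "gpt-4" for m in models["data"]))
```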

1