Recent comments in /f/MachineLearning

Sure_Cicada_4459 t1_je5w1bh wrote

Spin-off project based on Reflexion; apparently GPT-4 gets a 20% improvement on coding tasks: https://github.com/GammaTauAI/reflexion-human-eval

People are fine-tuning Llama using this prompt structure with much better results: https://twitter.com/Orwelian84/status/1639859947948363777?s=20

Someone already built an autonomous agent using feedback loops (not necessarily related to Reflexion): https://twitter.com/yoheinakajima/status/1640934493489070080

Seems to yield performance improvements only up to a certain point, obviously, but it's also a very basic prompt structure overall; one can imagine all kinds of "cognitive structures".
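A minimal sketch of what such a feedback loop can look like, assuming hypothetical `llm` and `run_tests` helpers (neither comes from the linked repos):

```python
def llm(prompt: str) -> str:
    """Hypothetical helper: call whatever model/API you use, return its text."""
    raise NotImplementedError

def run_tests(code: str) -> tuple[bool, str]:
    """Hypothetical helper: run the task's unit tests, return (passed, error_log)."""
    raise NotImplementedError

def reflexion_loop(task: str, max_iters: int = 3) -> str:
    attempt = llm(f"Write code for this task:\n{task}")
    for _ in range(max_iters):
        passed, log = run_tests(attempt)
        if passed:
            break
        # Ask the model to critique its own failure, then fold that
        # reflection back into the next attempt.
        reflection = llm(
            f"Task: {task}\nAttempt:\n{attempt}\n"
            f"It failed with:\n{log}\nIn a few sentences, what went wrong?"
        )
        attempt = llm(
            f"Task: {task}\nPrevious attempt:\n{attempt}\n"
            f"Reflection:\n{reflection}\nWrite an improved solution."
        )
    return attempt
```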

8

patniemeyer t1_je5v9m7 wrote

Yes, in fact OpenAI offers an API for this right now: https://platform.openai.com/docs/guides/fine-tuning

It *appears* from the terminology they are using that they are actually performing training on top of their model with your data (which you supply in JSON). They talk about learning rate and epochs, etc. as params; however, I have not seen real documentation of what they are doing.
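For reference, a minimal sketch of what submitting a job looks like with the pre-1.0 `openai` Python client (the file name is made up; `n_epochs` and `learning_rate_multiplier` are the kind of params they expose):

```python
import openai  # pre-1.0 client

# Training data is JSONL, one example per line, e.g.:
# {"prompt": "Q: ...\n\nA:", "completion": " ..."}
f = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")

job = openai.FineTune.create(
    training_file=f["id"],
    model="davinci",
    n_epochs=4,                    # the "epochs" param they mention
    learning_rate_multiplier=0.1,  # scales their base learning rate
)
print(job["id"])
```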

2

ortegaalfredo t1_je5urre wrote

I run a Discord with all the models. Currently only 30B and 65B, because nobody uses the smaller LLMs.

Even if superficially they can both answer questions, on complex topics 65B is much better than 30B, and 7B doesn't even compare.

11

SkinnyJoshPeck t1_je5ue3b wrote

I'm not 100% sure what your infrastructure or background is, but generally you can just transform data to whatever data format works best for the model.

So, you would build a pipeline that goes

Snowflake -> Some ETL process -> Transformed Data Storage -> Model Training -> Model Saving -> Model Loading for API to ask questions

where that "Some ETL process" step transforms your data into whatever format the model needs, and your model trains from that.

For example, on AWS you might have something like

Redshift/RDS/Whatever -> SageMaker -> Output Model to S3 -> API for your model or something idk

or if it's all going to be on-prem and you won't have Cloud tech, you'd do something like

Snowflake/Azure/Any Data Source -> Airflow for running training -> Model Upload to Some Folder -> API in a docker container in Kubernetes or something for users to hit

or they can just download the model locally and use some script to ask it questions; I'm not 100% sure, it all depends on the model/language/etc. that you use.
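A rough sketch of that on-prem flavor as an Airflow DAG (Airflow 2.x; the task bodies are hypothetical placeholders):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical task bodies -- swap in your real extract/train/upload code.
def extract_from_source():
    ...

def train_model():
    ...

def upload_model():
    ...

with DAG(
    dag_id="model_training_pipeline",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_from_source)
    train = PythonOperator(task_id="train", python_callable=train_model)
    upload = PythonOperator(task_id="upload", python_callable=upload_model)

    # Data source -> training -> model upload, same shape as the arrows above.
    extract >> train >> upload
```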

This is a fairly complicated task; if your company is getting serious about this, y'all should hire someone who is an ML engineer to do this task. :)

32

james_mclellan t1_je5ru4r wrote

Two questions:

(1) Does anyone create missing data when constructing models? Examples: searching for stronger relationships between the data set and first and second derivatives of time series data, comparisons to the same day of week over the last N periods, the same holiday over the last N periods; examining distance to an urban center for geodata (see the sketch after question 2).

(2) Does anyone use a model that falls back on functions when a match is not 100%? For example, "apple" may mean fruit, music, machines, music companies or machine companies -- instead of a number from 0 to 1 for the probable meaning, does anyone use models where the code "performs a test" to better disambiguate?
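For concreteness on (1), a minimal pandas sketch of those derived features (toy data, made-up column names):

```python
import numpy as np
import pandas as pd

# Toy daily time series.
df = pd.DataFrame(
    {"value": np.arange(100, dtype=float)},
    index=pd.date_range("2023-01-01", periods=100, freq="D"),
)

# First and second derivatives of the series.
df["d1"] = df["value"].diff()
df["d2"] = df["d1"].diff()

# Same day of week, 1 and 2 weeks back.
df["same_dow_1w"] = df["value"].shift(7)
df["same_dow_2w"] = df["value"].shift(14)
```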

2

Technical-Vast1314 OP t1_je5oqm5 wrote

OK, panoptic segmentation means doing two kinds of segmentation tasks together: semantic segmentation and instance segmentation. Semantic segmentation can only segment categories like "sky", "car", and "person"; it's hard for it to separate individual instances. Instance segmentation is like object detection, which means it predicts a box with a mask for each instance~
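For intuition, a toy sketch of how the two outputs merge into one panoptic map (made-up arrays, not the output of any particular model):

```python
import numpy as np

H, W = 4, 6
semantic = np.zeros((H, W), dtype=np.int64)  # per-pixel class ids, e.g. 0 = "sky"
person = np.zeros((H, W), dtype=bool)        # one mask from the instance branch
person[1:3, 2:5] = True

# Panoptic map: semantic labels for "stuff", a unique id per "thing" instance.
panoptic = semantic.copy()
panoptic[person] = 1001                      # instance 1 of the "person" class
print(panoptic)
```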

6

mike94025 t1_je5nrdi wrote

You’re looking in the wrong place. What you’re looking at is the BT gen 1 fastpath, not the BT gen 2 custom kernels.

You need to look at F.multi_head_attention_forward().

For now, the fastpath still services inference, pending a full rewrite of activation.py that will hopefully land in a future release. (There’s always a tension between refactoring and introducing new features under a time- and staffing-constrained problem formulation.)
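For orientation (not official docs), the call chain on a stock module looks roughly like this; shapes are made up:

```python
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=512, num_heads=8, batch_first=True)
mha.eval()

x = torch.randn(2, 16, 512)  # (batch, seq, embed)
with torch.inference_mode():
    # nn.MultiheadAttention.forward (activation.py) either takes the gen 1
    # fastpath or falls through to F.multi_head_attention_forward -- the
    # latter is where to look for the gen 2 kernels.
    out, _ = mha(x, x, x, need_weights=False)
```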

1

mike94025 t1_je5mfa8 wrote

This doesn't force it. It says that flash is enabled, and so are the others. To force it, you have to disable all other kernels. Then it’s flash or bust.

You can find more in our blog, which got published today, and in the SDPA tutorial. Both are linked here: https://www.linkedin.com/posts/michael-gschwind-3704222_pytorch-activity-7046773418288955393-gOSh

PS: the context manager can be used anywhere outside the call as well, including around the call to model.forward.
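Concretely, something like this (PyTorch 2.0-era API; shapes and dtype are made up) disables the other kernels so it's flash or an error:

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Disable the math and mem-efficient kernels; now it's flash or bust.
with torch.backends.cuda.sdp_kernel(
    enable_flash=True, enable_math=False, enable_mem_efficient=False
):
    out = F.scaled_dot_product_attention(q, k, v)
```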

2