Recent comments in /f/MachineLearning
RicketyCricket t1_je5ky7l wrote
Reply to comment by DigThatData in [D] Alternatives to fb Hydra? by alyflex
As the developer of Spock (posted in another comment) -- OmegaConf is also an awesome choice and super useful. I'd suggest checking it out too!
You can go even closer to the metal and use the attrs library as well (https://www.attrs.org/en/stable/).
Extreme_Photo t1_je5kuh4 wrote
Reply to [Discussion] IsItBS: asking GPT to reflect x times will create a feedback loop that causes it to scrutinize itself x times? by RedditPolluter
Can you give an example of a use of reflection showing the prompt and the response?
RicketyCricket t1_je5kgy4 wrote
Reply to comment by RicketyCricket in [D] Alternatives to fb Hydra? by alyflex
second favorite:
https://fidelity.github.io/spock/advanced_features/Post-Hooks
Basically it lets you run any validation necessary on your configs. Spock provides some basics (greater than, within bounds, etc.), but it's ultimately up to the user, via any simple asserts or validation functions they want to write.
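The idea, sketched in plain Python (Spock's actual decorator API differs — see the linked docs; the config fields here are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class OptimizerConfig:
    lr: float
    warmup_steps: int

def post_hook(cfg: OptimizerConfig) -> None:
    # runs after the config object is built, like a post-hook would:
    # simple asserts covering the "greater than" / "within bounds" cases
    assert cfg.lr > 0, "lr must be positive"
    assert 0 <= cfg.warmup_steps <= 10_000, "warmup_steps out of bounds"

cfg = OptimizerConfig(lr=3e-4, warmup_steps=500)
post_hook(cfg)
```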
DigThatData t1_je5kfoc wrote
Reply to [D] Alternatives to fb Hydra? by alyflex
go closer to the metal and use omegaconf directly.
RicketyCricket t1_je5jchr wrote
Reply to comment by RicketyCricket in [D] Alternatives to fb Hydra? by alyflex
This being my favorite hidden one:
RicketyCricket t1_je5j2n9 wrote
Reply to comment by _Arsenie_Boca_ in [D] Alternatives to fb Hydra? by alyflex
Most of the cool stuff is buried in the docs under advanced features :-)
https://fidelity.github.io/spock/advanced_features/Composition
(full transparency I'm the author/maintainer/core-developer. I know the docs need a re-org to surface more of the useful features)
ianitic t1_je5j1jo wrote
You might like something like this as you use azure: https://azure.microsoft.com/en-us/products/bot-services/
mr_house7 t1_je5iuk0 wrote
Reply to comment by obolli in [N] OpenAI may have benchmarked GPT-4’s coding ability on its own training data by Balance-
Microsoft is the one in charge now.
Peantoo t1_je5iks3 wrote
Reply to [R] You Only Segment Once: Towards Real-Time Panoptic Segmentation [CVPR 2023] by Technical-Vast1314
I understand semantic segmentation but it's been a while since I've tinkered in the computer vision space. Can you explain the push for panoptic segmentation and why you think it's a valuable technology? What can it accomplish that really good semantic segmentation can't?
master-leaf t1_je5hhu6 wrote
Reply to comment by jaxolingo in [D] The best way to train an LLM on company data by jaxolingo
I would check the paper, but I think they fine-tune a pre-trained language model. They also created their own encodings to account for the structure of tabular data, such as the column headers, entity rows, etc.
I will note, though, that from what I remember the table sizes were pretty small.
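The simplest version of encoding a table for a text model is just linearizing it — this is a generic sketch, not the encoding the paper actually uses:

```python
def linearize_table(headers: list[str], rows: list[list[str]]) -> str:
    """Flatten a table (column headers + entity rows) into model-readable text."""
    lines = [" | ".join(headers)]
    for row in rows:
        lines.append(" | ".join(f"{h}: {v}" for h, v in zip(headers, row)))
    return "\n".join(lines)

text = linearize_table(
    ["company", "revenue"],
    [["Acme", "1.2M"], ["Globex", "900K"]],
)
print(text)
```

Approaches like TAPAS go further, adding dedicated row/column embeddings rather than relying on a flat string, which is why table size matters.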
xnalonali t1_je5goxk wrote
Reply to [R] You Only Segment Once: Towards Real-Time Panoptic Segmentation [CVPR 2023] by Technical-Vast1314
What license will you use for this implementation?
thedamian t1_je5eweg wrote
Reply to comment by thomasahle in [D] Simple Questions Thread by AutoModerator
Before answering the question, I would submit that you should be thinking of keeping your models behind an API. There's no need to have the model sitting on the client side (which is probably why it feels like you're asking the question).
And behind an API, the model can be as big as you'd like (or can afford) on your server.
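The behind-an-API setup can be as small as this — a hypothetical Flask endpoint with a stand-in for the model call:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(text: str) -> float:
    # stand-in for loading and calling a real model on the server;
    # the client never sees the weights, only this endpoint
    return float(len(text))

@app.route("/predict", methods=["POST"])
def predict_route():
    payload = request.get_json()
    return jsonify({"score": predict(payload["text"])})
```

The client just POSTs JSON and gets a score back; the model itself can be swapped or scaled without touching client code.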
jaxolingo OP t1_je5eunu wrote
Reply to comment by master-leaf in [D] The best way to train an LLM on company data by jaxolingo
From Hugging Face?
jaxolingo OP t1_je5eovq wrote
Reply to comment by TheDeviousPanda in [D] The best way to train an LLM on company data by jaxolingo
The end goal would be to add it into the product's current chat in the web app, so I can't be doing that :)
master-leaf t1_je5dtrm wrote
There was a paper I read a few months ago (I think it was called TAPAS). In this paper they show how to ingest tabular data into a transformer model.
TheDeviousPanda t1_je5ddm3 wrote
It’s going to be a lot easier to just take something like GPT-4 and feed in your data directly and ask questions.
_Arsenie_Boca_ t1_je5d04j wrote
Reply to comment by RicketyCricket in [D] Alternatives to fb Hydra? by alyflex
Looks interesting, and a bit more lightweight than Hydra. But it also misses a lot of cool features, like composing multiple YAML configs.
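For reference, the composition feature in question is Hydra's defaults list, which assembles a run config from several group files (group and option names here are made up):

```yaml
# config.yaml: Hydra merges the chosen file from each config group at run time
defaults:
  - model: resnet      # loads model/resnet.yaml
  - optimizer: adam    # loads optimizer/adam.yaml
  - _self_             # values in this file override the groups above
```

Each group can then be swapped from the command line, e.g. `optimizer=sgd`.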
OldManSaluki t1_je5cmjw wrote
Reply to comment by keepthepace in [D] Do model weights have the same license as the model architecture? by murphwalker
I'm leaning in this direction myself.
IANAL, but I think about the Feist Publications ruling, which dealt with raw listings of facts (white-pages names, addresses, and phone numbers organized in the most functional format: alphabetical). SCOTUS ruled that the raw data was not copyrightable even though it took a lot of effort to collect and compile it. It seems to me that the raw data here are the weights, which would make them not copyrightable. The structural design of the model might be, and more than likely the compiled model with weights would be copyrightable.
I suspect this will work its way through the courts just in time to be rendered moot.
geekfolk t1_je59x39 wrote
Reply to comment by Beautiful-Gur-9456 in [P] Consistency: Diffusion in a Single Forward Pass 🚀 by Beautiful-Gur-9456
>I think it's worth a shot to replace LPIPS loss and adversarially train it as a discriminator
that would be very similar to this: https://openreview.net/forum?id=HZf7UbpWHuA
WarmSignificance1 t1_je58y3c wrote
Reply to comment by _sbmaruf in [N] OpenAI may have benchmarked GPT-4’s coding ability on its own training data by Balance-
Looks interesting. Have you tried any of the GPT models against this benchmark?
AmbitiousTour t1_je582qj wrote
Reply to comment by [deleted] in [D] I've got a Job offer but I'm scared by [deleted]
I don't know.
Firm-Act-3860 t1_je581li wrote
Reply to comment by OkWrongdoer4091 in [D] ICML 2023 Reviewer-Author Discussion by zy415
One of my reviewers increased their score by 1 just today and updated their review by editing it. So don't lose hope yet :)
impossiblefork t1_je5l8k9 wrote
Reply to [D] Do model weights have the same license as the model architecture? by murphwalker
How would either an architecture or a model be copyrightable?
Architectures are algorithms. Unless they are patentable and, in addition to that, actually patented, they have no protection.
Model weights are a result of a mechanical procedure that fits a model to data, minimising some kind of error. That is not a work of human authorship.
Things that could be copyrightable are an article describing a model architecture, or a specific software implementation of a model.
As an argument for why model weights are unlikely to be copyrightable, consider the following parallel: we know that model output, for example a story generated by ChatGPT from a prompt, is certainly not copyrightable, since it's not a work of human authorship. But then, how is the model itself? We can view the selection of training examples as something similar to a prompt, and the training process as similar to the inference. Giving copyright protection to model weights might be reasonable, but I think it's unlikely that they currently have it.