Recent comments in /f/deeplearning

vk6flab t1_jax2hd7 wrote

I'm guessing that someone got hold of the thing that you needed to submit a Google Form for and decided that they could instead distribute it via a torrent. They updated the source code as a community service to show the torrent to the next poor sod who went looking through the source to get rid of the Google Form.

But I'm not the developer and I don't know what actually happened; it just seems plausible.

3

Jaffa6 t1_javl6ef wrote

No problem.

I believe that if you're using a BERT-esque model, you do indeed need to do "full" tokenisation (part of which is creating the attention mask and padding) because BERT expects its input to be a list of token indices. For example, given the token mapping {"a": 1, "cow": 2, "cat": 3, "dog": 4}, tokenisation would turn "a cat" into [1, 3], which is the form BERT expects.
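Roughly like this, as a toy sketch (the four-token vocabulary is made up, and real BERT tokenisation also does subword splitting and adds special tokens):

```python
# Hypothetical toy vocabulary, just to illustrate the mapping step.
vocab = {"a": 1, "cow": 2, "cat": 3, "dog": 4}

def tokenise(text):
    # Map each whitespace-separated token to its index in the vocabulary.
    return [vocab[token] for token in text.split()]

print(tokenise("a cat"))  # [1, 3]
```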

And since BERT comes with a token mapping (due to pre-training), if you're just putting in your own features (say, number of likes and number of retweets), they'll quite possibly just get interpreted as random tokens if their numbers match up with known token indices.

If your features are already the right kind (tokenised text, with the resultant indices matching the correct BERT token indices), I suppose you could do truncation/padding yourself and feed that input directly to BERT.

But it'll probably end up simpler and less error-prone to let the BERT tokeniser do it for you (e.g. via HuggingFace's `AutoTokenizer.from_pretrained('bert-base-uncased')`).
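Something like this minimal sketch (the checkpoint name and example texts are just placeholders, assuming the HuggingFace `transformers` library):

```python
from transformers import AutoTokenizer

# Load the tokeniser that matches the pre-trained BERT checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

texts = ["a cat", "a cow and a dog"]
encoded = tokenizer(
    texts,
    padding=True,         # pad the shorter sequence up to the batch maximum
    truncation=True,      # cut anything longer than the model's max length
    return_tensors="pt",  # PyTorch tensors, ready to feed into BERT
)

print(encoded["input_ids"])       # token indices, including [CLS]/[SEP]
print(encoded["attention_mask"])  # 1 for real tokens, 0 for padding
```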

2

fundamental_entropy t1_jasqy64 wrote

Flan models are trained on almost every open dataset available for generic English tasks. Recent research suggests that models trained to perform multiple tasks (in fact, even the ratios of the different tasks matter; see the Flan 2022 paper) are better than models trained on only a single task. Flan-T5 beats T5 on almost every task, and Flan-T5 XXL sometimes matches GPT-3-style prompted generation.
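For a sense of what that prompted generation looks like, here's a rough sketch with HuggingFace Transformers (the prompt is illustrative; XXL is huge, so swap in "google/flan-t5-small" to try it locally):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Zero-shot prompting with an instruction-tuned Flan-T5 checkpoint.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xxl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xxl")

prompt = "Answer the following question. What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```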

3