
currentscurrents t1_j2f996k wrote

Attention maps can be a type of explanation.

They tell you what the model was looking at when it generated a word or identified an image, but they don't tell you why it looked at those bits or why it made the decision it did. You can get some useful information by looking at them, but not everything you need to explain the model.
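
As a concrete example, here's a minimal sketch of pulling attention maps out of a transformer with the Hugging Face `transformers` library (the model name and input sentence are just placeholders):

```python
# Minimal sketch: inspect what tokens a transformer attended to.
# Model name and input are placeholders, not a specific recommendation.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, shape (batch, heads, seq, seq).
# Averaging over heads in the last layer gives a rough "what was it looking at"
# map -- but it still doesn't say *why* those tokens mattered to the prediction.
last_layer = outputs.attentions[-1].mean(dim=1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for i, tok in enumerate(tokens):
    top = last_layer[i].argmax().item()
    print(f"{tok:>10} attends most to {tokens[top]}")
```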

10

currentscurrents t1_j2csenb wrote

The number of layers is a hyperparameter, and people run hyperparameter optimization to find good values for it.

Model size does seem to follow a real scaling law. It's possible that we will come up with better algorithms that work on smaller models, but it's also possible that neural networks need to be big to be useful. With billions of neurons and an even larger number of connections/parameters, the human brain is certainly a very large network.
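
As a toy example of treating depth as a hyperparameter, here's a minimal grid-search sketch in PyTorch; the synthetic data and MLP are just placeholders for a real dataset and model:

```python
# Sketch of searching over the number of layers as a hyperparameter.
# Synthetic regression data; swap in your own data/model in practice.
import torch
import torch.nn as nn

def build_mlp(num_layers, width=64, in_dim=10, out_dim=1):
    layers, dim = [], in_dim
    for _ in range(num_layers):
        layers += [nn.Linear(dim, width), nn.ReLU()]
        dim = width
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)

x_train, y_train = torch.randn(512, 10), torch.randn(512, 1)
x_val, y_val = torch.randn(128, 10), torch.randn(128, 1)

best = None
for num_layers in [1, 2, 4, 8]:          # the hyperparameter grid
    model = build_mlp(num_layers)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):                 # short training run per candidate
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x_train), y_train)
        loss.backward()
        opt.step()
    with torch.no_grad():                # score each depth on held-out data
        score = nn.functional.mse_loss(model(x_val), y_val).item()
    if best is None or score < best[1]:
        best = (num_layers, score)

print(f"best depth: {best[0]} (val loss {best[1]:.3f})")
```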

3

currentscurrents t1_j2cm36p wrote

TL;DR they want to take another language model (Google’s PaLM) and do Reinforcement Learning from Human Feedback (RLHF) on it like OpenAI did for ChatGPT.

At this point they haven't actually done it yet, since they need both compute power and human volunteers to do the training:

>Human volunteers will be employed to rank those responses from best to worst, using the rankings to create a reward model that takes the original model’s responses and sorts them in order of preference, filtering for the top answers to a given prompt.

>However, the process of aligning this model with what users want to accomplish with ChatGPT is both costly and time-consuming, as PaLM has a massive 540 billion parameters. Note that the cost of developing a text-generating model with only 1.5 billion parameters can reach up to $1.6 million.

Since it has 540b parameters, you will still need a GPU cluster to run it.
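
For a sense of what the reward-model step looks like, here's a minimal sketch of the pairwise ranking loss commonly used in RLHF; the random embeddings and tiny linear scorer are stand-ins for real language-model hidden states and a real reward head:

```python
# Sketch of the reward-model step in RLHF: learn to score a "chosen" response
# above a "rejected" one using a pairwise ranking loss. Inputs are placeholder
# embeddings of response pairs derived from human rankings.
import torch
import torch.nn as nn

reward_head = nn.Linear(768, 1)          # maps a response embedding to a scalar reward
opt = torch.optim.Adam(reward_head.parameters(), lr=1e-4)

chosen = torch.randn(32, 768)            # embeddings of preferred responses
rejected = torch.randn(32, 768)          # embeddings of worse responses

for _ in range(100):
    r_chosen = reward_head(chosen)
    r_rejected = reward_head(rejected)
    # Bradley-Terry style loss: push the chosen score above the rejected score.
    loss = -nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```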

81

currentscurrents t1_j2by81g wrote

So, if I'm understanding right:

  • Backwards chaining is an old classical algorithm for logic proving.

  • They've implemented backwards chaining using a bunch of language models, so it works well with natural text.

  • Given a knowledge base (these are available as datasets these days), it can decompose a statement and check whether it's logically consistent with that knowledge.

  • The reason they're interested in this is to use it as a training signal to make language models more accurate.

This is effectively an old "expert system" from the 70s built out of neural networks. I wonder what other classical algorithms you can implement with neural networks.

I also wonder if you could use this to create its own knowledge base from internet data. Since the internet is full of contradicting information, you would have to compare new data against existing data somehow and decide which to keep.
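
For reference, here's a toy sketch of the classical, symbolic backward chaining this builds on; their version swaps the exact-match rule lookup for language models so it works over natural text:

```python
# Toy sketch of classical backward chaining over a hand-written knowledge base,
# the 70s-style expert-system version (no language models involved).

facts = {"has_fur(cat)", "drinks_milk(cat)"}
rules = [
    # (conclusion, [premises]) -- prove the conclusion by proving every premise.
    ("mammal(cat)", ["has_fur(cat)"]),
    ("animal(cat)", ["mammal(cat)"]),
]

def backward_chain(goal):
    """Prove `goal`: either it is a known fact, or some rule concludes it
    and every premise of that rule can itself be proven."""
    if goal in facts:
        return True
    for conclusion, premises in rules:
        if conclusion == goal and all(backward_chain(p) for p in premises):
            return True
    return False

print(backward_chain("animal(cat)"))   # True: animal <- mammal <- has_fur
print(backward_chain("animal(dog)"))   # False: not supported by the knowledge base
```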

8

currentscurrents t1_j21fh9o wrote

The big thing these days is "self-supervised" learning.

You do the bulk of the training on a simpler task, like predicting missing parts of images or sentences. You don't need labels for this, and it allows the model to learn a lot about the structure of the data. Then you fine-tune the model with a small amount of labeled data for the specific task you want it to do.

Not only does this require far less labeled data, it also lets you reuse the model - you don't have to repeat the first phase of training, just the fine-tuning. You can download pretrained models on huggingface and adapt them to your specific task.
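
A minimal sketch of the fine-tuning half using the Hugging Face `transformers` library; the model name, the two example texts, and the labels are placeholders for your actual task and data:

```python
# Sketch of the fine-tuning step: the self-supervised phase (masked-word
# prediction) is already done for you in the pretrained weights; you only
# train on a small labeled set for your specific task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

texts = ["great movie, loved it", "terrible, a waste of time"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

opt = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):                        # a few passes over the tiny labeled set
    out = model(**batch, labels=labels)   # loss comes from the new classification head
    opt.zero_grad()
    out.loss.backward()
    opt.step()
```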

15

currentscurrents t1_iybz6a1 wrote

I do agree that current ML systems require much larger datasets than we would like. I doubt the typical human hears more than a million words of English in their childhood, but they know the language much better than GPT-3 does after reading billions of pages of it.

> What is holding back AI/ML is to continue to define intelligence the way Turing did back in 1950 (making machines that can pass as human)

But I don't agree with this. Nobody is seriously using the Turing test anymore, these days AI/ML is about concrete problems and specific tasks. The goal isn't to pass as human, it's to solve whatever problem is in front of you.

8

currentscurrents t1_itomb89 wrote

This is a good idea but it wouldn't completely solve the problem. There are countries with opt-out policies, and they do have higher donation rates, but the demand still exceeds the supply. This isn't going to change as long as the leading cause of death is old age.

Technology is the only answer here; xenotransplantation or organ cloning. Right now xenotransplantation is much more promising - just this year, a genetically-altered pig heart was successfully transplanted into a human. We are going to see a lot more clinical trials in the very near future.

36

currentscurrents t1_isw0ejn wrote

Most health insurance covers telehealth therapy at $0 through a partnered provider. Check your insurer's website for details.

If you don't have insurance you're SOL, but then it is America, what did you expect.

1

currentscurrents t1_isplb43 wrote

>Plus the poverty level data isn't updated according to sky high inflation either i.e. the income bracket you should fall in to be considered poor here.

This isn't an income-based measure of poverty like you'd use in the US; they're measuring access to real goods like food and cooking fuel: how many calories people are eating a day, whether they have access to clean drinking water, etc.

We may see a temporary increase in poverty over the next few years if there is a global recession, but the long-term trendline shouldn't change.

11

currentscurrents t1_ispku2f wrote

Did you read any of that at all?

>Some had huge improvements to drinking water, others to education, attendance, and years in education, others to things like electricity and cooking fuel, others in housing + assets, and most regions saw big improvements to nutrition/caloric intake and very little improvement to child mortality.

3