Recent comments in /f/singularity

gahblahblah t1_j8c9oc4 wrote

Okay. You have rejected this theory of mind test. How about rephrasing/replacing this test with your own version that doesn't suffer the flaws you describe?

I ask this because I have a theory that whenever someone posts a test result like this, there are other people who will always look for an excuse to say that nothing is shown.

11

WithoutReason1729 t1_j8c97b1 wrote

GPT-2 XL is 1.5 billion parameters. Unless they added some very computationally expensive change to this new model that's unrelated to the parameter count, this could definitely run on consumer hardware. Very very cool!
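As a rough back-of-the-envelope check on the consumer-hardware point, here is a minimal sketch that estimates how much memory the weights alone would need at the 1.5 billion parameter count mentioned above. The function name and the dtype table are illustrative assumptions. The estimate ignores activations, caches, and framework overhead, so real requirements will be higher.

```python
# Weight-only memory estimate (assumption: ignores activations, caches, overhead).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str = "fp16") -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    gpt2_xl_params = 1_500_000_000  # parameter count quoted in the comment
    for dtype in BYTES_PER_PARAM:
        gib = weight_memory_gib(gpt2_xl_params, dtype)
        print(f"{dtype}: {gib:.2f} GiB")
```

At 16-bit precision that works out to a bit under 3 GiB for the weights, which is within reach of many consumer GPUs.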

21

Superschlenz t1_j8c86vx wrote

Can't explain it yet. Will have to think the next 17 years about it.

In the meantime, I use uBlock and this custom filter rule to make plain again whatever web designers create or become:

com,org,net,edu,info,ai,io,google,de,at,uk##body,p,li,.text,.commtext,div.u-gap-small:style(font-size:16px !important; font-family:sans-serif !important; font-weight:400 !important; line-height:23px !important)

1

vtjohnhurt t1_j8c7qrv wrote

Could the AI just be rehashing a large number of similar posts that it read on r/relationships?

If there is actual insight and reasoning here, I find it flawed. Your input only states that Bob wears the shirt whenever Sandra is home. Speaking as a human, I do not conclude that he takes the shirt off as soon as she leaves.

I'm a human and I've been married twice. One time we got a dog after the wedding. The second time I had a dog before the wedding. The fact that Sandra and Bob married suggests that they both feel positive about dogs. Feelings about pets are a fundamental point of compatibility in a relationship, and probably as important as having a common interest in making babies.

−2

Hazzman t1_j8c6v7u wrote

Here's the thing - all of these capabilities already exist. It's just about plugging in the correct variants of technology together. If something like this language model is the user interface of an interaction, something like Wolfram Alpha or a medical database becomes the memory of the system.

Literally plugging in knowledge.

What we SHOULD have access to is the ability for me at home to plug in my blood results and ask the AI, "What are some ailments or conditions I am likely to suffer from in the next 15 years? How likely are they, and how can I reduce the likelihood?"

The reason we won't have access to this is 1) It isn't profitable for large corporations who WILL have access to this with YOUR information 2) Insurance. It will raise ethical issues with insurance and preexisting conditions and on that platform, they will deny the public access to these capabilities. Which is of course ass backwards.

17

sickvisionz t1_j8c4kaf wrote

Nowhere in the text does it say that Bob only wears the shirt in front of Sandra. "Great!" is assumed to be bland and unenthusiastic, but I'm not sure where that's coming from. There's no context provided for Bob's response other than an exclamation point, which generally means the opposite of bland.

To its credit, a lot of people would have thrown in their own biases and assumptions and heard what they wanted to hear as well.

2

blueSGL t1_j8c26i1 wrote

> Experimental Settings

> As the Multimodal-CoT task requires generating the reasoning chains and leveraging the vision features, we use the T5 encoder-decoder architecture (Raffel et al., 2020). Specifically, we adopt UnifiedQA (Khashabi et al., 2020) to initialize our models in the two stages because it achieves the best fine-tuning results in Lu et al. (2022a). To verify the generality of our approach across different LMs, we also employ FLAN-T5 (Chung et al., 2022) as the backbone in Section 6.3. As using image captions does not yield significant performance gains in Section 3.3, we did not use the captions. We fine-tune the models up to 20 epochs, with a learning rate of 5e-5. The maximum input sequence length is 512. The batch sizes for the base and large models are 16 and 8, respectively. Our experiments are run on 4 NVIDIA Tesla V100 32G GPUs.

So the GPUs were used in training; there is nothing to say what the system requirements will be for inference.
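For reference, the fine-tuning settings in the quoted passage can be collected into a small config object. This is only a sketch: the class name and field names are hypothetical and not taken from the paper's code. Only the values come from the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the settings quoted above."""

    model_size: str                # "base" or "large"
    batch_size: int                # 16 for base, 8 for large
    max_epochs: int = 20           # "up to 20 epochs"
    learning_rate: float = 5e-5
    max_input_length: int = 512    # maximum input sequence length
    num_training_gpus: int = 4     # 4 NVIDIA Tesla V100 32G, training only


BASE = FineTuneConfig(model_size="base", batch_size=16)
LARGE = FineTuneConfig(model_size="large", batch_size=8)

if __name__ == "__main__":
    for cfg in (BASE, LARGE):
        print(cfg)
```

Note that every field here describes training; none of it pins down what inference would need.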

25

el_chaquiste t1_j8c1z83 wrote

If I understand correctly, the input set (a science exam with solved exercises and detailed responses) is smaller than GPT3.5's own, but the model outperforms GPT3.5 and humans by some percent at solving problems similar to those from said exam, and by more if its training is multimodal and includes visual data.

I honestly don't know if we should get overly excited over this or not, but it seems like it would allow the creation of smaller models focused on some scientific and technical domains, with better accuracy in their responses than generalist LLMs.

33