Recent comments in /f/deeplearning

dualmindblade t1_ivbs4ra wrote

By the other side, I meant the other side of the board, but let's explore your ideas a bit in the context of board game algorithms. In the case of the AlphaZero algorithm, the opponent is itself. The neural network part of AlphaZero acts as a sort of intuition engine, and it's trying to intuit two related but actually different things: 1) the value of a particular move, how good or bad it is, and 2) which move AlphaZero itself will be likely to choose after it has thought about it for a long time. By thinking, I mean running many, many simulated games from the current position, making random moves probabilistically weighted by those intuitions. This is the novel idea of the algorithm, and it allows it to drastically magnify the amount of data used to train the neural network. Instead of having to play an entire game to get one tiny bit of feedback, it gets a signal for every possible move every turn; the network weights are updated based on how well it predicts its own behavior. There's growing evidence that animal brains do something similar; this is called the predictive processing model of cognition.

Anyway, I want to point out that this very much seems like a theory of mind, except it's a theory not of another mind but of its own. BTW, AlphaZero becomes, after training, ridiculously good not only at predicting its own behavior but at predicting the value of a move. The Go-playing version can beat all but the very best professional players without doing any tree search whatsoever, in other words making moves using only a single pass through the NN part of the architecture (the intuition) and not looking even one move ahead. Likewise, it is remarkably accurate, though not perfectly so, at predicting its final decision after searching the game tree, so its conception of self is accurate.
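To make the "weighted by intuition" part concrete, here's a rough sketch of the PUCT-style rule that AlphaZero-like engines use to decide which move to simulate next, and of the visit-count target that trains the network to predict its own post-search behavior. Plain numpy, and all names and constants are my own for illustration, not taken from the papers:

```python
import numpy as np

def select_child(priors, visit_counts, q_values, c_puct=1.5):
    """Pick the next move to simulate: exploit good average outcomes (Q),
    but explore moves the 'intuition' (policy prior) likes."""
    total_visits = visit_counts.sum()
    ucb = q_values + c_puct * priors * np.sqrt(total_visits + 1) / (1 + visit_counts)
    return int(np.argmax(ucb))

def policy_target(visit_counts, temperature=1.0):
    """After the search, the visit-count distribution becomes the training
    target for the policy head -- the net learns to predict its own
    considered behavior."""
    counts = visit_counts ** (1.0 / temperature)
    return counts / counts.sum()

# toy example: 4 legal moves
priors = np.array([0.5, 0.3, 0.15, 0.05])   # policy head ("intuition")
visits = np.array([40, 30, 20, 10], float)   # simulations spent on each move
q = np.array([0.1, 0.05, -0.2, 0.0])         # average simulated outcomes
print(select_child(priors, visits, q))       # which move to expand next
print(policy_target(visits))                 # training target for the policy head
```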

Now there's another game-playing engine called Maia; this is designed not to beat humans but to play like they do, and it's quite good at this. It can imitate the play of very good amateurs all the way up to professionals. There's absolutely no reason this couldn't be integrated into the AlphaZero algorithm, providing it with not only a theory of its own mind but also a theory of a (generic) human player's. And if you don't like the generic part, there are engines fine-tuned on single humans, usually professional players with a lot of games in the database. So basically, yes, they are stimulus-react models, and they always will be, but they're complicated ones where the majority of the stimulus is generated internally, and humans probably are too. And they are capable even today of having a theory of mind by any reasonable definition of what that means.

1

sckuzzle t1_ivbjs9d wrote

If I were to approach this, I'd train them at the same time. You have two models - one for each side - each with its own reward function. Then you'd train them in parallel, playing against each other as they go; see the sketch below.

It's a bit of a challenge because you can only train each one relative to the strength of the other - so you need them both to get "smarter" in order to continue their training. But that's no different from a model that trains against itself.
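As a minimal sketch of that parallel setup - using matching pennies as a stand-in for the real game and a REINFORCE-style update purely for illustration, none of which comes from the thread:

```python
import numpy as np

rng = np.random.default_rng(0)
# Each side keeps its own preferences (logits) over its two actions.
logits_a = np.zeros(2)   # model for side A
logits_b = np.zeros(2)   # model for side B
lr = 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(5000):
    pa, pb = softmax(logits_a), softmax(logits_b)
    act_a = rng.choice(2, p=pa)
    act_b = rng.choice(2, p=pb)
    # Zero-sum rewards: A wants to match, B wants to mismatch.
    r_a = 1.0 if act_a == act_b else -1.0
    r_b = -r_a
    # Each side gets its own policy-gradient update, trained in parallel.
    grad_a = -pa; grad_a[act_a] += 1.0
    grad_b = -pb; grad_b[act_b] += 1.0
    logits_a += lr * r_a * grad_a
    logits_b += lr * r_b * grad_b

# Both tend to hover around the mixed ~50/50 strategy of this game.
print(softmax(logits_a), softmax(logits_b))
```

The point of the toy: each model only improves relative to the other, so their strengths have to rise together, exactly the coupling described above.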

2

sckuzzle t1_ivbiskk wrote

> so it must understand both strategies about equally regardless of which side it's playing

What do you mean here by "understand"? My understanding is that the state-of-the-art AI has no concept of what the capabilities of its opponent are or even what its opponent might be thinking; it only understands how to react in order to maximize a score.

So while you could train it to react well no matter which side it is playing, how would it benefit from being able to play the other side better? It would need to spin up a duplicate of itself to play the other side and then analyze itself to understand what is happening, but then it would just get into an infinite loop as its duplicate spins up a duplicate of its own.

I guess what I'm getting at is that these AI algorithms have no theory of mind. They are simple stimulus-react models. Even the concept of an opposing player is beyond them - it'd be the same whether it was playing solitaire or chess.

1

InfuriatinglyOpaque t1_ivb9otw wrote

In terms of papers using relevant tasks, the closest example I can think of might be the "hide and seek" paradigm used by Weihs et al. (2019), which I include in my list below (not my area of expertise - so there could easily be far more relevant papers out there that I'm not aware of). I wouldn't be the least bit surprised though if there were lots of relevant ideas that you could take from prior works using symmetric games as well, so I've also included a wide variety of other papers that all fall under the broad umbrella of modeling game-learning/game-behavior.

References

Aggarwal, P., & Dutt, V. (2020). Role of information about opponent's actions and intrusion-detection alerts on cyber-decisions in cybersecurity games. Cyber Security. https://www.ingentaconnect.com/content/hsp/jcs/2020/00000003/00000004/art00008

Amiranashvili, A., Dorka, N., Burgard, W., Koltun, V., & Brox, T. (2020). Scaling Imitation Learning in Minecraft. http://arxiv.org/abs/2007.02701

Bramlage, L., & Cortese, A. (2021). Generalized Attention-Weighted Reinforcement Learning. Neural Networks. https://doi.org/10.1016/j.neunet.2021.09.023

Frey, S., & Goldstone, R. L. (2013). Cyclic Game Dynamics Driven by Iterated Reasoning. PLoS ONE, 8(2), e56416. https://doi.org/10.1371/journal.pone.0056416

Guennouni, I., & Speekenbrink, M. (n.d.). Transfer of Learned Opponent Models in Zero Sum Games.

Hawkins, R. D., Frank, M. C., & Goodman, N. D. (2020). Characterizing the dynamics of learning in repeated reference games. Cognitive Science, 44(6), e12845. http://arxiv.org/abs/1912.07199

Kumaran, V., Mott, B. W., & Lester, J. C. (2019). Generating Game Levels for Multiple Distinct Games with a Common Latent Space. https://ojs.aaai.org/index.php/AIIDE/article/view/7418

Lampinen, A. K., & McClelland, J. L. (2020). Transforming task representations to perform novel tasks. Proceedings of the National Academy of Sciences, 117(52), 32970–32981. https://doi.org/10.1073/pnas.2008852117

Lensberg, T., & Schenk-Hoppé, K. R. (2021). Cold play: Learning across bimatrix games. Journal of Economic Behavior & Organization, 185, 419–441. https://doi.org/10.1016/j.jebo.2021.02.027

Schwarzer, M., Rajkumar, N., Noukhovitch, M., Anand, A., Charlin, L., Hjelm, D., Bachman, P., & Courville, A. (2021). Pretraining Representations for Data-Efficient Reinforcement Learning. http://arxiv.org/abs/2106.04799

Sibert, C., Gray, W. D., & Lindstedt, J. K. (2017). Interrogating Feature Learning Models to Discover Insights Into the Development of Human Expertise in a Real-Time, Dynamic Decision-Making Task. Topics in Cognitive Science, 9(2), 374–394. https://doi.org/10.1111/tops.12225

Spiliopoulos, L. (2013). Beyond fictitious play beliefs: Incorporating pattern recognition and similarity matching. Games and Economic Behavior, 81, 69–85. https://doi.org/10.1016/j.geb.2013.04.005

Spiliopoulos, L. (2015). Transfer of conflict and cooperation from experienced games to new games: A connectionist model of learning. Frontiers in Neuroscience, 9. https://doi.org/10.3389/fnins.2015.00102

Stanić, A., Tang, Y., Ha, D., & Schmidhuber, J. (2022). Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter (No. arXiv:2208.03374). arXiv. http://arxiv.org/abs/2208.03374

Tsividis, P. A., Loula, J., Burga, J., Foss, N., Campero, A., Pouncy, T., Gershman, S. J., & Tenenbaum, J. B. (2021). Human-Level Reinforcement Learning through Theory-Based Modeling, Exploration, and Planning. http://arxiv.org/abs/2107.12544

Weihs, L., Kembhavi, A., Han, W., Herrasti, A., Kolve, E., Schwenk, D., Mottaghi, R., & Farhadi, A. (2019). Artificial Agents Learn Flexible Visual Representations by Playing a Hiding Game. http://arxiv.org/abs/1912.08195

Zheng, Z. (Sam), Lin, X. (Daisy), Topping, J., & Ma, W. J. (2022). Comparing Machine and Human Learning in a Planning Task of Intermediate Complexity. Proceedings of the Annual Meeting of the Cognitive Science Society, 44(44). https://escholarship.org/uc/item/8wm748d8

3

arhetorical t1_ivb2sl1 wrote

I haven't tried doing that, but if it's a similar resource requirement to prototyping (like if you'll be working with a pretrained model, not training one) then it should be fine. Again though, the biggest factor is whether you like it and whether it works for you - since you bought a laptop instead of a workstation, you must have had a very good reason for needing one, and none of us can answer that question for you. If you're not training, then as long as your stuff fits in memory the specs don't matter that much.

1

dualmindblade t1_iva9p3g wrote

I would suggest reading the original AlphaGo paper, it's extremely digestible, then skimming the AlphaZero one; there's less detail there because it's a very similar architecture, and actually simpler than the original. Think of AlphaZero as a scheme for improving the loss function; the actual architecture of the NN part is sort of unimportant. You can think of it as a black box, or maybe a black box with two smaller boxes sticking out of it.
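For a picture of that black box with two smaller boxes sticking out, here's a toy PyTorch sketch of a shared trunk with a policy head and a value head. The sizes and layer choices are made up; only the two-headed shape is the point:

```python
import torch
import torch.nn as nn

class TwoHeadedNet(nn.Module):
    def __init__(self, board_size=19, channels=64, n_moves=19 * 19 + 1):
        super().__init__()
        # the big black box: a shared trunk over the board planes
        self.trunk = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        flat = channels * board_size * board_size
        # the two smaller boxes sticking out of it
        self.policy_head = nn.Linear(flat, n_moves)  # "which move will I pick?"
        self.value_head = nn.Linear(flat, 1)         # "how good is this position?"

    def forward(self, board_planes):
        h = self.trunk(board_planes)
        return self.policy_head(h), torch.tanh(self.value_head(h))

net = TwoHeadedNet()
policy_logits, value = net(torch.zeros(1, 3, 19, 19))
print(policy_logits.shape, value.shape)  # torch.Size([1, 362]) torch.Size([1, 1])
```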

3

computing_professor OP t1_iva7s6f wrote

Cool, thanks for the reply. With chess, I always assumed it was just examining the state as a pair (board,turn), regardless of who went first. I study the mathematics of combinatorial games and it's rare to ever consider who moves first, as it's almost always more interesting to determine the best move for any given game state.

Do you have any reading suggestions for understanding AlphaZero? I've read surface level/popular articles, but I'm a mathematician and would like to dig deeper into it. And, of course, learn how to apply it in my case.

3

dualmindblade t1_iva5ncr wrote

Disclaimer: not even close to an expert, I just keep up with the state of the field

If you were using something like the AlphaZero algorithm, I'm fairly certain the asymmetry is not an issue; it would work unmodified. I also don't think you'd want to use two models, as that would weaken the play. The argument is that the NN part is trying to intuit the properties of a big tree search in which both players are participants, so it must understand both strategies about equally regardless of which side it's playing. It's no different from a human player: when you make a move, the next step is to consider the resulting position from the other side and evaluate their potential moves. BTW, chess is not super symmetric in practice; usually black will need to adopt a defensive strategy in the opening.
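One common way to let a single model play both sides (a generic sketch of the usual encoding trick, not anything specific to your game) is to feed it the state as the pair (board, turn): the piece planes plus a constant plane saying whose move it is, so the same network evaluates every position for whichever side is playing:

```python
import numpy as np

def encode_state(board_planes, player_to_move):
    """Stack a constant 'side to move' plane onto the piece planes.
    board_planes: (n_piece_types, H, W) binary planes; player_to_move: +1 or -1."""
    turn_plane = np.full((1,) + board_planes.shape[1:], float(player_to_move))
    return np.concatenate([board_planes, turn_plane], axis=0)

planes = np.zeros((2, 3, 3))           # e.g. one plane per side's pieces
planes[0, 1, 1] = 1                     # side A has a piece in the middle
print(encode_state(planes, +1).shape)   # (3, 3, 3): pieces + whose turn it is
```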

4

Zer01123 t1_iv9uq1w wrote

For what you want it to do, it probably has more than enough performance. Maybe even overkill if you just want to play around with deep learning.

Remember that depending on what you want to do, you might need a big SSD for all the datasets, and laptops usually come with quite small ones in the default configuration.

1

macORnvidia OP t1_iv8ssle wrote

>An external GPU will just make your setup less portable without actually giving you the performance of a workstation

Can you please elaborate?

Also, as for training, I get it - I can't really train deep learning models - but what about optimizing machine learning models using pyCUDA?

1

arhetorical t1_iv8nb2t wrote

The only thing that matters is whether you like it. The specs really don't matter that much. Either you'll be prototyping your model, in which case you'll just be training for an epoch or two and better specs will only save you a little bit of time, or you'll be training it for real, in which case a laptop is not going to cut it. An external GPU will just make your setup less portable without actually giving you the performance of a workstation.

1