Concepts Discovered by AlphaZero
Looking at a paper from DeepMind about concept discovery and transfer
The most fascinating part for me about neural network engines like AlphaZero and Leela Chess Zero is that they learn chess from scratch. This means that they don't have human biases and might "see" chess very differently than we do.
So when I saw that researchers from Google DeepMind published a paper last year titled "Bridging the Human–AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero", I had to take a closer look at it.
As the title suggests, the authors first extract chess concepts from AlphaZero. Then they search for positions where certain concepts are present and try to teach them to grandmasters. They worked with grandmasters Vladimir Kramnik, Dommaraju Gukesh, Hou Yifan, and Maxime Vachier-Lagrave.
Concept Discovery
The first step is to figure out when a concept is present in a position. To do this, the authors used two datasets of positions for each concept: one with positions where the concept is present and one with positions where it is absent. The concepts they used were the presence of certain piece types, the concepts built into Stockfish (like open files and space), concepts from the Strategic Test Suite, and different openings.
Using these datasets, they then extracted vectors that represent the concept in the neural network of AlphaZero. After they had a representation for existing human concepts, the authors extracted concepts from AlphaZero which were new.
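The paper uses a convex-optimisation formulation for this step; as a rough illustration of the idea, a concept vector can be approximated by a linear probe trained to separate the two datasets in activation space. Everything below (the probe, the toy activations) is my own sketch, not the authors' code:

```python
import numpy as np

def concept_vector(acts_present, acts_absent):
    """Fit a linear probe separating activations where the concept is
    present from activations where it is absent; the learned weight
    direction serves as the concept vector."""
    X = np.vstack([acts_present, acts_absent])
    y = np.concatenate([np.ones(len(acts_present)), np.zeros(len(acts_absent))])
    # Plain logistic regression via gradient descent (a stand-in for the
    # paper's actual optimisation formulation).
    w = np.zeros(X.shape[1])
    for _ in range(500):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= 0.1 * X.T @ (p - y) / len(y)
    return w / np.linalg.norm(w)  # unit-norm concept direction

# Toy activations: 64-dimensional, with the concept shifting the mean
# along a single axis (dimension 3).
rng = np.random.default_rng(0)
present = rng.normal(0, 1, (200, 64))
present[:, 3] += 2.0
absent = rng.normal(0, 1, (200, 64))
v = concept_vector(present, absent)
print(round(abs(v[3]), 2))  # the probe should weight dimension 3 heavily
```

The direction `v` then lets you score any new position for the concept by projecting its activations onto it.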
To do this, they looked at games played by AlphaZero and chose positions where an older version of AlphaZero (rated 75 Elo points lower) chose a different move than the current version. This was done in order to ensure that the concepts are learned at a late stage in training and therefore more complex.
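In code terms, this selection step boils down to keeping positions where the two versions disagree. The `best_move` interface and the table-driven toy engines below are made up purely for illustration:

```python
def novel_positions(positions, strong_engine, weak_engine):
    """Keep positions where the current (stronger) network chooses a
    different move than the older, 75-Elo-weaker network -- a proxy for
    knowledge acquired late in training."""
    return [pos for pos in positions
            if strong_engine.best_move(pos) != weak_engine.best_move(pos)]

# Tiny stand-in engines that map positions to moves via a lookup table.
class TableEngine:
    def __init__(self, table):
        self.table = table
    def best_move(self, pos):
        return self.table[pos]

strong = TableEngine({"p1": "Bg5", "p2": "O-O", "p3": "hxg4"})
weak = TableEngine({"p1": "Be3", "p2": "O-O", "p3": "Be2"})
print(novel_positions(["p1", "p2", "p3"], strong, weak))  # ['p1', 'p3']
```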
To understand the new concepts from AlphaZero's games, they built a graph showing how each new concept relates to human-understandable concepts. As an example, take a look at the following two positions, in which the same concept occurs:
The authors show a part of the concept graph to illustrate how this concept is related to human-understandable concepts:
The green concepts are from the Strategic Test Suite and the purple concepts are from Stockfish. The AlphaZero concept from the positions above has a strong relationship with the recapture concept and the space advantage concept (for White).
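A minimal stand-in for such a graph could score, for each human concept, how often it co-occurs with the new AlphaZero concept across a set of positions, keeping only strong edges. The concept labels below are hypothetical, and the paper's actual construction is more involved:

```python
def related_concepts(new_concept_positions, human_concept_labels, threshold=0.5):
    """For each human concept, compute the fraction of positions carrying
    the new AlphaZero concept in which that human concept is also present,
    keeping only edges above a threshold."""
    edges = {}
    for concept, positions in human_concept_labels.items():
        overlap = len(new_concept_positions & positions) / len(new_concept_positions)
        if overlap >= threshold:
            edges[concept] = round(overlap, 2)
    return edges

# Hypothetical labels: which position IDs exhibit each human concept.
az_positions = {1, 2, 3, 4}
human = {
    "recapture": {1, 2, 3},        # Strategic Test Suite concept
    "space_advantage": {2, 3, 4},  # Stockfish concept
    "open_file": {9, 10},
}
print(related_concepts(az_positions, human))
```

Here the new concept would get edges to "recapture" and "space_advantage" but not to "open_file", mirroring the kind of relationships shown in the graph.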
Finally, they filtered the concepts to include only those deemed teachable and novel.
Teaching the Concepts
To see whether the concepts are teachable to the grandmasters, the authors first sent the grandmasters positions and asked them to share their thoughts and give the move they would play. Then the GMs received the solutions from AlphaZero, which contained the main line and a few relevant variations. Finally, the players got more positions with similar concepts, and their performance was compared to the first set of positions. In total, each player saw 36 to 48 chess puzzles.
Each player performed better on the second set of positions, which might suggest that they learned the concepts from AlphaZero.
However, I'm unsure how much actual learning of the concepts occurred. Take the following example provided in the paper (which I already showed above):
Here, 9.Be3 would be a natural move, but AlphaZero wanted to play 9.Bg5 to induce the weakening move 9...h6. After 9.Be3 Nxf3+ 10.Qxf3 Nh5 followed by 11...f5, Black has a good position, but the inclusion of 9...h6 makes the f5-advance less strong. Another justification for the inclusion of 9.Bg5 is the following line:
9.Bg5 h6 10.Be3 O-O 11.Nxd4 exd4 12.Qxd4 Ng4 13.hxg4!!
Compare this to the same variation without the inclusion of Bg5 and h6:
There is only a small difference in the position, but White has the advantage with the pawn on h6, whereas Black is much better with the pawn on h7.
I think that humans often know a concept and spot it in a position, but might not play the move because they miss a critical move somewhere in the variations. Also, weighing up different ideas is often more difficult than finding the ideas in the first place. So the reason the grandmasters missed a certain move might have more to do with their lower calculation ability compared to AlphaZero than with their knowledge of the concepts.
Conclusion
The paper was very fascinating to me. Especially the concept discovery was interesting and, in my mind, opens many possibilities. It would be amazing if engines could identify the various concepts in a given position. This would help with classifying positions and make it easier to find the strengths and weaknesses of a player. Moreover, players could work on their weaknesses more easily, since positions with the relevant concepts present could be found quickly and studied.
The authors also included some more examples in the paper; some of the positions and variations are simply amazing. Let me know what you think about the paper.
I find the boards have better proportions here (a minor comment). Thanks for your work and your choice of subjects. This is a long haul for me; I have other papers from them that I need to read, too.
Calculation ability of A0? I read backward: did the conclusion first, agreed, and am now going up.
So, I also wonder about the difference. But I am not sure calculation is where the machine has no limitations. My inkling was more about the training of the evaluation probability function itself (or its policy, if it still needs one beyond a one-node policy to beat other engines): perhaps we simply cannot build such long-term association statistics with that kind of memory and exactitude. Although now I wonder what our subconscious brain does capture, over different time controls, about all the positions and outcomes of many games. That we don't consciously recall them, does that mean the less conscious observer that such ML experiments are modeling is leaky too?
Which might be what you meant about calculation ability: the statistics from outcomes over whole games, many positions across many games. That we consciously need to build chunks as surrogate stepping stones for plans does not necessarily mean our statistical brain is not doing similar long-term association.
Now I am having a counter-thought again; sorry, I think through self-debate, which I guess is okay if I warn you.
Even if A0 is learning all its layers, including the transformations needed from the input layer (which we ourselves also learn from), even a bad transformation of the position for the deeper layers' feature processing can carry through to the last dense layer, which cannot transform much but can discriminate given a well-transformed input (my narrative of ideal NN learning in such convolutional motifs, without worrying about activation functions or inter-motif cleverness, about which I am ignorant). There is a certain division of labor (there are papers about such classes of NN and certain tasks that might support this story). My point is that we might have more quickly fading short-term memory for the specifics of positions, on top of the long-term learning problem (game-length associations). So my point about not knowing our subconscious ability for whole-game associations might be "trumped": if the game was a blur, our subconscious might not fix the blur, so we might need a lot more training. Which might be in line with your comment.