I understand the automatic goal as a way to map out the feature space of games. You have augmented the usual single dimension, the move-quality ceiling inherited from engine-pool Elo and inferred back onto any position from the game-outcome populations of those Elo-rated tools, with a time-control dimension (but how much of that is board chess?), so not an increase of information about the core game made of position sequences. And you have augmented it with the dimension of sharpness, as previously defined, which uses more information about game continuations than the mere best move (or moves, when several are equally best).
I think we are already missing a lot of automatic analysis dimensions at the mere position level, then at the level of position pairs, and then of whole games.
But you are asking the questions. There might be more information to include, valid across many games, that could eventually shed light automatically on any single game. The territory beyond a one-dimensional rating measure should be rather vast.
That one can't yet make sense of a new dimension like average sharpness change clocked by the ply should not be disappointing. I noted that it was computed as a differential average: the slope of sharpness over each pair of consecutive positions, averaged over the whole sequence. Human behavior may be a bit volatile over the duration of a game, so perhaps some windowed average along the game, sitting between the per-position sharpness and the whole-game average, would help (a minimal sketch follows below). Perhaps we need to find more dimensions, but we could also dig into the definition/interpretation of sharpness via its defining constituents.
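To make the windowed idea concrete, here is a minimal sketch of what I mean (my own illustration, not your code): given a per-position sharpness series, it computes the per-ply differential average and a rolling-window version; the window size is an arbitrary assumption.

```python
# Minimal sketch (not the post's actual code): given a per-position
# sharpness series, compare the whole-game differential average with
# a rolling-window version that could expose within-game volatility.

def sharpness_slopes(sharpness):
    """Per-ply sharpness change: slope over each consecutive position pair."""
    return [b - a for a, b in zip(sharpness, sharpness[1:])]

def whole_game_average(sharpness):
    """The single number discussed above: mean slope over the whole game."""
    slopes = sharpness_slopes(sharpness)
    return sum(slopes) / len(slopes)

def windowed_average(sharpness, window=8):
    """Intermediate view: mean slope over a sliding window of plies.

    The window of 8 plies is an arbitrary choice for illustration.
    """
    slopes = sharpness_slopes(sharpness)
    return [
        sum(slopes[i:i + window]) / window
        for i in range(len(slopes) - window + 1)
    ]

# Toy usage with a made-up sharpness series:
series = [0.2, 0.3, 0.5, 0.4, 0.7, 0.9, 0.6, 0.5, 0.8, 1.0]
print(whole_game_average(series))
print(windowed_average(series, window=4))
```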
But I wonder how saturated, high-quality human chess relates to the foundation of the sharpness definition. It comes from Leela's more generalist learned static evaluation function, which, in spite of the dilemma of reinforcement learning, might still retain a less tunnel-visioned world view of chess positions; yet it had to compromise on exploration as part of its glorious trajectory.
It is not clear to me what that RL compromise becomes, quantitatively, once they stop improving the engine: over the whole combined set of RL training batches, from when it was 100% generalist, with a uniform policy over legal chess moves, to whatever its satisfying expert level was. What is left, in its static evaluation, of that fully exploratory birth? (Uniform bias = no bias.)
What does the WDL triplet (or pair of independent values) mean for a position? It is about the continuations from that position given a certain "tightness" of play on either side (basically assumed the same for both, but which?). Perhaps, being in the fog of that two-pronged RL pressure (exploration and exploitation, i.e. expertise gathering), one might use engines outside their Elo-optimal conditions: go back to the statistics of the networks along their learning trajectories and take their WDL points of view, or force SF to the maximum of its MultiPV throughput.
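To ground what I mean by a "WDL point of view" and MultiPV, here is a minimal python-chess sketch (my illustration, not your method; the engine binary paths and search limits are assumptions):

```python
# Minimal sketch (my illustration) using python-chess.
# Engine binary paths are assumptions; adjust to your setup.
import chess
import chess.engine

board = chess.Board()  # any position of interest

# WDL point of view: ask an engine that reports win/draw/loss statistics.
with chess.engine.SimpleEngine.popen_uci("/usr/local/bin/lc0") as engine:
    info = engine.analyse(board, chess.engine.Limit(nodes=10_000))
    wdl = info.get("wdl")  # a PovWdl, if the engine reports one
    if wdl is not None:
        w = wdl.white()
        print(f"W/D/L (white): {w.wins}/{w.draws}/{w.losses}")

# MultiPV throughput: ask Stockfish for many principal variations at once,
# one cheap way to see more continuations than the single best move.
with chess.engine.SimpleEngine.popen_uci("/usr/local/bin/stockfish") as engine:
    engine.configure({"UCI_ShowWDL": True})
    infos = engine.analyse(board, chess.engine.Limit(depth=18), multipv=10)
    for line in infos:
        pv = line.get("pv", [])
        print(line["score"].white(), line.get("wdl"), pv[:3])
```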
But I think the RL dilemma might also apply to humans. The E. effect: at some point one makes narrowing choices where one's small (but famous, and the best of us) brain needs to perform, and given a single human life span (and a small brain, like all of us), expertise comes at a cost. Top players might always be treading on sharper continuations than whatever the mitigated evaluation function resulting from LC0's many training batches still represents. I might still not be clear on the interpretation of sharpness as previously defined, or on the definition itself. I mean this as my own problem of understanding, not a critique at all.
I am perhaps missing a point, and I don't know where this leads. Just asking. Please do point out any misunderstandings or misconceptions I might have uttered.
Could you make an Accuracy distribution graph but cumulative instead?
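In case it helps to specify what I mean by cumulative, here is a minimal matplotlib sketch (my own, with made-up data) that turns per-game accuracy values into an empirical cumulative distribution instead of a histogram:

```python
# Minimal sketch (made-up data): empirical cumulative distribution of
# per-game accuracy values, as an alternative to the histogram view.
import numpy as np
import matplotlib.pyplot as plt

accuracies = np.random.default_rng(0).normal(85, 7, size=500).clip(0, 100)

xs = np.sort(accuracies)
ys = np.arange(1, len(xs) + 1) / len(xs)  # fraction of games <= accuracy x

plt.step(xs, ys, where="post")
plt.xlabel("Accuracy (%)")
plt.ylabel("Cumulative fraction of games")
plt.title("Cumulative accuracy distribution")
plt.show()
```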