Calculating the Sharpness of Different…

Apr 19, 2024

How well does the sharpness score agree with the human intuition about sharpness?

6 Comments

Aug 26, 2024

I have a basic question upstream of many of your articles; it may be very naive. I have seen many conversion curves by now, from Lichess, to LC0, to SF (and Maia). I may not have dug very deep, but each time I wonder what they mean by a centipawn axis value being associated with certain odds.

I understand the odds already, perhaps as outcome statistics, but for me the natural lowest-level association is to a game. Eventually, perhaps, odds for player pairs might be obtained.

But in those places, which use either pawns or some derived measure called position difficulty, I wonder how the positions of a full game, each with a different possible centipawn value, are being integrated over.

Here is an attempt. For every position in a pool of games (some set of events for OTB, or some period for online play, 24h/24h), one can collect all games that visit that position (as a FEN with a given depth, or a depth-agnostic FEN, such as just EPDs). The game outcome is then part of the estimator data being integrated by the statistics shown on those curves. I don't know why I need to have that spelled out. Maybe this is not even what it is. Can you help? Sorry to barge in here.
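A minimal sketch of the estimator described above, assuming each visited position has already been reduced to a pair of (engine evaluation in centipawns, final game result from White's point of view). The function name `empirical_score_by_eval`, the bin width, and the sample data are hypothetical and only illustrate the idea; this is not necessarily how Lichess, SF, or LC0 build their curves.

```python
from collections import defaultdict


def empirical_score_by_eval(samples, bin_width=100):
    """Average game result per centipawn bin.

    samples: iterable of (centipawns, outcome) pairs, where outcome is
             1.0 (White win), 0.5 (draw) or 0.0 (White loss).
    Returns a dict mapping the lower edge of each bin to the mean outcome
    of all positions whose evaluation falls in that bin.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bin -> [sum of outcomes, count]
    for centipawns, outcome in samples:
        # Floor division keeps negative evaluations in the correct bin.
        bin_start = (centipawns // bin_width) * bin_width
        totals[bin_start][0] += outcome
        totals[bin_start][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}


# Hypothetical data: four positions taken from different games.
samples = [(30, 1.0), (50, 0.5), (120, 1.0), (-40, 0.0)]
curve = empirical_score_by_eval(samples)
```

Plotting the mean outcome against the bin edges would give one empirical "centipawns to expected score" curve for that pool of games; a different pool (other ratings, other time controls) would give a different curve.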

Julian

Aug 28, 2024

I mostly use the odds from LC0, which the engine uses internally, so there isn't any conversion from centipawns.

As for other conversions, I think that they often work in different ways. The one SF uses is based on SF games as far as I know, so the odds don't consider human error.

Lichess uses a conversion from centipawns to win percentage which I also sometimes use. They wrote that it's calibrated for games of 2300-rated players, but I don't know how they arrived at that formula. Here is their article if you're interested: https://lichess.org/page/accuracy
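For reference, a minimal sketch of that conversion, assuming the formula and constant are as I recall them from the linked Lichess page (worth checking there before relying on them). The function name `lichess_win_percent` is only illustrative.

```python
import math


def lichess_win_percent(centipawns):
    """Win percentage (0 to 100) for the side to which the evaluation refers.

    Assumption: formula and constant as recalled from the Lichess accuracy page:
    Win% = 50 + 50 * (2 / (1 + exp(-0.00368208 * centipawns)) - 1)
    """
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * centipawns)) - 1)


# An equal position maps to 50%, and the curve is symmetric around it.
equal = lichess_win_percent(0)
plus_one_pawn = lichess_win_percent(100)
minus_one_pawn = lichess_win_percent(-100)
```

The curve is a logistic function of the evaluation, so it flattens out for large advantages: going from +5 to +6 pawns changes the win percentage far less than going from 0 to +1.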

dboing dboing

Sep 2, 2024 (edited)

Thanks. I think I should dig into each case for the specifics of the position-axis definition with respect to the data sample "matrix". (I don't know how to refer to the whole sample in its exact form, the one that would make sense of the many plots I have seen about it. I only have the understanding that makes sense for outcome odds from an A0/LC0 point of view, but I get lost in the conversion plots, and short of going trekking through source code, which is not my cup of tea, I was looking for some higher-level understanding.) It might vary in each case: Lichess, SF (internal evaluation dynamics to WDL scale, a more calibrated scale which would allow an evolving SF to output the same outermost score gamut, unless I am guessing this wrong too, from crumbs), LC0 compliance with UCI curves, or Maia curves.

I am fine with your answer though, as this is the wrong place to ask about it. I just thought that by some chance you might have seen the same question addressed in some non-source-code text available somewhere that I might have missed. But one does not need to know that to do the other constructs that you are doing, and that I appreciate.

Anlam Kuyusu

Apr 19, 2024

This really ought to be a comment on one of the previous posts, but I am not sure draw percentage is an indication of sharpness.

A really imbalanced position like Q vs 3 pieces or Q vs 2 rooks could really imply a decisive result. But this doesn’t mean the positions in the game are sharp.

For me, sharpness means something like walking in a minefield: unless you find the only one or two moves, your position becomes lost or your winning advantage disappears.

According to this definition of sharpness, a position could really be sharp in that unless White makes an insane computer move, Black comfortably draws. Despite this, the most likely human result is a draw.

Julian

Apr 19, 2024

I used the term sharpness since I liked the sound of it and it's a commonly used term. Maybe saying that the score measures how complicated a position is might be a bit more accurate, but then the question comes up where the exact difference between a sharp and a complicated position lies.

But the situation you described, a position where both sides have to play very accurately, should have a very high sharpness score compared to a strategically complex position, since LC0's draw rate will be much higher when a position is strategically complicated but not very tactical.

In the end, I try to capture human concepts with chess engines, so there will always be a lot of room for discussion about how accurately these numbers describe a specific human concept. Talking about the complexity of a position might come with its own problems. One might argue that the starting position is very complex since each side has many different options, and that a tactical position with a narrow path for both sides isn't as complex since it only requires accurate calculation.

Anlam Kuyusu

Apr 19, 2024 (edited)

"But the situation you described"

Which situation are you talking about? If you are talking about the case where White has to make an inhuman computer move or it's a draw, then I think LC0 will classify it as probably a draw.

Take the recent Nakamura-Nepo game from the Candidates. Nakamura played a novelty of sorts in the Petroff Defense and caught Nepo off guard. For several moves, Nepo had to find the only one or two accurate moves or else lose. He did, and the game ended in a draw. It was a sharp struggle AFAIK, but it ended in a draw.

Regardless, I think "minefield" positions (for lack of a better word, and to avoid using the word sharp), where each side has to make one or two accurate moves or else lose their advantage or the game, usually lead to wins or losses. So in most cases the two interpretations should be the same.

And I may not have a full grasp of LC0's win-draw-lose probabilities - I'm considering these from a more human perspective (i.e. what would a human think the chance of winning/losing/drawing is).

Chess Engine Lab
