10-15-2017, 09:18 AM
fromalabama
Junior Member
Join Date: Dec 2014
Posts: 7

Originally Posted by uigrad
Yeah, if you pick a single puzzle size and a single puzzle difficulty, there are probably almost 1000 puzzles there, so getting the exact same puzzle again requires a lot of time playing.

There are 70-90 different "themes". I tried recording them all a while back, but no one seemed that interested in my results, so I gave up on it. It seems that there are at least 10 puzzles of each theme for each combination of difficulty and size.

If you play the 4x5 easy puzzles a lot, you've probably seen me record the medians in the comments. The medians are extremely important to me, because the score you get depends on just three variables: your time, the best time, and the median time. If the best time on a puzzle suddenly changes from 120 to 80 but the median remains mostly unchanged, then everyone who plays that puzzle after that point will receive substantially fewer points. In general, we can expect it to be harder to get points now than in the first couple of months after new puzzles are introduced, simply because the best time changes dramatically over time, and the median hardly changes at all.

For some individual puzzles I have recorded the median 3 or 4 times, and I've noticed very little change. Over 3 months, it's rare for the median to change by more than about 10 seconds, but it does happen. I think the largest change in median after the puzzle had enough scores to show the median was about 40 seconds. You should play some of the 4x5 easy puzzles if you are interested in this.

As far as what median means, we all know the textbook definition, but I'm not convinced that this is what we actually get. One reason I feel this way is that, to show updated median statistics, the database would need to keep track of EVERY single result. A lot of storage space could be saved by showing the mean instead. You would just keep the total number of plays for that puzzle and the total amount of time spent on them. Of course, the downside of using the mean is that a few outliers (like people who had 100 errors corrected on a single puzzle) would drive the mean to really bogus values.
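To make the storage argument concrete, here is a minimal Python sketch. The class name `RunningMean` and the solve times are made up for illustration; nothing here is known about how the site actually stores results. It shows that a mean needs only two stored numbers per puzzle, and how a single outlier distorts it while the true median (which needs every result) barely moves.

```python
import statistics


class RunningMean:
    """Hypothetical per-puzzle record: only two numbers are stored."""

    def __init__(self):
        self.plays = 0            # total number of plays
        self.total_seconds = 0.0  # total time spent across all plays

    def add(self, seconds):
        self.plays += 1
        self.total_seconds += seconds

    @property
    def mean(self):
        return self.total_seconds / self.plays if self.plays else None


# Made-up solve times in seconds; 1500 stands in for an outlier play.
times = [95, 100, 105, 110, 120, 1500]

rm = RunningMean()
for t in times:
    rm.add(t)

running_mean = rm.mean                   # dragged far upward by the outlier
true_median = statistics.median(times)   # requires the full list of results
```

With these sample values the mean lands above 300 seconds even though five of the six plays took 120 seconds or less, while the median stays near 107 seconds.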

Another reason I think the median might be bogus is case #1 in my image (earlier in this discussion). If you notice there, the previous best time is almost the same as the median. In case #2, my new best time is actually slower than the median!! The only explanations I can see are: in case #1, all the people playing this puzzle just happened to have times that are clustered together, and in case #2, there isn't enough recorded data for good stats, so purely fictional numbers are given. Even if these explanations are correct, I feel that there has to be some reason for the purely fictional numbers. If it were a round number and always the same, then I would just assume that it is a guess from the admin about what the median should be. But that fictional number does change, so there is clearly some hidden logic behind it.

My current assumption is that the median is always somewhat fictitious. The best time is kept (obviously), and a certain number of results are kept (possibly the last 10). If there are fewer than 10 results, then the database is pre-seeded with some fictitious times to give something to the first few people who play the puzzle. The median doesn't change very rapidly, so possibly it is moved by comparing how many of the last 10 plays were above the median and how many were below. If it is ever a 70-30 split, then it is bumped a second or two in one direction or the other.
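The guess above can be sketched in Python as follows. This is purely an illustration of the hypothesis, not the site's actual logic: the class name `NudgedMedian`, the 1-second step, and seeding with a single starting value (rather than a set of fictitious times) are all assumptions.

```python
from collections import deque


class NudgedMedian:
    """Hypothetical estimator: keep the last `window` plays and nudge the
    displayed median when they split 70-30 or worse around it."""

    def __init__(self, seed_median, window=10, threshold=0.7, step=1.0):
        self.estimate = float(seed_median)   # pre-seeded, fictitious value
        self.recent = deque(maxlen=window)   # only the last `window` results
        self.threshold = threshold
        self.step = step

    def add(self, seconds):
        self.recent.append(seconds)
        if len(self.recent) < self.recent.maxlen:
            return self.estimate             # not enough real data yet
        n = len(self.recent)
        above = sum(1 for t in self.recent if t > self.estimate)
        below = sum(1 for t in self.recent if t < self.estimate)
        if above / n >= self.threshold:
            self.estimate += self.step       # most recent plays were slower
        elif below / n >= self.threshold:
            self.estimate -= self.step       # most recent plays were faster
        return self.estimate


est = NudgedMedian(seed_median=100)
for _ in range(20):
    est.add(150)   # twenty plays, all slower than the seeded median
```

An estimator like this would drift slowly, which is at least consistent with the small changes in the recorded medians, but the same behaviour could come from many other schemes.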

If every result is saved in the database, then I would really like to see admin get rid of the fake bell-curve graph, and instead give us a box-and-whisker type of chart. In fact, I think the score should probably be based on your time vs. the 25th and 50th percentiles, instead of your time vs. the 0th and 50th percentiles (the best time and the median).
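For reference, the numbers a box-and-whisker chart needs can be computed in a few lines of Python once every result is available. The times below are made up, and the choice of the `inclusive` quantile method is an assumption; other conventions give slightly different quartiles.

```python
import statistics

# Made-up solve times in seconds for one puzzle.
times = [80, 95, 100, 105, 110, 120, 150]

# Quartiles: 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(times, n=4, method="inclusive")

# Five-number summary used by a box-and-whisker chart.
five_number_summary = (min(times), q1, q2, q3, max(times))
```

Here the minimum (the 0th percentile) is the best time and `q2` is the median, so scoring against `q1` instead of the minimum would make the result less sensitive to one exceptional best time.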

Maybe I care way too much about this. Maybe I should just make my own puzzle site, haha.
Uigrad, I too doubt that they record every single score, which means that we're probably not getting a true median. I'm not sure how many scores they do keep, but I would expect more than ten. Maybe 64 or some other power of two. I think they could make a reasonable approximation by recording enough true scores to set the quartiles, then making a 1s adjustment up or down whenever a score falls outside that range and ignoring any result that falls within it. Or they might use one or two standard deviations instead, and the 'nudge' might be more or less than one second, possibly related to the size of the puzzle. It wouldn't give you a true median, but it might come reasonably close.

I am quite certain at this point that they eventually discard old data. I used to have four unsolved 3x4 puzzles on my record from my internet connection crashing; I preferred taking the hit to my solving ratio over the penalty on my average time. Recently, those records were expunged, and it now says I have a 100% solving rate for 3x4s. My guess is that records drop off after perhaps 4096 (2^12) have been saved, because I'm nearing 5000 puzzles for that category and the offenders were from a long while ago. Maybe the number isn't based on powers of two; it could be 4000 games or something like that instead. Either way, it implies that they put a limit on the data stored for any given player, and probably for any given puzzle as well, which supports the 'false' or 'approximate' median hypothesis.
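Here is a rough Python sketch of the quartile idea, just to show it is cheap to implement. Everything in it is an assumption for illustration: the class name `QuartileBandMedian`, the 64-score warm-up, the 1-second step, and the choice to shift the whole quartile band along with the median.

```python
import statistics


class QuartileBandMedian:
    """Hypothetical estimator: collect `warmup` true scores to set the
    quartiles, then discard them and only nudge the estimate afterwards."""

    def __init__(self, warmup=64, step=1.0):
        self.warmup = warmup
        self.step = step
        self.samples = []   # raw scores, kept only during warm-up
        self.q1 = self.median = self.q3 = None

    def add(self, seconds):
        if self.median is None:
            self.samples.append(seconds)
            if len(self.samples) == self.warmup:
                self.q1, self.median, self.q3 = statistics.quantiles(
                    self.samples, n=4
                )
                self.samples = []   # raw scores are no longer needed
            return self.median
        if seconds > self.q3:
            shift = self.step       # slower than the band: nudge up
        elif seconds < self.q1:
            shift = -self.step      # faster than the band: nudge down
        else:
            shift = 0.0             # inside the band: ignored
        self.q1 += shift
        self.median += shift
        self.q3 += shift
        return self.median


est = QuartileBandMedian()
for t in range(1, 65):      # 64 made-up warm-up scores: 1..64 seconds
    est.add(t)
warm_median = est.median
after_slow = est.add(100)   # outside the band on the slow side
after_mid = est.add(30)     # inside the band, so no change
after_fast = est.add(1)     # outside the band on the fast side
```

After warm-up this stores three numbers per puzzle instead of every score, which would fit with old records being discarded, though it says nothing about what the site really does.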