Probability Calibration

Most people are systematically overconfident when assessing their own beliefs, even when they're subject-matter experts. For example, a 1977 study found that over 90% of college professors rated themselves as above-average teachers. People love to claim they're "99% sure" of something, but they're wrong in those claims much more than 1% of the time. Consistent underconfidence can also be a problem for some people.

This is an MTG-themed probability calibration exercise to help you gauge your own calibration and improve your self-assessment skills. Exercises like this don't necessarily generalize to other domains, but they can help you learn a mindset that can then be applied elsewhere. Also, they're fun.

Select how many questions you'd like to see, then press "begin". Questions are simple and will take a few seconds each. The more questions you generate, the more accurate your results will be.

Show me questions

For each question, select the answer you think is most likely to be correct. Some of the questions will be about things you don't know; that's ok! Just take your best guess. Then use the slider to indicate how likely you believe your answer is to be correct. For example, if there are two possible answers and you have no idea which one is right, you'd put 50%, since that's the chance you'll get it right by randomly guessing.

Results:

Average confidence:

Percent correct:

A well-calibrated person's average confidence will be close to the percentage of questions they actually got correct. Your results indicate that you are See the chart below for details.

Average difference among everyone who has played (average confidence minus percent correct): [loading] percentage points. A positive number means that people tend to be overconfident on average; a negative number means they tend to be underconfident.
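To make the comparison above concrete, here is a minimal Python sketch of how the two numbers and their difference could be computed. The function name and the input format (a list of confidence/correctness pairs) are assumptions made for illustration, not the site's actual code.

```python
def calibration_summary(answers):
    """Summarize a game.

    `answers` is a hypothetical list of (confidence_percent, was_correct)
    pairs, one per question, e.g. (80, True).
    Returns (average confidence, percent correct, difference).
    """
    if not answers:
        raise ValueError("need at least one answered question")
    n = len(answers)
    avg_confidence = sum(conf for conf, _ in answers) / n
    percent_correct = 100 * sum(1 for _, ok in answers if ok) / n
    # Positive difference: overconfident. Negative: underconfident.
    difference = avg_confidence - percent_correct
    return avg_confidence, percent_correct, difference


game = [(90, True), (80, False), (70, True), (60, False)]
avg, pct, diff = calibration_summary(game)
print(f"confidence {avg:.1f}%, correct {pct:.1f}%, difference {diff:+.1f}")
```

In this made-up game the player averaged 75% confidence but got only half the questions right, so the difference is positive and points to overconfidence.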

Your overall score: points

This score takes into account both your calibration and your accuracy. It's calculated such that your best strategy is always to take your best guess and then report your true confidence in that guess.

What I mean by this is that it's not subject to the same type of "gaming" that the "average confidence" field above is. You could be "perfectly calibrated" by always picking the first answer and then inputting a confidence of 1 over the number of answers (50% for 2-answer questions, 33% for 3-answer ones, 25% for 4, etc.). But if you do that you'll get a score of 0. As long as you're well-calibrated, the expected value of a question is positive, so if you got a negative score, it means either that you're poorly calibrated or that you got particularly unlucky.

The details of this type of scoring are explained here, and the exact formula is from here. Note that the score is calculated per question, so playing games with more questions will result in higher scores.
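The exact formula isn't reproduced on this page, so the Python sketch below uses a stand-in: a logarithmic scoring rule, shifted so that a uniform guess scores 0. It is an assumption chosen because it has the properties described above, not necessarily the formula this game uses. The names `question_score` and `expected_score`, the 99% cap, and the even spreading of leftover probability over the other answers are all hypothetical choices for illustration.

```python
import math


def question_score(confidence, correct, n_answers, cap=0.99):
    """Score one question with a shifted logarithmic rule (an assumed rule).

    confidence: probability (0-1) given to the chosen answer.
    The leftover probability is assumed to be split evenly over the
    other answers. `cap` avoids log(0) on a wrong 100% answer.
    """
    if not 0 < confidence <= 1 or n_answers < 2:
        raise ValueError("confidence must be in (0, 1] and n_answers >= 2")
    p = min(confidence, cap)
    # Probability the player effectively assigned to the true answer.
    q = p if correct else (1 - p) / (n_answers - 1)
    # log2(n * q) is 0 when q == 1/n, i.e. for a uniform guess.
    return math.log2(n_answers * q)


def expected_score(true_prob, reported, n_answers):
    """Average score when the guess is right with probability true_prob."""
    return (true_prob * question_score(reported, True, n_answers)
            + (1 - true_prob) * question_score(reported, False, n_answers))


# Uniform guessing scores 0 whether the answer is right or wrong.
print(question_score(0.25, True, 4), question_score(0.25, False, 4))

# If you're really right 80% of the time, reporting 80% beats shading
# your confidence up or down.
for reported in (0.6, 0.8, 0.95):
    print(reported, round(expected_score(0.8, reported, 2), 3))
```

Under this rule, the expected score for honest reporting is positive whenever your confidence differs from a uniform guess, and overstating or understating your confidence lowers it.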

Average score among everyone who has played a game of this length: [loading] points

Note that if you misclicked or misread a question, your results are still accurate. When trying to evaluate your level of confidence, you should always be accounting for the possibility of having made a mistake.

Each point on the graph indicates your average accuracy on questions around a given confidence level. For example, if there's a point at [60, 70], that means that out of several questions you were around 60% confident on, you got 70% of them correct. A point within the shaded area means you were well-calibrated on questions around that confidence level. A point above the shaded area means you were underconfident on questions around that confidence level, and a point below the shaded area means you were overconfident at that level. Variation in the height of a point by less than the height of the shaded area is likely to be random noise. Hover over a point for more information on it.
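As a rough illustration of how points like these could be produced, the Python sketch below groups answers into confidence bins and computes the accuracy within each bin. The 10-point bin width, the function name, and the input format are assumptions; the page doesn't say how the real chart groups questions or how wide the shaded area is.

```python
from collections import defaultdict


def calibration_points(answers, bin_width=10):
    """Group answers by confidence and compute accuracy per group.

    `answers` is a hypothetical list of (confidence_percent, was_correct)
    pairs. Returns a list of (mean confidence, percent correct, count)
    tuples, one per non-empty bin, ordered by confidence.
    """
    bins = defaultdict(list)
    for confidence, correct in answers:
        # e.g. with bin_width=10, confidences 60-69 share bin 6.
        bins[int(confidence // bin_width)].append((confidence, correct))

    points = []
    for key in sorted(bins):
        group = bins[key]
        mean_conf = sum(c for c, _ in group) / len(group)
        accuracy = 100 * sum(1 for _, ok in group if ok) / len(group)
        points.append((mean_conf, accuracy, len(group)))
    return points


# Ten questions answered at 60% confidence, seven of them correct.
game = [(60, True)] * 7 + [(60, False)] * 3
for mean_conf, accuracy, count in calibration_points(game):
    print(f"[{mean_conf:.0f}, {accuracy:.0f}] from {count} questions")
```

This reproduces the [60, 70] example: accuracy higher than confidence, so the point sits above the diagonal and suggests underconfidence at that level. With only a handful of questions per bin, the accuracy figure is noisy.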

For some non-Magic calibration games, see here, here, here, or here. For a good book on the subject, check out Thinking in Bets, by Annie Duke.

Outside the Asylum