In the figure which follows each dot represents a game of a particular sport.
The x-axis represents the home team's margin of victory, and so around the top-right we can see a game where a home team won by almost 150 points. At the bottom left there's a game where a home team lost by about 140 points.
The y-axis represents the home team's pre-game probability of victory, as judged by their bookmaker odds before the game started (or rather by a model applied to the those bookmaker odds).
I'm interested in the question of what is a 'par' victory score, given a team with a particular probability of victory. Ultimately I want to add those to an Elo rating system where teams can be rewarded for achieving a margin of victory above par (based on the respective ratings of the two teams), and punished for achieving a margin of victory below par.
To that end I fitted a logistic curve in SPSS, and it's the black line in the figure y = 1 / ( 1 + 0.825 * 0.981^x ).
The black curve looks vaguely plausible but there are some things that bother me about it.
The black curve doesn't cut through y=0. Indeed, when the home team's probability of winning is 0.55 its margin of victory is 0. Intuitively, it would seem that when a team's probability of winning is estimated at 0.5 its 'par' margin of victory should be 0.
At the far-left and far-right the black curve indicates that (for example) a team with 0.95 probability of victory should be expected to win by about 150 points. Because I have some prior knowledge of this game, I think this is unrealistic and that a team would need more like a .999 probability of victory before a victory of 150 points would be 'par' for them.
In red I've drawn by hand what I regard as a more appropriate curve. I also wondered whether curve fitting was appropriate in the first place, given my goal of generating a par score for a team with a given win probability. I also drew a straight line in green. Again, it was done just from intuition but it seems like it would approximate simple regression line if we used probability to predict margin of victory.
I realize that given I have fairly strong preconceptions about where the curve should be, it might be asked why I don't make my own curve without any consideration of these data points. Indeed, it seems to me that many people doing this sort of modelling do ignore all empirical data and do just that. However, I am hoping to find some compromise between prior beliefs and the data.
My question is what I should do differently to obtain a curve that indicates a reasonable 'par' score for a team with a given probability of victory. As Glen_b notes here, there are many other sorts of curves that 'level off' and I picked logistic purely because it was an option in SPSS. I'm not wedded to doing this in SPSS and could use MATLAB or R instead if it was more appropriate.