Baseball Betting System: Pythagorean Expectations
by Robert Ferringo - 04/01/2009
Pythagoras was a famous Greek philosopher (can someone name me a non-mythical famous Greek who wasn't some kind of philosopher?) and mathematician. He is most famous for his triangle theorem, but he turned out to be a pretty bitchin' dude who believed that numbers were the foundation of life and the ultimate reality. He believed that through math "everything could be predicted and measured in rhythmic patterns or cycles." In other words, he was kind of an ancient-day Bill James.
So I suppose it's not a surprise to learn that Bill James, a baseball writer, historian and statistician who basically invented the field of sabermetrics in baseball, concocted a formula for measuring the expected number of wins for any given Major League Baseball team. This metric attempts to quantify how "lucky" a particular team is by comparing the number of games a team "should" have won (based on the formula) to the number of games that team actually won. James dubbed his formula the Pythagorean Expectation because of its similarity to Pythagoras' theorem.
Here is the actual formula:
Expected Winning Percentage = (Runs Scored)^2 / [(Runs Scored)^2 + (Runs Allowed)^2]
Basically, James simplified the game into its most basic components - runs scored and runs allowed - and used them to explain how well or how poorly teams performed. At first the correlation between James' formula and the actual winning percentage of teams was a mere experiment. But other statisticians and probability theorists (who I presume are also fans of America's Former Pastime) were able to determine that the formula can give a probability for future wins.
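For anyone who wants to see the arithmetic, here is a minimal sketch of the calculation in Python. It assumes James' original exponent of 2 and a 162-game season; the function names are made up for illustration, and the exponent behind MLB.com's X-Wins figure is not stated here, so the site's numbers may differ slightly from what this produces.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage (0.0 to 1.0) from runs scored and allowed.

    exponent=2.0 is James' original version; treat it as an assumption.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("need at least one run scored or allowed")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def expected_wins(runs_scored: float, runs_allowed: float,
                  games: int = 162, exponent: float = 2.0) -> float:
    """Expected win total over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Hypothetical team, not taken from any real season.
    print(round(pythagorean_win_pct(800, 700), 3))
    print(round(expected_wins(800, 700), 1))
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.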
I know, I know: reading about this stuff is enough to make your eyes roll back into your head. But this year I factored the Pythagorean Expectations into my MLB futures bets and I'm very interested to see the results for this baseball betting system. If they are anything close to what I've observed from the last several years then I think we may have found a significant moneymaker for years to come.
The output of this formula can actually be found right on the MLB.com website. In the league standings you have the option of viewing the "X-Wins", which is the measure of a team's Pythagorean Expectations, or their "expected wins."
Using that number I went back to track the teams with the biggest discrepancies from year to year, dating back to 2001. If a team had an actual win total of 89 and an X-Wins total of just 84 then it's safe to say that our team overachieved by five wins (+5). The reverse is also true. If a team finished the year with 72 wins but had an X-Wins total of 79 then that club underachieved by seven wins (-7).
Arbitrarily, I determined that 4.0 would be the threshold for significance for determining if a team over- or underachieved. And since 2001 a total of 76 teams, out of a possible 240, either over- or underachieved according to our threshold. So what I did was to group the teams that overachieved and the ones that underachieved and to compare each individual club's results the following year. However, while 4.0 was the initial threshold I found that if you bumped that up to 5.0 you actually returned more significant wagering results.
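The differential and the threshold test boil down to a subtraction and two comparisons. The sketch below is just one way to write it; the function names and the "neutral" label are made up for illustration, and the threshold is left as a parameter so both the 4.0 and 5.0 versions can be tried.

```python
def differential(actual_wins: int, x_wins: int) -> int:
    """Actual wins minus X-Wins: positive means the team overachieved."""
    return actual_wins - x_wins


def classify(actual_wins: int, x_wins: int, threshold: float = 5.0) -> str:
    """Label a team by how far its record strayed from its X-Wins."""
    diff = differential(actual_wins, x_wins)
    if diff >= threshold:
        return "overachiever"   # candidate for an 'under' bet next season
    if diff <= -threshold:
        return "underachiever"  # candidate for an 'over' bet next season
    return "neutral"            # inside the threshold, no play


if __name__ == "__main__":
    # The two examples from the text: 89 wins vs. 84 X-Wins, 72 vs. 79.
    for actual, expected in [(89, 84), (72, 79)]:
        print(actual, expected, differential(actual, expected),
              classify(actual, expected))
```

Passing threshold=4.0 widens the pool, which is the trade-off the rest of the article plays with.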
Teams that were +5 or higher according to the Pythagorean Expectations - that means that the difference between their Actual Wins and their X-Wins was 5.0 or higher - actually had fewer wins the following season in 22 of the 32 instances. And it makes sense because if a team overachieves one season then it's most likely going to experience a natural statistical regression and come up short the following season.
Our underachievers - teams with a -5 or below Expectation - also went the opposite way and they ended up winning more total games in 23 of the 35 occurrences over the last eight years.
Now, using the formula and my generic significance threshold of +/-4 I may have found a consistent moneymaker when considering futures bets. I started by comparing the 2005 results to the 2006 season wins totals released by Las Vegas Sports Consultants. I wanted to see how the teams that overachieved and underachieved in 2005 did against the 2006 Vegas wins totals, and I then repeated the exercise for each of the following years. The results were eye opening.
Teams that we have deemed "overachievers", that is, teams with a differential of +4 or higher, went just 6-9-1 (40 percent) against the Vegas wins totals the following year. If we bump our threshold up to +5 we actually give back a little ground, as those teams were 5-7-1 (41.7 percent) against the Vegas wins totals. Either way, placing a futures bet against these overachievers the year after turned a profit all three years.
The results were even better when you bet on teams to bounce back after an "underachieving" year of -4 or worse. Those teams went an outstanding 11-7 (61.1 percent) in the last three years against the Vegas wins totals. They would have had an overall losing season in 2007, going just 3-4, but a 5-2 year in 2006 and a 3-1 year in 2008 would have made this system an overall winner.
And the results for the underachievers get even better if you tighten the threshold to -5 or worse. Those teams went a sensational 9-2 against the Vegas wins totals over the last three years and produced a profit each individual season.
Overall, betting against ('under') the season wins totals of our overachieving teams (+4 or higher) and betting on ('over') the season wins totals of our underachieving teams (-4 or worse) would have gone 20-13-1 against Vegas season wins totals over the last three years. And if we narrow our underachievers to -5 or worse that raises our record to 18-8-1 and gives us a sensational 69.2 percent winning percentage!
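Those combined records are easy to double-check. The sketch below flips the overachievers' 6-9-1 mark (since we are betting against them) and adds it to each underachiever record from above. The Record class is a made-up helper for illustration, and the winning percentage leaves pushes out, which is an assumption that matches the 69.2 figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    wins: int
    losses: int
    pushes: int = 0

    def flipped(self) -> "Record":
        """Record of a bettor taking the other side of every wager."""
        return Record(self.losses, self.wins, self.pushes)

    def combined(self, other: "Record") -> "Record":
        """Add two records together."""
        return Record(self.wins + other.wins,
                      self.losses + other.losses,
                      self.pushes + other.pushes)

    def win_pct(self) -> float:
        """Winning percentage with pushes excluded (an assumption)."""
        return 100.0 * self.wins / (self.wins + self.losses)


if __name__ == "__main__":
    fade_over = Record(6, 9, 1).flipped()   # 'under' bets on +4 overachievers
    under_4 = Record(11, 7)                 # 'over' bets at -4 or worse
    under_5 = Record(9, 2)                  # 'over' bets at -5 or worse

    for label, rec in [("+4 / -4", fade_over.combined(under_4)),
                       ("+4 / -5", fade_over.combined(under_5))]:
        print(label, rec, round(rec.win_pct(), 1))
```

Note that the 18-8-1 mark pairs the +4 overachievers with the -5 underachievers, so the two thresholds are not symmetric in that version.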
This year we have only two qualifiers that were overachievers last year. Tampa Bay's Actual Wins total was 97 last year and their Pythagorean Expectations (or X-Wins) total was just 91. That gives them a +6 and means that we should bet the 'under' against their season wins total this year, which is currently 89.0. Houston (+9) also fits that category and their 2009 season wins total is currently at 74.5.
Those two teams - Tampa Bay and Houston - represent the smallest number of "overachievers" to bet against in the last four years.
Our pool of underachievers is a bit broader. Toronto (-7), Atlanta (-7), Baltimore (-5) and San Diego (-5) should all be teams that we play 'over' against their season wins totals. Also, Cleveland (-4) and Detroit (-4) fit in the original threshold and are worth a look even though they don't pass the -5 filter.
You really don't need to fully understand the math behind this. You don't need to understand Pythagorean philosophy, Bill James' sabermetrics, or anything else. All you have to understand for this baseball betting system is this: had you bet $1,000 on each of these season win totals over the last three years you would have been up a cool $9,200, and all it would have taken is the MLB.com site, a calculator, and about 15 minutes of work.
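For the curious, here is a rough sketch of how a profit figure like that can be worked out from a win-loss record. The price on each bet is not stated above, so the standard -110 line (risk $1,100 to win $1,000) is purely an assumption, as is treating pushes as refunded; under those assumptions the 18-8-1 record does come out to $9,200. The function name is made up for illustration.

```python
def futures_profit(wins: int, losses: int, to_win: float = 1000.0,
                   american_odds: int = -110) -> float:
    """Net profit when every bet is sized to win `to_win` dollars.

    Only handles negative (favorite-style) American odds such as -110.
    Pushes are assumed refunded, so they do not appear here.
    """
    if american_odds >= 0:
        raise ValueError("this sketch only handles negative American odds")
    risk = to_win * abs(american_odds) / 100  # amount lost on each loser
    return wins * to_win - losses * risk


if __name__ == "__main__":
    # The 18-8-1 record from above, with the push dropped.
    print(futures_profit(18, 8))
```

At a different price the total would change, so treat the number as a ballpark rather than a ledger.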