Building better player projections
Before we get started, just a quick update to my last post (http://mlblogsfantasy411.files.wordpress.com/2009/12/broxton3.jpgarchives/2009/12/player_valuation_and_sgp.html)… I provided the average stats for each first and last place NFBC team, minus the top and bottom five outliers, but I only used 10 teams to compile the averages rather than 21 (the 26 NFBC leagues minus the five outliers). So, here are the revised numbers and the SGP in each category:
RANK   AVG    AB    R     H     HR   RBI   SB   W    SV   ERA     IP      WHIP    IP      SO
1ST    .2857  7543  1145  2155  303  1129  199  107  100  3.583   1516.2  1.245   1450.4  1374
15TH   .2607  6985  905   1821  202  869   94   67   19   4.726   1406.6  1.438   1387.9  950
SGP    .0017  37.2  16.0  22.2  6.8  17.3  7.0  2.6  5.5  -0.076  7.3     -0.013  4.2     28.3
(Sorry for the tiny font, that’s the only way I could get the whole table to fit.)
Not a huge difference obviously, but hey, I’m the stats guy, I’m supposed to get this stuff right!
Now, on to this week’s topic: how to build useful player projections. We need to figure this out now because, while raw SGP can be determined on a per-category basis, meaningful SGP must be determined on a per-position basis; otherwise your player valuations will not reflect position scarcity. And to determine position scarcity or any type of pre-draft player valuation, you must develop a ranking system, and the most objective way to do that is by building projections.
There have been thousands of words spilled all across the Internet on the value of player projections and the need for them: some think they are critical, some think they are useful but impossible to build with any meaningful accuracy, and some think they’re a total waste of time. Baseball HQ chief Ron Shandler – one of the giants of the industry – wrote a lengthy and informative piece last year on “The Great Myths of Predictive Accuracy” (http://www.baseballhq.com/books/myths.shtml), which said in essence that even the best projection systems will typically be wrong to some extent or another on every player, so fantasy players must use all projections with a certain amount of skepticism. I hope I’m not incorrectly paraphrasing what Ron wrote, and I encourage everyone to read this excellent work.
In any case, I do agree with Ron that it’s unimportant whether your projection for Albert Pujols calls for him to hit 39 homers or 42; either number would likely be part of what we would consider an “expected” season from El Hombre. But I think pre-draft projections are critical for two key reasons:
* Establishing relative values of players before the draft to help in ranking them, and
* Evaluating relative strengths and weaknesses of teams as the draft progresses
If your projections are objective and reasonable, then they should be useful for both of those purposes. So how do we make them objective and reasonable?
First, consider the source… there are numerous projections available on the Internet, some for pay and some for free, and while PECOTA from Baseball Prospectus has long been considered the cream of the crop, the field has been catching up in recent years. In fact, noted sabermetrician Tom Tango ran a season-long experiment this past year (http://tangotiger.net/forecast) which concluded what many have thought to be true: the “wisdom of the crowds” approach ultimately generates results just as good as any other system. So here’s what I do:
1. Gather as many different projections as I can that pass the “sanity check” – no batters projected to hit 60 homers, no pitchers with 325 strikeouts, etc. – and average them. In recent seasons that’s been somewhere between seven and nine different sets of projections, but use as many as you think are reasonable to get the widest range of opinions and inputs.
2. Adjust properly for playing time. Last season in MLB there were about 187,000 plate appearances and 43,000 innings pitched… make sure your projections add up to a reasonable estimation of those numbers. Then, make sure the playing time is distributed properly… each team should have somewhere around 6,200 plate appearances and 1,440 innings, with playing time distributed between obvious regulars and reserves. Obviously there’s no way to predict who will ultimately get every at-bat over the course of the season, but the key is to make sure playing time is appropriately distributed among the players who are likely to be selected in your draft.
Make sure to factor in health risk and job security as well. Projecting Nick Johnson for 550 at-bats is sure to result in disappointment, but at the same time, if you project Derek Jeter for only that many, you’re ignoring recent history. All players are subject to unexpected injuries, but as our friend Will Carroll would say, “health is a skill.” If a player has demonstrated an ability (or inability) to stay in the lineup, adapt accordingly.
3. Once playing time is adjusted properly, pro-rate all projected counting stats accordingly… if the projection calls for 400 at-bats but you bump it up to 500, add on another 25 percent in each of the other categories.
4. Next up, I calculate projected RBI’s based on projected extra-base hits. The formula I use, developed by a smart guy named Mark Padden, awards about seven-tenths of an RBI for each “weighted” extra-base hit, weighted as follows: singles * .2, doubles and triples * 1 and homers * 3. So a player projected for 100 singles, 30 doubles, 5 triples and 20 homers would be projected for about 80 RBI’s. Then I tweak the RBI numbers up or down to correspond to likely batting order position… if I know Hanley Ramirez is dropping down to the third spot in the order I’ll bump up his total but then shave down those who are moving away from the heart of the order.
5. Make sure runs and RBI’s correlate properly. Last season, roughly 95 percent of all runs scored were generated by an RBI. Add up the runs and RBI’s in your projections, then divide RBI’s by runs… if you’re not somewhere between 94 and 96 percent on a per-team basis, you’ll need to tune your run and/or RBI projections up or down to match.
6. When it comes to closers, keep it simple. I project each of the likely full-time closers with three wins; there are always the exceptions who will win seven or eight or 10 games each year, but those are hard to predict, so don’t bother. Tack on a bonus win or two if you really need to, but trying to predict whether Jonathan Broxton will win four or eight games next year is a futile exercise. I distribute saves in round numbers, too… clear-cut top-shelf closers get 38-42 each, the second tier get 32-36, and so on.
7. Keep it simple for starters, too. Strikeout and walk rates are fairly projectable, but as we know, wins are typically not. So I give aces 17-18 wins, solid mid- to upper-tier guys 14-16, and so on.
8. Finally, in the same vein, don’t get carried away on steals. That’s a very volatile category based as much on opportunity and managerial tendencies as it is on pure speed. Guys like Carl Crawford and Jose Reyes are always going to get their 50-60 steals as long as they’re healthy, but look at how the SB’s have fluctuated for an otherwise consistent player like the aforementioned Jeter: 30-11-15-34-14-23-11-32 over the last eight years.
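For anyone who would rather script the arithmetic in steps 1, 3, 4 and 5 than do it in a spreadsheet, here is a minimal Python sketch. The function names, the dictionary layout and the stat keys ("AB", "HR" and so on) are assumptions made up for illustration, and the sanity check is simplified to a single homer cap applied to each player line. The constants (0.7 RBI per weighted hit, the 0.2 / 1 / 3 weights, the 94 to 96 percent band) come straight from the steps above.

```python
from statistics import mean


def passes_sanity_check(line, max_hr=60):
    """Step 1 (simplified): reject a line projecting 60 or more homers."""
    return line["HR"] < max_hr


def average_projections(lines):
    """Step 1: average each stat across the lines that pass the sanity check.

    Assumes every line uses the same keys and at least one line is sane.
    """
    sane = [line for line in lines if passes_sanity_check(line)]
    return {stat: mean(line[stat] for line in sane) for stat in sane[0]}


def prorate(line, new_ab):
    """Step 3: scale every counting stat by the change in at-bats."""
    factor = new_ab / line["AB"]
    return {stat: value * factor for stat, value in line.items()}


def projected_rbi(singles, doubles, triples, homers, rbi_per_weighted_hit=0.7):
    """Step 4: weight hits (1B x 0.2, 2B and 3B x 1, HR x 3), then x 0.7."""
    weighted_hits = 0.2 * singles + doubles + triples + 3 * homers
    return rbi_per_weighted_hit * weighted_hits


def rbi_to_run_ratio_ok(runs, rbi, low=0.94, high=0.96):
    """Step 5: is team RBI divided by team runs inside the 94-96% band?"""
    return low <= rbi / runs <= high


if __name__ == "__main__":
    # A 400 at-bat projection bumped up to 500 at-bats (a 25 percent increase).
    line = {"AB": 400, "1B": 80, "2B": 24, "3B": 4, "HR": 16}
    bumped = prorate(line, 500)
    print(bumped)

    # The example from step 4: 100 singles, 30 doubles, 5 triples, 20 homers.
    rbi = projected_rbi(bumped["1B"], bumped["2B"], bumped["3B"], bumped["HR"])
    print(round(rbi, 1))

    # A hypothetical team total: 760 RBI on 800 runs is a 95 percent ratio.
    print(rbi_to_run_ratio_ok(runs=800, rbi=760))
```

The batting-order tweak in step 4 and the round-number wins, saves and steals in steps 6 through 8 are judgment calls, so they are left out of the sketch on purpose.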
That’s the process in a nutshell. I may be forgetting something here but hey, I can’t give away ALL of my secrets!
Until next week, Merry Christmas and happy holidays!