More on wins added
As promised, I'm going to offer a little more detail on the Wins Added system I used this week to evaluate the first half of the baseball season.
Let me throw out some explanation.
First off, I'd consider this to be a beta version of a system I'd like to use for myself in my efforts at covering baseball. As I pointed out in my column, as much as I admire the terrific work done at sabermetric sites like baseballprospectus.com and hardballtimes.com, I often find myself in a quandary about using their metrics for writing analysis on baseball for the paper.
My primary problem, frankly, is one of space. When I'm addressing an issue, I can't use advanced concepts like VORP or WARP because they require too much explanation. Stat Guy addresses a mainstream audience. In fact, my original reason for writing it was to serve as a conduit between the hardcore statheads and the mainstream baseball fan. At the same time, I don't like to fall back on stalwarts like OPS and, certainly, batting average, etc., because I know their shortcomings all too well.
The final results are expressed in exactly the way I'd like to see them expressed in, say, baseball-reference.com. Four columns at the end of a traditional statistical grid: hitting wins, fielding wins, pitching wins and, finally, wins added.
The underlying methodology behind the numbers will have to be tweaked, especially the fielding part. It would also be nice to introduce some kind of leverage index. As I say, this is the first step of developing a new framework for my own use. Eventually, I want to integrate it with my projection system, PROFITS, which has been locked away for a long time now undergoing major renovation.
HITTING WINS: Hitting Wins Added are calculated as extrapolated runs per plate appearance (XRPA) compared to league-average XRPA times ballpark factor times total plate appearances. Convert runs added or subtracted to Wins Added (10 runs = 1 win).
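For anyone who wants to see the arithmetic, here is a minimal sketch in Python of one reading of that description. The function and parameter names are made up for illustration, and applying the ballpark factor to the league-average baseline (rather than to the hitter's own rate) is an assumption, since the description can be read either way.

```python
RUNS_PER_WIN = 10.0  # from the text: 10 runs = 1 win


def hitting_wins_added(xrpa, league_xrpa, park_factor, plate_appearances):
    """Hitting Wins Added under one reading of the description.

    Assumption: the ballpark factor scales the league-average XRPA
    so the hitter is compared to an average hitter in the same park.
    """
    runs_added = (xrpa - league_xrpa * park_factor) * plate_appearances
    return runs_added / RUNS_PER_WIN


# Hypothetical hitter: .150 XRPA against a .120 league average,
# neutral park, 400 plate appearances.
print(round(hitting_wins_added(0.150, 0.120, 1.0, 400), 2))
```

A hitter who matches the park-adjusted league rate comes out at exactly zero, which is the point of an above/below-average scale.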
FIELDING WINS: Uses zone rating. Calculate number of plays made above or below league average for that position. Convert the plays made or missed into runs made or missed. Convert runs into wins. Catchers necessitate a different method. Mine needs work. For the time being, I'm using expected assists versus actual assists.
PITCHING WINS: I calculate extrapolated runs per plate appearance (XRPA) for the batters each pitcher has faced. However, I'm using full-blown DIPS theory, which I know isn't exactly right. This is something that will have to be adjusted. What this means is that instead of actual singles, doubles and triples, I'm using league-average figures, adjusted for ballpark. Homers are also adjusted for ballpark. After figuring XRPA for each pitcher, I compare it to the league average and multiply by total batters faced. This yields runs saved above or below average. Runs are converted to wins.
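The last step of the pitching calculation can be sketched the same way. This is only an illustration: the names are hypothetical, the DIPS-adjusted XRPA is taken as an input rather than computed, and the 10-runs-per-win conversion is assumed to be the same one quoted for hitters.

```python
RUNS_PER_WIN = 10.0  # assumed to match the hitting conversion


def pitching_wins_added(dips_xrpa_allowed, league_xrpa, batters_faced):
    """Pitching Wins Added from a pitcher's DIPS-adjusted XRPA allowed.

    The sign is flipped relative to hitters: allowing fewer runs
    per batter than the league average is a positive result.
    """
    runs_saved = (league_xrpa - dips_xrpa_allowed) * batters_faced
    return runs_saved / RUNS_PER_WIN


# Hypothetical pitcher: .100 XRPA allowed, .120 league average,
# 500 batters faced.
print(round(pitching_wins_added(0.100, 0.120, 500), 2))
```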
WINS ADDED: The sum of the other three categories.
BUT DOES IT WORK?
Best way to answer that is to say that I think it will work, once I perfect the way I manipulate my underlying data.
At this point, I'd say the system is about 85% of the way there. For example, the sum of the Royals' Wins Added is -13.81. Based on their runs scored and runs allowed totals, you'd expect it to be -12.18. Other teams have a bigger discrepancy. Some have less. There is better correlation with a team's Pythagorean record than with actual record. This is not surprising.
Drop me a line at bdoolittle@kcstar.com with suggestions & criticisms. I welcome the input.
The complete table of first-half Wins Added is here.
A note about Big 12 power rankings
First, let me acknowledge my credentials, or lack thereof, depending upon your perspective. I am first and foremost a writer. I majored in English. I spend my spare time reading Henry Miller, Hemingway, Joyce, Colette, etc. My greatest aspiration is to publish a novel and I've got a burgeoning manuscript always beckoning me in my "writing room" where I clack away on an old manual Underwood. Of course my job, or at least part of it, is to write about sports, which is an occupation that I absolutely relish.
All this is preamble to admitting that I am not an engineer. I did not attend MIT. I have never been a Bill James research assistant. At the same time, I do have a profound belief in numbers and statistical analysis. I also have a certain aptitude with numbers - once upon a time, I made my living as an accountant. No formal training, it just came naturally to me. Statistics have always seemed to me to be like a conduit between myself and the teams I watch and cover. We keep track of what they do and if you know how to decode that information logically and accurately, then you can know far more about the teams than you would know just by reading game stories and catching a game now and again.
There are few things that bother me more in sports media than "power rankings" which are based solely on subjective observation. In baseball, in particular, I do not believe that you can adequately differentiate between teams by watching an occasional game. If you're going to rank teams in this sport, you need to have some sort of objective criteria as your foundation.
That is all background. Here is the nut of the system.
My Big 12 ratings system attempts to rate teams based on true ability. It does not attempt to replicate the won-loss standings. A baseball team's won-loss record is a mixture of ability, execution, quality of competition and luck. That last component is more prevalent than anyone wants to admit and is also why I felt the need to establish a power ranking system in the first place.
In big-league baseball, we have known for years that the runs scored and runs allowed by a team are a more accurate measure of team strength than won-loss record. Given enough games, a team will eventually play to its run differential. However, the number of games even in a 162-game schedule is not enough for this to occur with all teams. Thus the standings at the end of the season do not reflect exactly the order of teams by run differential. They are skewed by noise such as clutch hitting and record in one-run games. And that's great - that's baseball. That's why we love it.
Still, I think it's important to have an accurate picture of the true ability of teams. At the end of the regular season, it gives us a better idea of who is likely to succeed in the postseason. After that, it gives us a better idea of who will sink or rise the next season. For the teams whose record is significantly under their power ranking (like Missouri) we can better come to grips with our disappointment: they should have won more games. For teams who outperform their power ranking (Kansas) we can have that much more appreciation for their accomplishments.
So, to summarize, the object is to rank teams by their true level of ability. And to do this, we want to use runs scored and runs allowed.
In college baseball, this is not nearly as straightforward as it is in Major League Baseball. The difference, of course, is the wildly different levels of talent from conference to conference and even from team to team within a conference. (Some schools just don't put much emphasis on their baseball programs.)
The crux of my rankings system is that for each game a team plays, after I enter the final score, the system then adjusts for four things:
1. who was the opponent?
2. where was the game played?
3. when was the game played?
4. was it a weekend game?
If you're playing a team like, say, St. Louis University, you don't get full credit for your run differential. In fact, if you win by, say, 5-4, then the system actually registers that as a defeat. The worse your opponent is, the more you need to beat them by to get positive credit. Games against very small schools, like Nebraska Kearney, for instance, don't count in the system at all.
Home teams win more in college baseball and, especially early in the season, northern teams don't play at home nearly as often as the teams in the South. Thus the run differential is adjusted again depending on where the game was played.
Since we want to identify teams that are going to do well in the postseason, we try to identify those schools who are improving as the season progresses and vice versa. Thus, each game a team plays has more value to the overall rating than the previous game. By the end of the season, a game played has many times more value than the first game of the season.
We all know that weekend games in college baseball are more indicative of team strength than midweek games. That is the way pitching staffs are structured. Thus, weekend games have double the value in this system of midweek games.
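The "when" and "weekend" adjustments both amount to a per-game weight. Here is a rough sketch in Python. The doubling for weekend games comes straight from the text; the linear growth and the `step` value are placeholders, because the text says only that each game counts more than the one before it, not by how much.

```python
def recency_weight(game_number, step=0.1):
    """Weight that grows with each game played (game_number starts at 1).

    Assumption: linear growth with a made-up step size. The actual
    growth curve is not given in the text.
    """
    return 1.0 + step * (game_number - 1)


def game_weight(game_number, is_weekend, step=0.1):
    """Combine recency with the weekend bonus (weekend games count double)."""
    weekend_multiplier = 2.0 if is_weekend else 1.0
    return recency_weight(game_number, step) * weekend_multiplier


# First game of the season, midweek, versus the 11th game, on a weekend.
print(game_weight(1, is_weekend=False))
print(game_weight(11, is_weekend=True))
```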
Once the system adjusts the runs scored/runs allowed for each game, one final question is asked. What was the margin of victory? At a certain point, runs are superfluous. Is there a real difference between a 22-1 victory and a 9-2 victory? There is some but the fact of the matter is that if you chart won-loss records and run differentials, you get a gradually rising slope until you hit somewhere between five and six runs. Beyond that, it's just overkill. Thus the system does not reward "extra" runs above a margin of victory of six.
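The margin cap is easy to sketch in Python. This version works on raw final scores for simplicity, whereas the system applies the cap after the other adjustments; the function name is hypothetical.

```python
MAX_MARGIN = 6  # from the text: no reward for "extra" runs beyond six


def capped_margin(runs_for, runs_against, cap=MAX_MARGIN):
    """Run margin from one team's point of view, limited to +/- cap."""
    margin = runs_for - runs_against
    return max(-cap, min(cap, margin))


# The two wins mentioned above are worth the same after capping.
print(capped_margin(22, 1))
print(capped_margin(9, 2))
```

The cap works in both directions, so the losing side of a blowout is charged no more than six runs either.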
The adjusted runs scored and allowed figures are added together and plugged into a formula similar to Bill James' Pythagorean system (runs scored squared divided by the sum of runs scored squared and runs allowed squared). The total is then multiplied by 100, which expresses the number as an integer between 0 and 100, though I've begun carrying the number out to one decimal so that teams don't appear to be tied if they are not.
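As a minimal sketch, the final step looks like this in Python. It uses the classic exponent of 2 from the parenthetical; since the text says only that the formula is "similar" to the Pythagorean one, the real exponent may differ. The function name and the sample totals are invented for illustration.

```python
def power_rating(adj_runs_scored, adj_runs_allowed, exponent=2):
    """Pythagorean-style rating on a 0-100 scale, to one decimal.

    Assumption: exponent of 2, as in Bill James' original formula.
    """
    scored = adj_runs_scored ** exponent
    allowed = adj_runs_allowed ** exponent
    return round(100 * scored / (scored + allowed), 1)


# Hypothetical season totals after all adjustments.
print(power_rating(500, 400))  # 61.0
print(power_rating(100, 100))  # 50.0
```

A team that scores exactly as many adjusted runs as it allows lands at 50.0, the midpoint of the scale.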
That's about it. Pretty simple.
What can skew these results? A couple of things. First, getting squashed by a really inferior team can set you back. But more than anything, a team with an extreme record in close games will fluctuate away from its actual record. That is what has occurred in the case of Kansas. Up and down its schedule, you see that Kansas wins the close ones but often loses the blowouts. Last Sunday was the Jayhawks' season in a microcosm: they lost the first game of a doubleheader to OU 17-5 but won the second game 7-5. As I mentioned, OU doesn't get full credit for that 12-run margin in the first game but KU still comes out minus-four even before you begin to make other adjustments.
Overall, I think the system paints a very accurate picture. Nebraska is the best team in the Big 12. Texas is also strong but is a little down from what it has been in the past. Oklahoma is a veteran team that is really coming on. After that, you have two distinct groups. Missouri, Baylor and OSU are all flawed teams with good talent. Missouri has underachieved significantly and the discrepancy between its won-loss record and its power ranking reflects that.
Then you have the bottom four: Kansas, Texas Tech, Kansas State, Texas A&M. Kansas just hasn't been able to avoid the blowouts consistently enough to raise its rating. Sure, the Jayhawks have a chance to finish fourth in the conference by won-loss record. That would be a great accomplishment. But with the teams so tightly packed together, can you really say that a 10-11 team is obviously better than a 9-12 team based strictly on the conference standings? This is the complaint I've been getting from KU fans.
Other fans point to the RPIs, which have Kansas in the top 40. I don't know the formula for RPI but I am almost certain that they don't use run differentials. They simply use winning percentage adjusted for strength of schedule. (Someone is welcome to correct me if I am wrong in this belief.) Besides, it is not my intent to replicate the RPI system. That system is already out there. As I don't know how it works, I don't trust it. It is important info because the NCAA uses it and that is why I list it with the power rankings. But I don't know that it does what I set out to do: Rate teams by their true level of ability.
KU has had a great season and I really hope it continues. But my system tells me that the Jayhawks have overachieved and, ultimately, I think the system will prove to be correct. In any event, I'm glad folks are paying attention to the ratings and they've obviously accomplished one thing that I wanted them to do: spark discussion.
- Brad
AFL: week 6 power rankings & picks
Through Week 5
| TEAM | SCR% | STP% | OFFe | DEFe | EFF |
| | 66.1% | 52.6% | 4.20 | 2.61 | 6.80 |
| | 67.9% | 47.4% | 4.31 | 2.32 | 6.60 |
| | 71.4% | 34.5% | 4.55 | 1.72 | 6.28 |
| | 70.9% | 33.3% | 4.53 | 1.58 | 6.25 |
| | 66.1% | 42.4% | 4.16 | 2.08 | 6.19 |
| | 69.4% | 34.4% | 4.44 | 1.80 | 6.12 |
| | 60.9% | 50.0% | 3.75 | 2.47 | 6.11 |
| | 62.5% | 39.7% | 3.92 | 1.79 | 5.76 |
| | 59.6% | 44.6% | 3.54 | 2.20 | 5.67 |
| | 61.7% | 30.0% | 3.95 | 1.46 | 5.53 |
| | 53.4% | 41.7% | 3.09 | 2.22 | 5.32 |
| | 57.6% | 35.0% | 3.53 | 1.83 | 5.31 |
| | 54.0% | 39.7% | 3.23 | 2.03 | 5.24 |
| | 54.0% | 33.3% | 3.32 | 1.76 | 5.20 |
| | 50.0% | 46.3% | 2.81 | 2.33 | 5.18 |
| | 49.1% | 46.2% | 2.64 | 2.37 | 5.03 |
| | 64.4% | 27.6% | 3.46 | 1.41 | 4.85 |
| | 47.3% | 35.1% | 2.91 | 1.73 | 4.64 |
| LEAGUE | 60.3% | 39.7% | 3.69 | 1.98 | 5.6 |
WEEK 6 PICKS
(Last week 6-3, 13-14 overall)
Friday's game
Saturday's games
Georgia 50, Austin 42
Sunday's games
Chicago 58, Dallas 56
AFL: week 5 power rankings
Through Week 4
| TEAM | SCR% | STP% | OFFe | DEFe | TOTe |
| | 69.8% | 51.1% | 4.44 | 2.49 | 6.87 |
| | 62.7% | 53.1% | 3.86 | 2.69 | 6.47 |
| | 70.5% | 40.4% | 4.41 | 2.03 | 6.39 |
| | 73.3% | 31.9% | 4.67 | 1.63 | 6.29 |
| | 67.4% | 41.3% | 4.20 | 1.98 | 6.19 |
| | 68.8% | 33.3% | 4.35 | 1.75 | 6.01 |
| | 62.2% | 46.8% | 3.74 | 2.34 | 5.95 |
| | 71.7% | 26.2% | 4.55 | 1.23 | 5.91 |
| | 55.3% | 44.0% | 3.19 | 2.34 | 5.58 |
| | 55.3% | 46.5% | 2.96 | 2.58 | 5.57 |
| | 60.8% | 38.0% | 3.81 | 1.66 | 5.53 |
| | 56.9% | 40.0% | 3.49 | 2.10 | 5.44 |
| | 58.3% | 34.7% | 3.56 | 1.74 | 5.23 |
| | 60.0% | 24.5% | 3.88 | 1.17 | 5.21 |
| | 54.0% | 32.0% | 3.32 | 1.73 | 5.13 |
| | 47.7% | 43.2% | 2.53 | 2.26 | 4.88 |
| | 48.9% | 34.0% | 3.00 | 1.69 | 4.68 |
| | 63.0% | 29.5% | 3.22 | 1.45 | 4.68 |
| LEAGUE | 61.4% | 38.6% | 3.73 | 1.94 | 5.67 |
AFL: week 4 power rankings
AFL Power Rankings
Through Week 3
| TEAM | SCR% | STP% | OFFe | DEFe | TOTe |
| | 67.6% | 51.5% | 4.31 | 2.34 | 6.65 |
| | 79.4% | 27.3% | 5.23 | 1.28 | 6.52 |
| | 61.5% | 56.8% | 3.70 | 2.80 | 6.50 |
| | 75.8% | 32.4% | 4.78 | 1.70 | 6.48 |
| | 70.6% | 44.1% | 4.37 | 2.09 | 6.46 |
| | 71.1% | 34.2% | 4.58 | 1.84 | 6.42 |
| | 67.6% | 31.4% | 4.50 | 1.54 | 6.05 |
| | 55.9% | 48.6% | 3.17 | 2.37 | 5.53 |
| | 61.1% | 40.5% | 3.81 | 1.66 | 5.47 |
| | 60.0% | 36.1% | 3.62 | 1.79 | 5.40 |
| | 54.5% | 45.5% | 3.06 | 2.26 | 5.32 |
| | 59.5% | 28.2% | 3.68 | 1.56 | 5.25 |
| | 60.5% | 26.3% | 3.96 | 1.28 | 5.24 |
| | 52.6% | 37.8% | 3.22 | 2.01 | 5.22 |
| | 50.0% | 45.7% | 2.85 | 2.35 | 5.20 |
| | 52.8% | 37.5% | 3.07 | 2.10 | 5.17 |
| | 61.8% | 36.4% | 3.08 | 1.72 | 4.80 |
| | 42.9% | 32.4% | 2.62 | 1.58 | 4.20 |
| LEAGUE | 61.3% | 38.7% | 3.75 | 1.90 | 5.65 |
