As pointed out in this Hardball Times story by Matt Hunter, here are two opening-day pitching lines:
Matt Harrison: 5.2 IP, 6 R, 5 ER, 9 K, 3 BB, 0 HR, 0.2 WAR
Stephen Strasburg: 7 IP, 0 R, 0 ER, 3 K, 0 BB, 0 HR, 0.2 WAR
Wait. Strasburg’s 7 shutout innings in which he only allowed 3 base runners is somehow considered an “equal” performance in terms of FanGraphs WAR to Harrison’s 5 2/3 inning 5 earned run debacle??
The reason (explained much better in the Hunter link) is that FanGraphs’ WAR is based on FIP, and despite Harrison’s ugly line and the fact that Strasburg didn’t give up any runs, Harrison’s FIP was actually lower than Strasburg’s for the day. FIP only looks at Ks, BBs and homers, scaled by innings pitched, and because Harrison had many more Ks on the day his FIP is better.
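To see the arithmetic, here is a rough sketch of the commonly published FIP formula, (13×HR + 3×(BB+HBP) − 2×K) / IP plus a league constant, applied to the two lines above. Two things here are my assumptions and not from the box scores: hit batters are taken as zero (the lines above don't list them), and the constant of 3.10 is just a typical placeholder value. The constant shifts both numbers equally, so it doesn't change who comes out ahead.

```python
# Sketch of the standard FIP formula applied to the two opening-day lines.
# ASSUMPTIONS: HBP = 0 for both pitchers (not listed in the lines above),
# and FIP_CONSTANT = 3.10 is a placeholder; the real value varies by season.

FIP_CONSTANT = 3.10  # assumed, for illustration only


def fip(hr, bb, k, outs, hbp=0, constant=FIP_CONSTANT):
    """Return FIP given counting stats and outs recorded (3 outs = 1 inning)."""
    innings = outs / 3
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Harrison: 5.2 IP means 5 2/3 innings, i.e. 17 outs
harrison_fip = fip(hr=0, bb=3, k=9, outs=17)
# Strasburg: 7 IP, i.e. 21 outs
strasburg_fip = fip(hr=0, bb=0, k=3, outs=21)

print(f"Harrison FIP:  {harrison_fip:.2f}")   # 1.51
print(f"Strasburg FIP: {strasburg_fip:.2f}")  # 2.24
```

Under those assumptions Harrison's nine strikeouts more than offset his three walks, and runs allowed never enter the formula at all, which is how the rougher outing ends up with the lower FIP.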
Here’s my problem: how can you possibly trust a statistic that is this blatantly wrong on an individual game level? Both WAR and FIP accumulate over the course of a season to arrive at a measure of a player’s performance, yet clearly they both have significant individual-game issues. And as Hunter points out (paraphrased), “if you can’t trust a stat on a per-game basis, you can’t *really* trust the stat on a full season level.”
I point this out because there are far, far too many stat-heavy baseball writers out there who will literally call you an idiot if you dare use “old-time” statistics to measure a player’s season … but who also use the likes of WAR and FIP as the be-all, end-all replacements. And that’s where I have a problem.
And all of this is to say nothing of WAR’s heavy reliance on defensive stats: stats which didn’t exist 10 years ago (so how “good” or “bad” are our historical players?) and which are admittedly flawed at doing what they’re supposed to do unless every player stands in exactly the same spot at every position on every play all year. If your team employs lots of infield shifts (like, say, Tampa Bay), guess what? Your UZR rating looks fantastic. If you play in a big pitcher’s park and have a fly-ball pitcher on the mound (think San Francisco and Matt Cain), your UZR looks awesome as you chase down lazy flyball after lazy flyball. Defensive stats can’t take into account first basemen digging out throws, and they can’t measure nearly any component of catching defense outside of the basic counting stats we already had (errors, caught stealing, passed balls).
I don’t know what the solution is. But I know it isn’t to claim that WAR is the ultimate player measurement stat that lots of people believe it to be.
I never use the WAR stat. It’s too coarse … not fine-grained enough.
peric
4 Apr 13 at 2:24 pm
I’ve only gotten into baseball fandom in the past year and have a casual interest in the analytic side, even though I have a hard time wrapping my head around the math. To me, WAR represents an easy way to abstract a player’s skill set, but I agree that it shouldn’t be the only thing you look at.
That said, one component being a little broken doesn’t make it worthless; it just needs fixing.
James F
4 Apr 13 at 2:33 pm
See, but that’s the point. There are a lot of writers out there who use WAR as a be-all, end-all. Nobody acknowledges that it has serious flaws. But how do you “fix” what this guy pointed out? FIP is clearly a huge component of the FanGraphs version of WAR; what else is out there?
I can say that nearly any stat has its plusses and minuses, even the ones that stat-nerds can’t stand. Wins, RBIs, Saves. There are lots of negatives to those stats but they’re not entirely useless either.
Todd Boss
4 Apr 13 at 3:01 pm
Didn’t Voros McCracken (if I have the name right) show that a pitcher can’t control where the ball goes off the bat, and therefore that a pitcher’s K/BB/HR rates reflect a truer measure of their talent than actual results?
James
5 Apr 13 at 5:45 am
I think the statement “a pitcher cannot control where the ball goes off the bat” is more about a pitcher’s BABIP than about FIP. But even then there are absolutely guys with far lower career BABIPs than the accepted average of .290-.300 (Mariano Rivera: career BABIP .262). So why is Rivera’s BABIP so low?
Likewise, I think just looking at the three true outcomes (which FIP does) significantly undervalues a number of guys who get a ton of easy groundball outs. It’s the story of the example posted here: Strasburg didn’t get 10 Ks, but he got a bunch of weak outs, so his FIP looks worse. If I’m a pitcher and I know I can induce a bunch of bat-breaking weak grounders, isn’t that better than the work required to get a bunch of Ks? I can stay in the game longer (Strasburg was at 80 pitches through 7; easily within complete-game territory) and have a better chance of controlling the outcome of the game.
Todd Boss
5 Apr 13 at 9:45 am