19 January, 2011

Why any statistical analysis of defense is useless

There are four basic statistics collected for fielders: Putouts, Assists, Errors and Double Plays. From these, many rate stats can be calculated. The most common of these are Fielding Percentage ([PO+A]/[PO+A+E]) and Range Factor ([PO+A]/Games or Innings played). These can be compared to yearly league averages for the position, and used in many other ways, but they’re all useless.
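To make the two formulas concrete, here is a small Python sketch. The function names and the sample season are made up for illustration, not any real player's line, and the per-nine-innings scaling is just the usual way the innings version of Range Factor is quoted.

```python
def fielding_percentage(putouts, assists, errors):
    """(PO + A) / (PO + A + E): share of chances handled cleanly."""
    chances = putouts + assists + errors
    if chances == 0:
        raise ValueError("no chances to rate")
    return (putouts + assists) / chances


def range_factor(putouts, assists, innings):
    """9 * (PO + A) / innings: plays made per nine innings."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9 * (putouts + assists) / innings


# Hypothetical shortstop season (made-up numbers).
po, a, e, inn = 250, 450, 20, 1350
print(round(fielding_percentage(po, a, e), 3))  # 700 / 720
print(round(range_factor(po, a, inn), 2))       # 9 * 700 / 1350
```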

Fielding percentage might be the most useless, and most intelligent fans know why. A guy with no range in the field gets to very few balls. He’ll make very few errors, but also very few plays. A guy with great range will get to many more balls, make many more plays, and make many more errors, by muffing plays that he usually makes look easy, but which would be well beyond the reach of lesser fielders. Thus a high fielding percentage does not necessarily mean you’re good – possibly just too cautious for your team’s own good – and a lower fielding percentage doesn’t mean you’re bad – possibly just too aggressive for your own good, stats-wise.
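Here is that argument with numbers, as a rough sketch. Both fielders and their totals are invented: the "statue" reaches 300 balls and boots 3, the rangy guy reaches 400 and boots 12.

```python
def fielding_line(chances_reached, errors):
    """Return (plays made, fielding percentage) for a fielder."""
    plays = chances_reached - errors
    return plays, plays / chances_reached


# Hypothetical fielders with the same number of balls hit their way.
statue_plays, statue_pct = fielding_line(chances_reached=300, errors=3)
rangy_plays, rangy_pct = fielding_line(chances_reached=400, errors=12)

print("statue:", statue_plays, "plays, fielding pct", round(statue_pct, 3))
print("rangy: ", rangy_plays, "plays, fielding pct", round(rangy_pct, 3))
```

The statue wins on fielding percentage while the rangy fielder turns far more batted balls into outs, which is the whole problem with the stat.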

And this is hardly a new observation, or strictly a modern phenomenon. Check this out. It was written in 1917!

Range factor (plays per game) is also useless. And the problem has to do with the fact that Putouts and Assists mean very different things for Outfielders and Infielders. In the Outfield, a putout means you caught a fly ball. It’s strictly a measure of Range. In the infield, it also means the occasional infield fly, but far more often it comes from a guy STANDING ON THE BASE while SOMEONE ELSE THROWS HIM THE BALL. That’s the lion’s share of infield putouts, and they are largely a measure of nothing at all. (One’s ability to stand there and catch a ball being thrown to him by a professional.) Infielders can also slap the tag on a runner, but this is also NOT a measure of range.

In the outfield, an assist means the Outfielder gunned down a runner. It’s purely a measure of his arm. In the infield, it means the infielder GOT TO A BATTED BALL, and then threw it to the bag. His arm’s involved, but it’s a far greater indicator of his RANGE: To be able to throw out a lot of guys at first, you need the RANGE to get to more of those grounders, dribblers and one-hoppers. Compared to that, the number of fly balls and line drives caught is minuscule, and not really indicative of an infielder’s RANGE anyway. ANY infielder is in range of 90% of the Infield Flies hit, and catching line drives is more a measure of reflexes than range.

And Catchers? Well they’re all messed up. They get Assists for gunning down runners (Arm) but also for fielding bunts (Range). So there’s a little of both going on there, with any good Catcher. (Of course… these days we steal a LOT more than we Bunt, so Catcher is probably the one infield position where Assists are due to Arm more than Range.)

And then you’ve got double plays. To me THIS should be a good way to measure a guy’s arm. In the Outfield, it’s just an assist, but it’s one that occurs when a guy challenges the Outfielder and loses. In the infield, a Double Play gets credited to every guy involved. Let’s consider the classic 5-4-3 and 5-6-3 Double Plays: The 3rd Baseman HAS to make a GOOD THROW (Arm, OK); the middle infielder needs to turn it, making a GOOD THROW to first, typically while leaping through the air (Arm, OK); the first baseman… has to catch it, maybe while leaning forward. YTF should HE get credited with a double play?

Also, while an Outfielder with a GOOD arm might rack up a lot of Assists, gunning down runners foolish enough to underestimate him, an Outfielder with a GREAT Arm might not get many at all… since fewer players would be foolish enough to test him! Same goes for Catchers. Take two Catchers with a 35% Caught Stealing Rate. But one of them faces 4 attempts per game, while the other faces only 1. Same rate, but is there any difference? Why the disparity in attempt rates? Maybe that second catcher is really good, and so only the very best ever even TRY to steal on him – thus his 35% comes against only the league’s elite runners. The other guy’s closer to average, and so half the league is running on him, and HIS 35% comes against a much lower caliber of runner, on average. It’s possible that he didn’t throw out ANY of the top runners that were gunned down 35% of the time by the second Catcher.
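The two-catcher comparison can be put into numbers. This is only a sketch: the split of runners into "elite" and "ordinary" tiers and every count below are assumptions made up to match the 4-attempts versus 1-attempt scenario over 100 games.

```python
# (caught, attempts) by runner tier over a hypothetical 100 games.
catcher_a = {"elite": (0, 100), "ordinary": (140, 300)}  # 4 attempts per game
catcher_b = {"elite": (35, 100), "ordinary": (0, 0)}     # 1 attempt per game


def overall_cs_rate(catcher):
    """Caught-stealing rate across all runner tiers combined."""
    caught = sum(c for c, _ in catcher.values())
    attempts = sum(n for _, n in catcher.values())
    return caught / attempts


for name, catcher in (("A", catcher_a), ("B", catcher_b)):
    elite_caught, elite_attempts = catcher["elite"]
    print(
        name,
        "overall:", overall_cs_rate(catcher),
        "vs elite runners:", elite_caught / elite_attempts,
    )
```

Both catchers show the same overall rate, yet in this made-up split Catcher A never threw out a single elite runner.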

Anyway, leaving out Catchers and First Basemen, A/G and DP/G could be used to rate the Range and Arms of infielders, respectively, while PO/G and A/G could be used to measure the Range and Arms of Outfielders, respectively. That makes a BIT more sense to me than (PO+A)/G to measure only Range, but it’s still completely bogus, and here’s why…

There are only 27 outs per game. Period. If we count Pitchers’ strikeouts, then what you’ll find is that every single team averages the same number of Outs per Nine Innings: TWENTY SEVEN. (They can’t possibly have any more or any less!) And that’s important. Because in a stat like Home Runs, or Hits, there is no limit designed in. So players compete with all players in the league to see who can rack up the most. In the case of OUTS (PO, A and K)? Players are competing only with the other fielders ON THEIR OWN TEAM.

Let’s say you have a team where everyone is AVERAGE. Everyone is going along making X outs per game, whatever their fair share is as average fielders. They’re also making Y errors – again, whatever their share is, as average fielders. Then let’s say the team goes and signs a young Ozzie Smith to play shortstop. Suddenly twice as many balls hit between 2nd and 3rd are being caught, and fewer errors are being made. Many more than the average number of outs are being made by the shortstop. But what happens to everyone else’s defensive stats? Remember: There are only 27 outs in a game! All those extra plays being made at Short mean fewer balls being hit everywhere else on a per nine inning (or per 27-out) basis!

Say they then go and sign Brooks Robinson to play 3rd, Roberto Alomar to play 2nd, Keith Hernandez to play 1st, Johnny Bench to Catch, Roberto Clemente to play Right, Willie Mays to play Center and add Bob Feller, Roger Clemens, Nolan Ryan, Randy Johnson and Steve Carlton to the Rotation. What do you think happens to the Left Fielder’s Range Factor? Well, it goes down, of course! WAY down! Did he get worse? No. But a much higher percentage of balls hit elsewhere that used to go for hits are now being caught for outs. What’s more, with all those strikeout pitchers, fewer balls are even being put in play! And that means fewer chances for a ball to be hit his way. No matter how you cut it, straight up or against a league average, he looks worse, even though he’s the same guy, playing the same way.
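The 27-out squeeze can be sketched with a toy model. Everything here is an assumption for illustration: each batter either strikes out or hits the ball toward one fielder's zone, each fielder turns a fixed fraction of his balls into outs, and each out is credited as one play to one fielder (no errors, no double plays, no shared putout-plus-assist credit). The rates are invented, not measured.

```python
def plays_per_27_outs(strikeout_rate, zone_share, conversion):
    """Expected plays per 27 outs for each fielder in a toy model.

    strikeout_rate: fraction of batters struck out.
    zone_share: position -> fraction of balls in play hit toward it.
    conversion: position -> fraction of those balls turned into outs.
    """
    in_play = 1.0 - strikeout_rate
    # Chance that any one batter is retired by each fielder.
    out_prob = {
        pos: in_play * zone_share[pos] * conversion[pos] for pos in zone_share
    }
    total_out_prob = strikeout_rate + sum(out_prob.values())
    # Scale so that strikeouts plus fielder outs always add up to 27.
    return {pos: 27 * p / total_out_prob for pos, p in out_prob.items()}


zones = {"LF": 0.10, "SS": 0.15, "others": 0.75}  # hypothetical split

average = plays_per_27_outs(
    0.15, zones, {"LF": 0.70, "SS": 0.70, "others": 0.70}
)
# Same left fielder, but star teammates and a strikeout rotation.
stacked = plays_per_27_outs(
    0.30, zones, {"LF": 0.70, "SS": 0.85, "others": 0.85}
)

print("LF plays per 27 outs, average team:", round(average["LF"], 2))
print("LF plays per 27 outs, stacked team:", round(stacked["LF"], 2))
```

The left fielder's own conversion rate never changes between the two calls, yet his plays per 27 outs drop. Lowering everyone else's conversion rate instead has the opposite effect.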

Let’s say instead of doing what they did above, the team puts Dick Stuart at First, Harmon Killebrew at Third, Don Buddin at Short, Jose Offerman at 2nd, Dave Kingman in Left, Kirk Gibson in Right, Ernie Lombardi at Catcher and doesn’t sign any new pitchers. Suddenly, the Centerfielder now looks like a defensive superstar, at least statistically! He didn’t get any better in reality, but no one ELSE is making any plays, and remember: The game doesn’t end until 27 outs are made! And with such a low percentage of plays being made everywhere else, lots more balls (on a per 27 outs basis) are now being hit to Center. Even if his actual range doesn’t increase by a single step, the Center Fielder will be making more plays PER GAME. Fans would look at his league-leading range factor and wonder, how does that happen? He’s not even that good! But there’s the rub:

The only thing Range Factor will tell you, is that if a player (say the CF) on ONE TEAM has a higher RF, compared to the league average, than another player (say the SS) on that SAME TEAM does, then that might mean that Player A is a better Centerfielder than Player B is a Shortstop. It says NOTHING AT ALL about how he compares to Centerfielders on OTHER TEAMS, because he’s not competing with any of THEM for his share of 27 outs per game. Even looking at double plays can be misleading, because an Outfield of Carl Yastrzemski, Willie Mays and Roberto Clemente would mean a higher percentage of balls hit to the outfield would be caught, and thus fewer ground balls would be hit on a per 27 outs basis, robbing the infielders of Double Play opportunities.

And strikeouts take defensive opportunities away from EVERYONE. So that pitching staff I mentioned earlier? It would make every fielder look lazy, statistically, even that same lineup that went with them, plus Yaz. The fielders and Pitchers start to cancel out each other’s stats because no matter how good they are, a team simply cannot get more than 27 outs per game – it’s a mathematical impossibility.

I don’t have the answer. I accept, and use, dWAR but I suspect that the flaws I’ve mentioned above are inherent to it as well.
