Athenæum

Sabermetrics: Which Stat is Most Important?
Apologies to awiggins
from ESPN

I think the most important stat is how many strippers sit on your leg the night before at the Foxy Lady, but...see More for the article (mostly known stuff, no win-shares or range factor or anything, but a fun read)

ESPN.com: Baseball

Wednesday, January 21, 2004
WHIP it good ... statistically
By Alan Schwarz
Special to ESPN.com

Yes, we know. When you talk about statistics, you can't find one more important than team wins.

But when you get right down to it, wins are the destination. The trick is knowing what statistical roads lead to them most directly. That's where it gets interesting.

Every executive has his own allegiances, whether to on-base percentage (Billy Beane), RBI (Omar Minaya) or opponents' batting average (Joe Garagiola Jr.). And every statistical analyst, from Pete Palmer to Bill James to the folks at Baseball Prospectus, has spent hundreds of hours coming up with measures beyond the traditional.

This is my encapsulation, after consulting with both front-office personnel and sabermetricians, of the statistics that matter most in the game today. There are two important considerations before we start:

1) By "matter," we mean vital to the people evaluating talent, whether they be executives, fans or media. Each group has its own considerations, because ...

2) There are two kinds of baseball statistics -- those that evaluate what has happened, and those that evaluate what will happen. They are vastly different, and confusing one for the other leads to disaster. For example, if someone has a fantastic batting average with runners in scoring position, he was most assuredly valuable. Writers and fans will cast him as a hero. Studies show, however, that his GM had better not count on him being so clutch again next year. On the other side of the coin, while GMs look for good strikeout rates in pitchers when projecting toward the future, other stats tell more about present effectiveness.

We will look at reasonably mainstream statistics only. (Though they are quite interesting and valuable, statistics such as Bill James' Win Shares and Clay Davenport's Equivalent Average remain too esoteric for the masses.) For context, after each one I have listed the 2003 leader in the category and last year's average for regular players -- defined as the 165 hitters who qualified for the batting title and the 92 pitchers who qualified for the ERA title.

1. OPS
2003 Leader: Barry Bonds, 1.278.

Barry Bonds, Outfielder, San Francisco Giants

2003 SEASON STATISTICS
AB    BA     HR    RBI   OBP    OPS
390   .341   45    90    .529   1.278

2003 Regular Average: .810.

No modern statistic has inspired more allegiance than OPS. Adding a player's on-base percentage and slugging percentage gives you a very simple and accurate appraisal of his skills in both key areas of offense: getting on base and advancing runners. First tried by Branch Rickey and his Brooklyn Dodgers statistician, Allan Roth -- who didn't quite use slugging percentage but we'll cut them some slack -- OPS got its big break in May 1984 when, after Pete Palmer and John Thorn wrote "The Hidden Game of Baseball," The New York Times ran weekly charts of baseball's OPS leaders. Multiplying on-base and slugging percentages actually is more accurate, but the ease of adding them allowed the public -- and even baseball executives -- to catch on to the power of non-traditional statistics.
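For readers who want to see the arithmetic, here is a minimal sketch of both versions mentioned above: the simple sum and the product. The function names are made up for illustration; the inputs are Bonds' 2003 on-base and slugging percentages as listed in this article.

```python
def ops(obp: float, slg: float) -> float:
    """On-base plus slugging: the simple sum."""
    return obp + slg


def obp_times_slg(obp: float, slg: float) -> float:
    """The product form, which the article calls more accurate."""
    return obp * slg


# Barry Bonds, 2003: .529 OBP and .749 SLG (figures from this article).
print(f"OPS: {ops(0.529, 0.749):.3f}")  # OPS: 1.278
print(f"OBP x SLG: {obp_times_slg(0.529, 0.749):.3f}")
```

The sum lands on a scale where .810 was last year's regular average, which is part of why it caught on; the product lives on a less familiar scale.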

2. WHIP
2003 Leader: Jason Schmidt, 0.95.

Jason Schmidt, Starter, San Francisco Giants

2003 SEASON STATISTICS
IP      W-L    SO    BB   ERA    WHIP
207.2   17-5   208   46   2.34   0.95

2003 Regular Average: 1.31.

Essentially baserunners per inning, Walks plus Hits per Inning Pitched is the hurler's version of on-base percentage allowed. It doesn't take into account extra-base hits -- OPS allowed would be more helpful, of course -- but WHIP is more common and available, having hit the mainstream when it was included as one of the eight statistics used in Dan Okrent's original 1980 Rotisserie League. WHIP does a good job of looking past a pitcher's wins, and even ERA, to see how effective he truly was. As for predicting his future effectiveness, other statistics (see No. 6 below) must be taken into account as well.
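The calculation is simple enough to sketch, with one wrinkle: box scores write innings in thirds, so "207.2" means 207 and two-thirds, not 207.2. The helper names below are invented for illustration, and the sample pitcher line is hypothetical (the stat line in this article does not list hits allowed).

```python
def innings_to_float(ip: str) -> float:
    """Convert box-score innings such as "207.2" (207 and 2/3) to a real number."""
    whole, _, thirds = ip.partition(".")
    if thirds not in ("", "0", "1", "2"):
        raise ValueError(f"innings must end in .0, .1 or .2, got {ip!r}")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def whip(walks: int, hits: int, ip: str) -> float:
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings_to_float(ip)


# Hypothetical pitcher: 30 walks and 70 hits over 100 innings.
print(f"{whip(30, 70, '100.0'):.2f}")  # 1.00
```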

3. Run differential
2003 Leader: Braves (+167).

2003 Average: 0 (by definition).

In general, it doesn't matter much whether a team's pitching is stronger than its offense, or whether it plays in a good hitters' or pitchers' park. No matter how you do it, outscoring your opponents by, say, 800 runs to 750 over the course of 162 games should -- after applying what Bill James called his Pythagorean formula, though actual translations can vary -- leave you with a record of about 86-76.
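Here is a rough sketch of that translation. It assumes the classic form of the formula, with runs scored and runs allowed each squared; since translations vary, the exponent is left as a parameter. The function names are invented for illustration.

```python
def pythagorean_pct(runs_scored: float, runs_allowed: float,
                    exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and allowed.

    exponent=2.0 is the classic squared form; other values are
    sometimes used, so treat it as an assumption, not a constant.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_record(runs_scored: float, runs_allowed: float,
                    games: int = 162, exponent: float = 2.0) -> tuple[int, int]:
    """Round the expected winning percentage to a whole-number W-L record."""
    wins = round(pythagorean_pct(runs_scored, runs_allowed, exponent) * games)
    return wins, games - wins


# The 800-to-750 example from the text.
print(expected_record(800, 750))  # (86, 76)
```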

The power of this approach lies in comparing the expected won-lost record with what actually happened. A team that overperforms compared to its expected W-L was probably lucky to some extent and is a strong candidate (assuming the same roster) to slip the following year. Conversely, an actual W-L poorer than expected portends future improvement.

Looking at run differentials shows that last year's Braves and Phillies were probably closer in talent than their records showed. The 101-61 Braves had a differential that translated to an expected 97-65 record, while the 86-76 Phillies "should" have gone 91-71 -- just six games worse rather than 15. Whether due to bad luck or bad managing by Larry Bowa, the Phillies had every reason to expect a closer race in 2004, even before their trades for Billy Wagner and Eric Milton.

4. On-base percentage
2003 Leader: Barry Bonds, .529.

2003 Regular Average: .351.

Of course OPS is better, but among the pool of official MLB statistics, on-base percentage ranks as the most important. Extra bases are vital, but a lineup that preserves its outs and tires pitchers faster by taking pitches is even more deadly. A high OBP shows just how much batters such as Brian Giles contribute to the offense, and a low one (particularly for a young hitter) can be a warning signal that some work on strike-zone recognition is necessary, or else pitchers could very well figure him out. For a more specific look at control of the strike zone, something particularly useful in projecting minor league hitters, strikeout-walk ratio is worth checking as well.

5. Slugging percentage
2003 Leader: Barry Bonds, .749.

2003 Regular Average: .459.

Just as above, looking at SLG without OBP is like subsisting on food without water. Both are necessary. Slugging percentage gains some greater importance during high offensive eras, like the one we're in now, as runs are slightly easier to come by before the out clock runs out. You can look past a low SLG for a lineup's No. 1 or No. 2 hitter, because his primary role is to get on base, but the 3-7 batters must be able to do more than play station-to-station ball. One warning: Slugging percentage is more vulnerable than OBP to the batter's home ballpark dimensions.

6. Strikeout rate
2003 Leader: Kerry Wood, 11.35.

Kerry Wood, Starter, Chicago Cubs

2003 SEASON STATISTICS
IP    W-L     SO    BB    ERA    WHIP
211   14-11   266   100   3.20   1.19

2003 Regular Average: 6.21.

Otherwise known as strikeouts per nine innings, this is the statistic one must look at to predict a pitcher's future performance (particularly a young pitcher's) with any confidence. It can be a great measure of what scouts call "stuff" -- the ability to make batters swing and miss, which is vital for all pitchers but the freaks like Jamie Moyer.

Perfect example: Remember Allan Anderson, the 24-year-old lefty who won the American League ERA title in 1988 at 2.45? He struck out just 83 batters in 202 innings, or a paltry 3.70 per nine innings, which indicated that he didn't have great stuff, and that batters would soon catch up to him. That they did; Anderson was out of the majors four years later.

Bill James helped pull the curtain on these types of pitchers in the early 1980s, and the future executives who read him -- and now populate front offices around the majors, especially in Boston -- use the statistic regularly in evaluating talent.
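The rate itself is a one-liner. This sketch uses the strikeout and innings totals quoted in this section; the function name is made up for illustration, and innings are passed as whole numbers for simplicity.

```python
def strikeouts_per_nine(strikeouts: int, innings: float) -> float:
    """Strikeouts per nine innings pitched."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9 * strikeouts / innings


# Kerry Wood, 2003: 266 strikeouts in 211 innings.
print(f"{strikeouts_per_nine(266, 211):.2f}")  # 11.35

# Allan Anderson, 1988: 83 strikeouts in 202 innings.
print(f"{strikeouts_per_nine(83, 202):.2f}")  # 3.70
```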

7. Earned-run average
2003 Leader: Pedro Martinez, 2.22.

Pedro Martinez, Starter, Boston Red Sox

2003 SEASON STATISTICS
IP      W-L    SO    BB   ERA    WHIP
186.2   14-4   206   47   2.22   1.04

2003 Regular Average: 4.09.

ERA is a perfect example of the past vs. future debate. It falls short in predicting which starter will pitch best, but in terms of citing who has pitched best, it's as important as conventional statistics tend to get. It certainly beats won-lost record, which is far too dependent on a pitcher's run support and the performance of the bullpen that follows him. ERA requires you to still take the pitcher's home park into account -- as well as any oddly small or large totals of unearned runs -- but his charge is to give up few runs, period. If he succeeds at that, he has done his job.

8. Defensive efficiency
2003 Leader: Mariners, .731.

2003 Average: .710.

Even Henry Chadwick, the 19th century father of baseball statistics, knew that putouts, assists and errors by themselves are a horrible way to judge fielding. Errors don't measure how many plays a fielder successfully makes; fielding percentage doesn't measure plays per game; and plays per game, which Bill James called Range Factor but which was first proposed by Chadwick, doesn't account for groundball and flyball pitchers, or where the fielder was positioned before the ball was hit. Cats have a better chance of catching their tails.

Team defense is a different matter, though. The object of the defense as a whole is to turn balls hit into the field of play into outs. The rate at which a team has done that is called its Defensive Efficiency.

By that measure, Seattle -- with great thanks to Bret Boone at second base, and Randy Winn, Mike Cameron and Ichiro Suzuki across Safeco Field's large outfield -- ranked best in baseball at .731. While many cite St. Louis as having a great defense because of their several Gold Glovers (Scott Rolen, Edgar Renteria, Jim Edmonds ...), the Cardinals finished a surprising 11th at .712.

Many studies over the last three decades have suggested that defense is less important than many people think, because the difference between the best and worst teams amounts to no more than about one play per game. But if you want to know which team's pitchers are getting the best glove support, Defensive Efficiency is a great place to start.
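As a rough sketch of the idea, Defensive Efficiency can be treated as outs recorded on balls in play divided by total balls in play. Published versions estimate those counts from standard pitching totals in ways that differ in detail, so the function names and the sample counts below are hypothetical, as is the figure of 27 balls in play per game used to turn a rate gap into plays.

```python
def defensive_efficiency(outs_on_balls_in_play: int, balls_in_play: int) -> float:
    """Share of balls hit into the field of play that became outs."""
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return outs_on_balls_in_play / balls_in_play


def plays_per_game_gap(der_a: float, der_b: float,
                       balls_in_play_per_game: float) -> float:
    """Extra plays per game that a gap in Defensive Efficiency represents."""
    return (der_a - der_b) * balls_in_play_per_game


# Hypothetical team: 3,100 outs on 4,300 balls in play.
print(f"{defensive_efficiency(3100, 4300):.3f}")  # 0.721

# Seattle's .731 against the .710 average, assuming 27 balls in play per game.
print(f"{plays_per_game_gap(0.731, 0.710, 27):.2f}")
```

Under that assumption the gap between the best team and an average one works out to well under one play per game, which is consistent with the studies cited above.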

Alan Schwarz is the Senior Writer of Baseball America and a regular contributor to ESPN.com. His first book, "The Numbers Game: Baseball's Lifelong Fascination With Statistics," will be published by St. Martin's Press in July.

posted by prof_booty | 02:52 PM
