Tuesday, October 16, 2007

More on league quality estimation

The previous post on estimating the level of play in the IBL generated some interesting comments, including on the Baseball Fever Sabermetrics Forum and Tom Tango's blog. Also, Rabbi Jason Miller noticed my citation of his game observations, and commented.

I'd like to respond to the comments, and add some more observations of my own.

Why errors and steals?

Tango is surprised that error rates and stolen base rates correlate at all with the level of the league. After all, the reason batting averages, or walk and strikeout rates, don't track the league level is that they are the result of the confrontation between the batter and pitcher/fielders. Better leagues have better hitters, but also better pitchers and fielders. On the whole, they balance each other out, so the majors don't have higher batting averages or walk rates than weaker leagues. Sometimes pitching overpowers hitting or vice versa, but there's no connection between the relative strength of hitters and fielders and the overall level of league play.

You might expect the same to apply to errors and stolen bases. An error is not just the fault of the fielder. Some batters consistently reach base on error far more often than other batters, presumably because they're hitting more hard-to-field balls. Shouldn't that balance out the stronger fielding in the stronger leagues?

A stolen base certainly is not the sole fault of the fielding team; arguably, it's first of all a skill of the baserunner. So why should weaker leagues have higher steal rates? Don't they have less skilled runners?

On the one hand, the graphs speak for themselves. The correlations between league level and error rates per at-bat (0.93) and stolen base rates per runner on base (0.85) are stunningly strong. If you leave out the inconsistent rookie leagues, they're even higher (0.97 and 0.88, respectively). But that doesn't absolve us of the need for an explanation.
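For anyone who wants to follow along, these are plain Pearson correlation coefficients. Here's a minimal sketch in Python; the `pearson` helper is my own, and the rates below are illustrative placeholders, not the actual per-league figures:

```python
# Sketch of the correlation calculation. League levels are coded
# 1 (rookie ball) through 7 (the majors); the error rates are
# made-up illustrative values, not real league data.
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

levels = [7, 6, 5, 4, 3, 2, 1]  # MLB down to rookie ball
error_rates = [0.017, 0.022, 0.025, 0.028, 0.031, 0.033, 0.038]  # errors per AB, illustrative

r = pearson(levels, error_rates)  # strongly negative: better league, fewer errors
```

(The sign is negative because higher league level means fewer errors; the magnitudes quoted above are what matter.)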

The answer, I think, is that the league-level variations we see in both error rate and steal rate primarily reflect the quality of the fielding. It may be true that some hitters are better able to hit balls that are hard to field, but at lower levels of play that's not the main factor in producing errors. To quote myself:

What I think you're seeing with the top major leaguers is an ability of exceptional batters not just to "hit it where they ain't", but also to "hit it where it's hard to field". What I think we're seeing with high overall league error rates in the minors is at the opposite end of the defensive ability scale - not balls hit where it's hard to play them, but routine plays that the sub-major-leaguers flub: dropped catches, wild throws, bobbled grounders.

That is, I suspect that the further you go down the ability ladder, the more errors reflect unprofessional fielding rather than skillful batting. Hence, overall higher error rates in overall weaker leagues.

A similar argument can be made regarding steals. While running speed is important in baseball, it's not necessarily much greater in the majors than in weaker leagues. What is substantially higher is fielding ability, the result of more experience and of winnowing out the poor fielders. Plenty of minor league players can run as fast as their major league counterparts, but their fielders aren't as practiced at holding runners on base and throwing them out at second.

The upshot of this analysis is that both of these measures are, at least at league level, essentially indicators of fielding ability. We still have no independent measures of league level based on batting ability or pitching ability. The assessment is very one-dimensional. Unfortunately, stats such as wild pitches or hit batters do not seem to be available for the minor leagues; they could be good indexes of pitcher skill.

More about the stats and graphs

Tango is probably right in suggesting that I had the denominators wrong - errors should be measured per at-bat, and steals per runner on base. In practice, though, those changes don't affect the results in any significant way.

On reflection, I would drop the "unearned runs" and "defense efficiency" measures. The former is just a roundabout and unreliable way of measuring the error rate - it might be useful if you don't have error stats, but it's generally better to measure errors directly. The latter measures the defense's success in putting out batters on balls in play. However, the correlation between batting average on balls in play (BABIP = (H - HR) / (AB - HR - SO)) and league level is very weak (see below). In practice, then, the DER graph is also just another way of measuring the error rate. That leaves us with two relevant stats: errors per at bat and stolen bases per runner on base.

We can plot them against each other for another picture of the league quality level (click to enlarge):

In this graph, I've indicated the league level by the plot symbol: blue spheres for the majors, green spheres for AAA, gray spheres for AA, red spheres for A+, gold spheres for A, gray diamonds for A-, orange spheres for rookie leagues. Three independent leagues have been marked with stars: the Atlantic League (red), Canada's Intercounty Baseball League (orange), and the Israel Baseball League (blue). The regression line is based only on the majors and ranked minor leagues, including the rookie leagues but excluding the independents.

With the exception of the steal-frenzied IBL, the relationship between the steal rate and error rate is clear and strong (0.92 for the ranked leagues). Also, the grouping of leagues by level is mostly distinct. AAA and AA seem quite close in level here - maybe fielding levels aren't different enough to distinguish between them. Note that the Atlantic League falls in the AA-AAA area, as both the league and observers generally claim. A and A- leagues are quite close, but A+ is clearly at a rank of its own. And the rookie leagues show a wide range of levels, but they cluster quite close to the SB/E regression line (with the Canadian IBL somewhere in the middle).

Arguably, the distance along this line could be used as an estimate of league quality, at least as indicated by fielding ability. I'll try to calculate those estimates, time permitting.
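The calculation itself is straightforward: fit a least-squares line to the ranked leagues' (error rate, steal rate) points, then take each league's signed projection onto that line. A sketch, with made-up points rather than the real league rates, and helper names (`fit_line`, `line_score`) of my own invention:

```python
# Sketch of the "distance along the line" score. Each league is an
# (error rate, steal rate) point; its signed projection onto the
# fitted line, measured from the centroid, is a one-number quality
# estimate. The points below are illustrative, not actual rates.

def fit_line(pts):
    """Least-squares slope through the points, plus the centroid."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return mx, my, sxy / sxx

def line_score(pt, mx, my, slope):
    """Signed distance of pt along the line, measured from the centroid."""
    norm = (1 + slope ** 2) ** 0.5
    ux, uy = 1 / norm, slope / norm  # unit vector along the line
    return (pt[0] - mx) * ux + (pt[1] - my) * uy

ranked = [(0.017, 0.05), (0.022, 0.07), (0.028, 0.10), (0.035, 0.14)]
mx, my, slope = fit_line(ranked)
scores = [line_score(p, mx, my, slope) for p in ranked]
# Lower score = fewer errors and steals = stronger league on this scale.
```

The scores are only a relative ordering, of course; calibrating them against an external benchmark would be a separate exercise.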

Without further ado, here's the graph of BABIP I promised. There is a correlation between BABIP and league level, but it's weak (0.33), so it's of little value for assessing league quality.
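For reference, BABIP is computed from standard batting totals using the formula given above. A minimal sketch, with made-up league totals rather than actual figures:

```python
def babip(h, hr, ab, so):
    """Batting average on balls in play: (H - HR) / (AB - HR - SO)."""
    return (h - hr) / (ab - hr - so)

# Illustrative league totals (not actual numbers from any league):
rate = babip(h=1450, hr=120, ab=5400, so=1100)  # about .318
```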

A final comment on the stats. Sabermetricians have often derided the error stats and fielding percentage, not without good reason: "Errors and therefore fielding percentage are an inadequate way of measuring fielders because of the subjective nature of the decisions and because they only record failures and thus fail to take into account the fact that good fielders cover more ground and therefore record more outs" - Dan Agonistes.

But in the aggregate, I think I've shown that errors are a relevant measure of league quality level, and one of the few such measures that are widely gathered and published for baseball leagues of all levels of play. Keep that in mind next time someone touts his new top-secret formula for assessing fielding ability or league quality.

And now back to the rabbi.

Rabbi Miller defers to the judgment of Jay Sokol, who attended the IBL game with him:
Jay is the General Manager for the Delaware Cows of the Great Lakes League, which is a summer league dedicated to helping college players get used to the wooden bats they'll use in the minor leagues. Jay thought the level of play in the IBL was very similar to the wood bat summer league. He even recognized an IBL player whom he previously scouted for the Cows.

I certainly defer to Sokol's baseball judgment - I'm just a fan and a novice sabermetrician. I would point out, though, that the game they watched was between Netanya and Raanana, two of the IBL's weaker teams (at least until Netanya's closing weeks). The game's box score and play-by-play log indicate that Raanana committed five errors - high even relative to their own average (2.1 errors per game, the highest in the IBL). So I wouldn't rely on a single game to assess the IBL's level of play. But thanks for the input!


Rabbi Jason Miller said...

You are solely basing your analysis of the level of play on statistics. I find this very interesting (I loved reading "Moneyball" by Michael Lewis), but I am also interested in those factors that cannot be determined based on statistics alone.

The majority of the players in the IBL were foreigners, spending the summer in Israel. I am sure that being thousands of miles from home and living in a foreign country where you don't speak the language has a psychological effect on the players' level of play. While many of the IBL players had other international playing experience and were used to playing in a foreign country, there must be an advantage when a minor league (AA or AAA) or college player is playing in his home country.

Secondly, I wonder if the fact that it was the first season of this league was a factor in the league's level of play. There were some organizational problems at the beginning of the season that might have contributed to the level of play. Further, I suspect that the quality of the playing fields has a psychological effect on players' performance. The IBL fields were similar to high-school fields in the U.S. in my opinion. If these same teams played in minor league stadiums, perhaps they would subconsciously play at a higher level.

The best way to find out how the IBL stacks up against other baseball leagues around the world, however, would be to ask the players.

iblemetrician said...

Thanks for the comments.

I'm certainly not basing my assessments solely on statistics. If you go back to my original posting on the subject, you'll find that the first two approaches I list for assessing league quality are "Subjective impressions" and "Where the players come from", both of which point to a level of play somewhere between college and the lower levels of the minor leagues.

Since I'm not a social scientist, I don't intend to survey the players and coaches on their assessments of the league. But I agree that it's a valid approach. (I'm not sure it's the best way, but it's certainly a legitimate one.)

The problem with subjective impressions is that they are, well, subjective. Statistics can offer an objective method for confirming or rejecting subjective impressions. If the statistical analyses were far off from the subjective impressions, I'd have more cause to question them.

(Certainly, the statistical techniques available for assessing league quality at these levels of play are of very limited accuracy and should only be seen as valid in a broad sense. Especially given the limited data available from the IBL's short season, all analyses should be taken with a grain of salt.)

The psychological issues you raise are interesting, but I'm not aware of any meaningful way to study them. Certainly not in my fields of expertise!

Regarding the startup glitches - and the fact that teams had no "spring training" - I think it would be valuable to look at how the quality of play changed over the eight-week season.

Tangotiger said...

The park for fielders and the runs per game also highly influence errors and steals.

Since parks are not randomly distributed among rookie ball through MLB, you have a very real bias here.

There's also potential bias for scoring errors.

As for runs per game, the more the runs, the less likely you are going to steal. Simply put, the break-even point goes up, as the number of runs scored per game goes up.

Finally, the younger you are, the more you steal. The aging pattern clearly shows a peak in steals right around 23 years old. If you have a league filled with 19- and 20-year-olds, of course they will steal a lot more. Furthermore, your strength (throwing arm for catchers) probably peaks much later.

What you are witnessing has a lot of biases. Unless you account for each one, what you are witnessing is not necessarily the result of the one you are hoping for.

You may be proved right in the end, but you haven't done so yet.

iblemetrician said...


I appreciate your constructive comments.

The park quality distribution issue doesn't explain the long-term reduction in error rates in the major leagues - unless you claim park quality has improved over time, which is certainly possible but far from a given.

Regarding scoring, I suspect that even though scorers are supposed to use an "ordinary effort for average fielder" standard for judging an error, in practice they just think "he should have had that" and judge a play an error even if it was beyond the ability of the league average fielder. So weaker fielding skills do show up as higher error rates.

Steals may indeed be less valuable as the run level increases. But I haven't seen any correlation between league quality level and runs per game, whereas there is a strong correlation between league level and the steal rate, at least among the contemporary minor leagues (it does not seem to hold historically). I don't see any link between the run scoring rate at the different league levels and the steal rate.

In fact, the IBL has by far the highest level of runs per inning, and a relatively high level of runs per game, along with its frenzied steal rate.

(Actually, I would think that the steal rate should be negatively correlated with the home run rate or the slugging average. If you're likely to get knocked in by an extra base hit, why take the risk stealing bases?)

Your point about age is noted. I'll take a look soon at the IBL's steal leaders, and I'll make sure to check their ages.

Not surprisingly, one of the variables which correlates strongly with the minor league level is the average age of the players, but that's obviously a function of the fact that it takes time to progress up through the minor leagues, rather than any indication of baseball skills. I'm well aware that not every correlation indicates causation, and as you say this might explain the steal rate.

Tangotiger said...

Actually, my focus was only on the same-year rates that you are showing for various levels of play, not the year-to-year changes.

You have more steals in the lower levels than in the higher levels because of the age of the players. And you would have an inverse relationship between HR and SB.

The MLB Parks are undoubtedly of better quality than lower level parks.

The scorers are obviously different, so who knows what standards they are all using.


As for the year-to-year changes in errors, again, you have many variables: the gloves, the shoes, artificial turf, better groundskeeping on the newer natural surfaces. These are all examples of biases.

iblemetrician said...

Your points are taken. I'm still thinking this over for now.