Monday, September 10, 2007

Park factors revisited. And recalculated.

The story so far:

Now, I'd like to recalculate the IBL park factors, and do it right this time.

Recall my earlier methodology: Average the performance at each field and divide it by the average performance across the IBL. But since this approach overweights the home teams for each field, I averaged performances on a per-team basis, and weighted the six teams equally at each field. As I noted, the flaw here is that the home teams are still overweighted on the fielding side. The effect of Bet Shemesh's and Modiin's sluggers on the Gezer averages is reduced, but the effect of their pitchers is enhanced.
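For the code-minded, here is a rough sketch of that earlier, equal-weight calculation. It is only my reading of the description above, not the actual spreadsheet: the function name, the row layout (one batting line per team per game) and the use of a pooled per-game league average as the denominator are all assumptions made for illustration.

```python
from collections import defaultdict

def equal_weight_factor(rows, field, stat="hr"):
    """Equal-weight park factor for `stat` at `field` (illustrative only).

    `rows` is a hypothetical list of dicts, one per team per game, e.g.
    {"team": "Modiin", "field": "Gezer", "hr": 2}.
    """
    per_team_here = defaultdict(list)  # team -> that team's batting lines at `field`
    all_values = []                    # every batting line, at every field
    for row in rows:
        all_values.append(row[stat])
        if row["field"] == field:
            per_team_here[row["team"]].append(row[stat])

    if not per_team_here or sum(all_values) == 0:
        return None  # nothing played at the field, or no events league-wide

    # Average each team separately, then average the team averages, so a
    # home team with many games counts no more than a rare visitor.
    team_means = [sum(v) / len(v) for v in per_team_here.values()]
    field_mean = sum(team_means) / len(team_means)

    # Assumption: the league average is the pooled per-game rate.
    league_mean = sum(all_values) / len(all_values)
    return field_mean / league_mean
```

Note that the equal weighting only applies to the batting team in each row. The opposing pitchers are still whoever happened to be on the mound, which is exactly the flaw described above.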

To check the magnitude of this problem, I applied the same methodology to pitching statistics: the same runs and hits, but credited to the fielding team, with each team's average weighted by the number of games it played at the field. This time, I would expect the home teams' hitting to be overweighted. Sure enough, Gezer's home run ratio shot into the sky. Instead of the 1.32 park factor relative to average calculated from batting statistics, pitching statistics yielded a factor of 1.78, compared with just 0.53 at Yarkon (down from 0.63).

Weighting by games played per team meant that games played by visiting teams facing Bet Shemesh and Modiin were overweighted, and the influence of the home runs they gave up was exaggerated.

So I discarded this method. It doesn't make much sense anyway.

Instead, I set out to compute "conventional" park factors.

But with the IBL, nothing is conventional. I can't just compare home games with away games, since some of the "away" games were played at the home field against each team's partner sharing the field. I could just compare all games played at Gezer with all of Gezer's home teams' away games, but then the schedule would be unbalanced, since Modiin and Bet Shemesh never played each other outside Gezer.

Also, other games were played at Gezer (and Sportek and Yarkon) without the participation of the fields' home teams at all.

So here's the idea. For each field, find all "matchups" of two teams which played at least two games at that field, as well as at least two games at other fields. That is, we're building a list of team pairs whose play at, say, Gezer can be compared with their play elsewhere. Then compare the average performance in all the Gezer games on the list with the average performance in all the other games on the list (and add a correction factor to give the result in terms of a multiple of the average field).
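The matchup selection can be sketched in a few lines of Python. This is a minimal illustration, not the calculation I actually ran: the function name and the game-record layout are made up, and it stops at the raw ratio of "at this field" to "elsewhere", leaving out the correction factor that rescales the result to a multiple of the average field.

```python
from collections import defaultdict

def matchup_ratio(games, field, stat="hr", min_games=2):
    """Raw per-game ratio of `stat` at `field` versus elsewhere,
    using only team pairs with enough games in both places.

    `games` is a hypothetical list of dicts, one per game, e.g.
    {"team_a": "Modiin", "team_b": "Netanya", "field": "Gezer", "hr": 3},
    where the stat is the combined total for both teams in that game.
    """
    here = defaultdict(list)       # pair -> stat values in games at `field`
    elsewhere = defaultdict(list)  # pair -> stat values in games at other fields
    for g in games:
        pair = frozenset((g["team_a"], g["team_b"]))  # order of teams is irrelevant
        bucket = here if g["field"] == field else elsewhere
        bucket[pair].append(g[stat])

    # Keep a matchup only if it has at least `min_games` at the field
    # and at least `min_games` away from it.
    qualifying = [
        p for p in here
        if len(here[p]) >= min_games and len(elsewhere.get(p, [])) >= min_games
    ]

    at_field = [v for p in qualifying for v in here[p]]
    away = [v for p in qualifying for v in elsewhere[p]]
    if not at_field or sum(away) == 0:
        return None  # no qualifying matchups, or nothing to divide by

    return (sum(at_field) / len(at_field)) / (sum(away) / len(away))
```

With invented numbers: if two teams combine for 4 and 2 home runs in their two Gezer games and 1 and 1 in their two games elsewhere, the raw ratio is 3.0. A pair that met only once at Gezer is ignored, however many home runs were hit that day.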

I didn't know how balanced the lists of games would be. I was pleasantly surprised. I'll give you the details when I have more time. For now, just the bottom line: The park factors, calculated "conventionally" by comparing games by the same pairs of teams at one field versus games played by the same pairs of teams elsewhere.

Note that the home run factors are bigger than before: 1.41 at Gezer and 0.51 at Yarkon, compared with 1.32 and 0.63 before. This I attribute to the fact that the earlier approach actually underweighted Gezer's home teams. But still, the range of park factors for runs and hits is not very large, even surprisingly narrow.

The IBL fields may be very different, but I don't think you can fairly call any of them a hitter's or pitcher's park.
