Monday, November 26, 2007

Blog roadmap

An anonymous commenter has asked when I'll address IBL pitching. Please read the exchange between us, which touches a bit on IBL pitchers and how to assess them.

In response, I thought I should let you know what I'm planning to cover in the future, time permitting. Let me know if I'm missing anything of interest, if you have any other comments about the agenda, or if you'd like me to focus on one topic before another. These are in no particular order.


  • Finish the batting production leaders charts: runs created per plate appearance, park-adjusted figures.

  • Calculation of score rates per runner type and estimation of runs created based on them.

  • Leaders in net runs created and lost due to base stealing.

  • Looking at frequency of taking the extra base on hits.

  • Charts of leaders by various raw pitching stats.

  • Thoughts about how to evaluate pitchers with so few starts and such unbalanced schedules.

  • Actual assessments of pitcher value, including DIPS (defense-independent pitching stats).

  • What do we really know about it in the IBL?

  • The splits: Breaking down team stats by field, opposing team, day of week, week of season, inning, etc.

  • Compilation of IBL run expectancy charts by outs and baserunner situation.

  • A look at reported attendance figures.

Can't promise how long it will take me to get to any of this... I do have other things to do with my life, believe it or not!


3rd Base Coach Perlman said...

Those Blue Sox were unbelievable on the basepaths! Lyons, Slaughter, Reese, Raymundo, Jamark... The third base coach must have known what he was doing :) because they were able to steal 115 bases and were only caught 18 times for an 86% success rate. The only team with a better rate was the Miracle, and they only stole 33 bases as a team! Mike Lyons, you are the most underrated player in the league!

iblemetrician said...

Coach Perlman,

Nice to have you here.

Bet Shemesh really did tear up the basepaths. But how much did it actually gain them?

If my linear weights estimates are correct, each stolen base was worth on average just 0.091 runs (11 SBs = 1 run), while each time caught stealing cost a team on average 0.437 runs, or 4.8 times as much.

That makes some sense, since with the high on-base rate and slugging rate in the IBL, a baserunner is highly likely to be knocked home regardless of which base he stands on.

Overall, that means Bet Shemesh picked up an estimated 10.5 runs by stealing bases, but they lost 7.9 by being caught, for a net gain of just 2.6 runs for the season.

As it happens, that led the league. Modiin, with far fewer steals, gained a net 1.3 runs, and the other four teams all had negative balances. The worst base stealing record was Tel Aviv's, with 72 steals and 25 times caught, for a net of -4.4 runs.

Was it really worth it?
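
For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation above. It assumes only the two linear-weights values quoted in this comment (0.091 runs per steal, 0.437 runs per caught stealing); the function names are my own and purely illustrative.

```python
# Linear-weights values quoted in the comment above (runs per event).
RUNS_PER_SB = 0.091   # average gain from a stolen base
RUNS_PER_CS = 0.437   # average cost of a caught stealing


def steal_summary(sb, cs):
    """Return success rate, runs gained, runs lost and net runs for a team."""
    gained = sb * RUNS_PER_SB
    lost = cs * RUNS_PER_CS
    return {
        "success_rate": sb / (sb + cs),
        "gained": gained,
        "lost": lost,
        "net": gained - lost,
    }


def break_even_rate():
    """Success rate at which the runs gained exactly offset the runs lost."""
    return RUNS_PER_CS / (RUNS_PER_SB + RUNS_PER_CS)


# (steals, caught stealing) as given in the comments.
teams = {"Bet Shemesh": (115, 18), "Tel Aviv": (72, 25)}
for name, (sb, cs) in teams.items():
    s = steal_summary(sb, cs)
    print(f"{name}: {s['success_rate']:.1%} success, net {s['net']:+.1f} runs")
print(f"Break-even success rate: {break_even_rate():.1%}")
```

Under these weights the break-even success rate works out to roughly 83%, which is why only teams stealing at a very high rate come out ahead.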

Coach Perlman said...

While I do have a pretty good understanding of stats, I have never gone as in depth as you have. From a coach's point of view, at a field like Gezer with its short fences, I see more than stats. I see outfielders playing shallow, so it might take two base hits to score a runner from first, and I certainly wouldn't be able to advance a runner from first to third on a routine single (there were very few doubles at Gezer, and almost no triples). I see opportunities to stay out of a double play when I have a slow runner at the plate, or at least a chance to force the pitcher to leave a ball up in the zone rather than attack in the dirt with a breaking ball. (Your stats won't show how many runs were scored because the player was already in motion but not credited with a stolen base.) Without that pressure, I'm sure the home run and RBI totals of Reese, Lopez, and Raymundo would have been much lower, as would their on-base percentage. I like to believe that I was able to get those players into better hitting situations by running so frequently, and even their outs were more productive because they were more likely to move runners over to third, where a passed ball can score a runner. Maybe the stats according to your formula don't show it clearly, but I do know this:

Blue Sox:
HR-60 DP-20 SO-198 SLG%.488 OB%.418
Miracle:
HR-52 DP-23 SO-190 SLG%.492 OB%.418

These stats are virtually the same, so why did we score 286 runs while they scored 218?

I say stolen bases!

iblemetrician said...


Thanks for your detailed insights. Stats must always be grounded in the reality of how the game is played, and hearing your perspective will hopefully help me better understand what the stats are saying.

I freely admit that my run estimates may be off. I'm hoping to try another approach for estimating offensive event weights, and it's quite possible I'm underestimating the value of the steal. When I have something I'll let you know.

It's true that the raw stats don't tell me who's taken extra bases on hits. I plan to extract that data from the play by play logs, and I'm pretty sure it will bump up Mike Lyons's figures. I also expect to see that runners get more bases at Yarkon than at Gezer, perhaps explaining why a few more runs were scored there than expected.

It would be interesting to see if in fact, as you suggest, batters get on base more often when there's a speedy runner on base ahead of them putting pressure on the defense.

I do have to correct a few of your numbers, though:

"there were very few doubles at Gezer, and almost no triples"

Actually, there were no triples at Gezer, 6 at Sportek and 13 at Yarkon.

Gezer was, however, good for doubles, with a double in 5.2% of all at bats, compared with 4.4% at Sportek and 4.3% at Yarkon. I estimate Gezer's park factor for doubles at 116; that is, there were 16% more doubles at Gezer than would be expected at an average IBL park. Why this was I can't say (at least not yet).
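
To illustrate how a park factor index is read, here is a deliberately naive sketch that simply compares each field's doubles rate with the unweighted average of the three rates quoted above. This is an assumption made for illustration and not the method behind the 116 figure: the naive version gives Gezer roughly 112, so the estimate in the text evidently rests on a different calculation that isn't described here. The function name is hypothetical.

```python
def naive_park_factor(rate_at_park, rates_all_parks):
    """Index a park's event rate against the unweighted mean of all parks.

    100 means average; 116 would mean 16% more events than average.
    """
    rates = list(rates_all_parks)
    average = sum(rates) / len(rates)
    return 100 * rate_at_park / average


# Doubles per at bat, as quoted in the text.
doubles_per_ab = {"Gezer": 0.052, "Sportek": 0.044, "Yarkon": 0.043}

for park, rate in doubles_per_ab.items():
    factor = naive_park_factor(rate, doubles_per_ab.values())
    print(f"{park}: naive doubles factor {factor:.0f}")
```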

Gezer was in fact weakest for singles, allowing about 8% fewer than the average field.

Finally, I'm not sure where you got your stat lines for Bet Shemesh and Modiin. The figures I have, which match those from the IBL's website, are:

Blue Sox:
HR-60 DP-20 SO-198 BB-186 ROE-53 SLG%.515 OB%.413

Miracle:
HR-52 DP-23 SO-192 BB-157 ROE-33 SLG%.458 OB%.373

I've added walks and times reaching base on error (ROE).

Bet Shemesh got on base more often and had a higher slugging percentage, as well as reaching base on error more often. Counting errors as hits, the averages rise to SLG%.563 and OB%.451 for Bet Shemesh, compared to just SLG%.488 and OB%.398 for Modiin. I think it's clear where the difference in runs came from.

In fact, both of my current run estimation formulas overestimate Bet Shemesh's production by 1-2%, while underestimating Modiin's by 2.5-4%. That might indicate that Bet Shemesh actually scored a bit less than would have been expected given the stats.

Coach Perlman said...

My mistake on OB% and SLG%. I was reading the wrong line on the IBL site: I looked at the bottom line and thought it was a total for the team.

Coach Perlman said...

I will say that when you run more, you see more fastballs, and OB% and SLG% usually increase. So do opponent errors, due to more hard-hit balls, along with errors on catcher throws and errors by middle infielders who may be moving toward a base when a ball is hit.