3 Comments

Daniel, interesting and well-written post. I appreciate your research into original newspaper source material from the 1908 season. I have set up a replay of another dead ball season, the 1916 AL, using NP III. With one exception, so far the results are pretty much on target: low-scoring games with dominant pitching. The one exception is that a disproportionately large percentage of runners are caught stealing. In real life, according to both Baseball Reference and Retrosheet, the stolen base success percentage was 76.7%. In my replay, it's only 38.9%. Even accounting for the possibility of data challenges from that era, this does seem to be off significantly. It looks like the same thing may be happening in your 1908 replay. Comments?

Interesting observation!

How many games have you played? There is a chance that things will even out over time.

I don't think that NPIII necessarily uses the same estimates that Baseball Reference and Retrosheet use. I sent an email to the Baseball Reference team a few months ago asking what the source was for pre-1920 caught stealing numbers. Their response was that they are "based on a formula that Pete Palmer developed based on catcher assists, opponent outs on base and stolen bases allowed."

Now, for a baseball simulator to work correctly, you really have to come up with similar estimates. This is because you need to account for all outs: outs are really the "currency" of baseball (something I plan on writing about later). I would be surprised if Bill Staffa used extremely different numbers from what shows on BBRef and the other websites, though it always is possible.

There are a few somewhat old discussions on the NPIII boards about this issue. You can find one here: https://forums.delphiforums.com/n/pfx/forum.aspx?webtag=skeetersoft&nav=msgwin&search=y&msg=9602.8

My old 1900 replay (the thread is here: https://forums.delphiforums.com/skeetersoft/messages/8607/8 ) had realistic SB and CS numbers by the end. Looks like I had 1,535 stolen bases, compared to 1,686 in real life. I also had 1,415 caught stealing, as compared to 1,351 in real life. That gave me a 52.03% successful stolen base percentage, as opposed to the 55.52% success rate in real life.
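For anyone who wants to check the arithmetic, those percentages are just successful steals divided by total attempts (SB plus CS). Here is a minimal Python sketch using the totals quoted above; the function name is only illustrative and has nothing to do with how NPIII works internally.

```python
def sb_success_pct(stolen_bases: int, caught_stealing: int) -> float:
    """Return the stolen base success rate as a percentage of all attempts."""
    attempts = stolen_bases + caught_stealing
    if attempts == 0:
        raise ValueError("no stolen base attempts recorded")
    return 100.0 * stolen_bases / attempts


# Totals quoted in the comment above for the 1900 season.
replay_pct = sb_success_pct(1535, 1415)  # replay: SB, CS
actual_pct = sb_success_pct(1686, 1351)  # real life: SB, CS

print(f"Replay:    {replay_pct:.2f}%")
print(f"Real life: {actual_pct:.2f}%")
print(f"Gap:       {actual_pct - replay_pct:.2f} percentage points")
```

The same function works for any season where both stolen bases and caught stealing are available, whether recorded or estimated.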

I did just about nothing to stop players from stealing bases during that replay, which might account for the slight discrepancies. It was still extremely close, however.

Anyway, great point! Caught stealing is one of the really tricky statistics to get right in game design, especially for seasons in which the stats just weren't kept. It's difficult for most games that offer "manual stealing" by the user, since you might have replayers who are either too conservative (resulting in far too few attempts), or are like me and steal far too often (resulting in way too many attempts). Diamond Mind uses a "jump" rating to try to rein people like me in, but doesn't do much to force stolen bases from those who play the game more conservatively.

Daniel, your points are well taken. My sample size is certainly small, only 8 games, so it's possible that things could change as I play more. You're also correct that record keeping way back in the early 20th century was less than perfect.

My only dead ball replay experience prior to this one was of the 1908 AL in Diamond Mind (v7) using an excellent home-brewed database put together by Chris Joyce and Norm Price. As with all of my season replays, I managed for both teams (except for base running on hits and fly outs). I called stolen bases myself. In that replay there were 1316 successful steals out of 1882 attempts (69.9% success rate). That comports very closely with the 1350 stolen bases in the AL in "real life" according to Baseball Reference, and with DMB's estimate of a 67% success rate for the "average" runner. Baseball Reference has no caught stealing numbers for the AL in 1908. Apparently, that stat was not kept in 1908. It appears that it was added by 1916.

Of course, even when specific stats were officially tabulated, imperfections sometimes occurred. Ten or 15 years ago I attended a panel at a SABR convention in which an author presented a paper indicating that Hack Wilson's record-setting RBI total in 1930 was actually 191 rather than the 190 it was thought to be for decades. The author combed box scores and newspaper accounts of the Cubs' games in 1930 and was able to convince the powers that be that Wilson had previously been shorted an RBI and the record was officially changed thanks to his efforts.

Daniel, I'm sure you'll agree that we baseball season replayers are a rare breed indeed. At age 79 (I'll turn 80 in July) I just hope that we're not also a dying breed. Before I retired, I conducted market research in the TV industry. Once, before moderating a focus group, I showed my client (a dedicated baseball fan, but not a replayer) what I was doing with DMB. Coincidentally, I was engaged in my 1908 AL replay. He could only describe it as "dead man's rotisserie baseball."
