Good Baseball Writing
This article includes affiliate links: click at your own risk.
I made a new friend recently. The funny thing, though, is that I don’t know his name, have only a vague idea of where he lives, and know him only through anonymous interactions on an infamously frivolous website.
My friend is Not Gaetti, the owner of a Twitter account that started to explode a few months ago. My friend gives voice to the voiceless, making witty posts that poke fun at modern analytics and provoke dozens of angry replies.
But this isn’t an article about the good and bad of modern analytics, or even about deliberate internet trolling.
It’s about writing.
Do you know why Not Gaetti gets people to read his content? It’s not because of his sarcasm, or his humor, or the fact that he’s taking a contrary position. It’s got nothing to do with social media being a magnet for negativity, or with whatever cynical explanation the critics of social media bandy about.
Nope. Not Gaetti gets people to read his content because he’s a good writer.
Here’s a good example:
Now, I don’t know how much exposure Twitter’s algorithm gave this post. This is a random Tweet I found within the last month that proved to be popular.
It’s also very easy to understand.
Yes, I know that Nolan Ryan’s high walk counts were a problem. I know that all the things we say about how durable he was really weren’t true: he had numerous health problems in the 80s, problems that we forget about in the warm sunlight of hindsight. I know what his WAR looks like, and I know what the analysts say.
But they can’t say it as simply as this.
We all know what a strikeout is. We know that it isn’t easy to get a batter to strike out — though it does seem easier these days than it once was.
We also know how long a year is, of course, and we have a feeling for how long an average career lasts.
Now, 285 strikeouts for a pitcher in a single season is a lot, even in our day of free swinging and Swiss cheese bats. Doing it 20 years in a row is unfathomable. In fact, I’d bet that most pitchers these days aren’t going to have 20-year careers, no matter how many times they get Tommy John surgery.
And all of that would still put you behind The Ryan Express.
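For anyone who wants to check the math, here is a quick sketch in Python. The 285 and the 20 come from the paragraph above; Ryan’s career total of 5,714 strikeouts is a number I’m supplying myself, since it isn’t quoted anywhere in this article.

```python
# Arithmetic behind the "20 straight seasons of 285 strikeouts" comparison.
STRIKEOUTS_PER_SEASON = 285     # figure quoted in the text
SEASONS = 20                    # figure quoted in the text
RYAN_CAREER_STRIKEOUTS = 5714   # Nolan Ryan's career total (supplied here, not from the text)

hypothetical_total = STRIKEOUTS_PER_SEASON * SEASONS
shortfall = RYAN_CAREER_STRIKEOUTS - hypothetical_total

print(f"Hypothetical pitcher: {hypothetical_total} strikeouts")
print(f"Still behind Ryan by: {shortfall}")
```

Twenty years of that pace gets you to 5,700, which is 14 short.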
That is the way to write about baseball.
On Writing Well
I know that this is the most pretentious article I’ve ever written.
It’s hard to write about the act of writing when you’re not really that thrilled with your own writing. I don’t show it all that much, but the secret is that I become really frustrated with my own lack of writing skills quite frequently.
I also find myself falling back into easy conventions and well-worn tropes. And, honestly, it makes me less willing to put in the effort I need to write something worth reading. After all, it’s much harder than it looks.
This morning I was thinking about this while reading On Writing Well by William Zinsser. Yes, that’s right: those of us who aspire to be “professional” writers actually do take the time to read about how to write. No, we’re not born with it. In fact, if you asked my 7th-grade English teacher, she’d tell you I’m the last person in the world who should try to make a living doing this.
Anyway, Zinsser takes time to write about good sports writing, and mentions baseball specifically. If you love sabermetrics, this passage should feel at least a little uncomfortable:
You can see why the article Zinsser quotes is bad, right? It’s nothing but stats and figures. We don’t know anything about Pat Sullivan, even though his team won a decisive victory. We know a bunch of statistics about pass-happy John Reaves. We also know a little bit about Zeke Bratkowski, who must have been happy to see that his record was now demoted to a footnote.
It’s basically the prose equivalent of showing somebody a spreadsheet. Yeah, there’s data there — but it’s nothing but numbers. It doesn’t tell a story. And, worst of all, numbers are easy to come by. Anybody could write this, and yet nobody wants to read it.
Zinsser’s advice? It’s the same as what Not Gaetti has discovered. Look for the human element, talk like an actual person, and you’ll be fine.
Fixing Bad Baseball Writing
Now comes the fun part.
As much as I like to dunk on WAR and other modern statistics, the real problem isn’t the stats. The real problem is the way people talk about the stats.
Don’t get me wrong: I know that Twitter is an awful place to talk about abstract things. It’s hard to fit complex concepts into short tweets, and you know that your multi-tweet threads will almost certainly go unread. However, the lowest-hanging fruit is on Twitter, and so it is to Twitter that we go. I’ll keep these quotes anonymized to protect the innocent.
Here’s a question that I consider quite valid:
Makes sense, right? If strikeouts don’t really matter for batters, why should we care about how many strikeouts pitchers record?
Check out this response:
All right — this is where you’ve lost me.
I mean, what do we mean by “batted ball quality”? And how is it relevant to the subject?
I assume that this is a variant of the Fielding Independent Pitching argument that is currently in vogue, the argument that pitchers can’t really influence where a batter hits the ball. If you follow the logic behind that assertion (logic that is extremely counterintuitive), you can see why we’d want pitchers who specialize in avoiding contact.
If that’s the case, why not just respond with something like “Because pitchers can’t control where batters hit the ball”? Why bring up “batted ball quality”?
The second part is the same unfiltered statistical stream that Zinsser quoted above. In fact, it’s worse, since there is no context. Am I supposed to be impressed with Nick Pivetta’s strikeout percentage? Do we automatically assume that wRC+, which is an extremely abstract and normalized measure of offense, is the most natural way to measure offensive efficiency? Is a 25.2% strikeout rate for batters surprisingly high, average, a bit low, or what? It’s impossible to say without context.
Yeah, I know it’s a tweet, and that there’s only so much room. Why not just say “Pitchers can’t control where batters hit the ball, but it seems that batters can control that,” and leave it at that?
Let’s try another.
You’ll be able to figure out who wrote this without much research — my apologies. And, yeah, I know it’s a chart, not prose. But we still need to talk about it.
First of all, there’s no pattern here whatsoever. Why present this as a scatterplot? The only interesting information is at the extremes: Kensuke Kondoh and Tomoya Mori, who seem to provide extremely good value, and poor Sheldon Neuse, who can’t seem to do anything right.
Most players are scrunched together in the middle. That’s because we’re measuring wRC+ against whatever “defensive value” is supposed to be. Since wRC+ is a normalized statistic, we assume that it’s going to center around 100, as it does here. I’m assuming that “defensive value” uses 0 as its base number.
And that’s the reason why this graph is so bizarre.
Shota Morishita seems to be the personification of Joe Sixpack here, the most average possible offensive and defensive position player in NPB. Why is he remarkable? What about the 20 or so guys right next to him?
Why not write a single tweet about each of these guys?
I didn’t know this until I looked it up, but Neuse played for Oakland last year, striking out 80 times in 293 plate appearances, hitting .214, getting a handful of extra-base hits, and overall playing quite poorly. I’m not sure why Oakland picked him up; he was even worse for the Dodgers in 2021, hitting a weak .169 with 3 home runs in 66 plate appearances.
Why not talk about Sheldon’s woes? I would think that he’d be able to turn things around for the Hanshin Tigers this season. He is hitting .250, but his power just isn’t there. That brilliance he showed in Las Vegas in 2019, where he hit 27 home runs, apparently shows that he can hit AAA pitching but nothing more. Maybe pitching in Japan is better than the Pacific Coast League, or maybe the food and climate are getting to him, or maybe, at the ripe old age of 28, he just can’t adapt the way he once could.
There are stories to be told here, but they’re lost in a ridiculous scatterplot of a statistic that really shouldn’t be graphed.
Here’s one more, and then I’ll shut up:
This was part of a long thread comparing Yadier Molina to Jorge Posada.
Now, I know that discussions like this will be heavy on the statistics. However, this tweet is especially guilty of throwing out stats and terms that most people don’t know.
We know what hits are, what doubles are, what batting average is, what stolen bases are (but do we really care about how many times catchers steal bases?), and what All-Star Game appearances are. I’m not worried about any of that stuff: that boils down to an argument about which stats are actually relevant, not about language usage or clarity.
Now, “GG” here means “Gold Glove.” Gold Gloves are one of those awards that modern analysts hate because of their subjectivity. So why bring it up here?
And what is a “Platinum Glove”?
Well, it turns out that it is an award determined by fan voting for the best Gold Glove recipient in each league:
Jorge Posada never won a Gold Glove, which means he was never eligible. He also retired the year that the Platinum Glove award debuted. In other words, if one player couldn’t possibly have qualified for the award because it didn’t even exist for nearly all of his career, why cite it in a comparison?
Designated Hitter starts would seem to be an argument in Posada’s favor, right? I mean, if you’ve got a good hitting catcher who needs a day off, you’d want him to be the designated hitter for you, wouldn’t you? The idea implied here, that the Yankees had to stick Posada in the DH spot because his fielding was so atrocious, is bizarre.
DRS is similar to “defensive value” above. It’s Defensive Runs Saved, a modern statistic that uses proprietary information from Baseball Info Solutions to chart where each ball is hit and assign it a numerical score. It’s not entirely arbitrary, though I would argue that you can’t call its methodology completely objective: after all, it’s all about assigning numbers to certain plays based on charts, graphs, and comparison to other similar plays. It’s not as bad as getting the sportswriters together to vote on the Gold Glove winner, but it’s also not quite the same as counting the number of times the player hit a home run.
Oh, and DRS only goes back to the 2003 season. So much for the first 8 seasons of Posada’s career!
Caught stealing percentage is also not a great indication of catcher ability for pretty obvious reasons. If you’ve got a good runner on first base against a catcher with a poor arm, you’re more likely to attempt to steal second than if the catcher has a good arm. At some point in time you just don’t attempt the steal unless you know you’ve got an excellent chance at making it — which, naturally, is going to skew the percentage of base stealers that the catcher throws out.
The same thing applies to outfielder assists, by the way. After watching Dave Parker throw runners out at third base and at home plate in the 1979 All-Star Game, you probably aren’t going to tell your runner to take an extra base against him.
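The caught stealing skew is easier to see with a toy example. Everything in this sketch is made up for illustration: the two catchers, the throw-out chances, the number of runners, and the rule that a runner only goes when his chance of being thrown out is 45% or lower. None of it comes from real players. The point is only to show how runners choosing their spots can blur the gap between a strong arm and a weak one.

```python
# Hypothetical numbers throughout: none of these come from real catchers.
RUNNERS = {"fast": 50, "slow": 50}   # steal opportunities by runner type
MAX_RISK_PCT = 45                    # assumed rule: run only if throw-out chance <= 45%

# Throw-out chance (percent) by runner type. The strong arm is better in both cases.
CATCHERS = {
    "strong_arm": {"fast": 30, "slow": 60},
    "weak_arm": {"fast": 15, "slow": 40},
}

def observed_cs(throw_out_pct):
    """Return (attempts, caught stealing %) after runners pick their spots."""
    attempts = 0
    caught = 0.0
    for runner_type, count in RUNNERS.items():
        pct = throw_out_pct[runner_type]
        if pct <= MAX_RISK_PCT:            # runner decides the steal is worth trying
            attempts += count
            caught += count * pct / 100    # expected number of runners thrown out
    cs_pct = 100 * caught / attempts if attempts else 0.0
    return attempts, cs_pct

results = {name: observed_cs(rates) for name, rates in CATCHERS.items()}
for name, (attempts, cs_pct) in results.items():
    print(f"{name}: {attempts} attempts, {cs_pct:.1f}% caught stealing")
```

In this made-up world the strong arm is twice as good against fast runners and much better against slow ones, yet the two catchers end up at 30.0% and 27.5%. The slow runners simply stopped testing the good arm, and the percentage can’t show you that.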
We can argue back and forth about the applicability of these statistics. However, as you can see, the actual argument that our anonymous poster wants to make is clouded under all of this information.
If the argument is “Molina was a better fielder,” I think he’s got a good point. We could have a separate discussion in which we talk about caught stealing percentages, DRS, Rfield, or whatever other statistic we want to play with. However, vomiting out statistics like this only makes readers feel confused and frustrated.
Now, the problem with simply saying that Molina was a better fielder (which I think he was) is that you’ve got to talk about how you balance hitting and fielding. That’s a much different discussion, one in which you can’t just throw statistics at the wall until something sticks.
But isn’t that a more interesting discussion than the normal alphabet soup of numbers and letters?