Eno Sarris is one of baseball’s leading analytics writers. I am not.
But I come from a sports journalism background and spend a great deal of time discussing the use of data in sports – both from a media standpoint and a team performance perspective – with those who put data in play all over the world.
Eno and I met to discuss his career, how analytics have changed baseball writing and where he sees the industry going as technology becomes more of a part of the game.
Here’s the conversation in its entirety:
Kevin Chroust: The phrase ‘baseball analytics’ appears right there in your email signature. How did you end up becoming a baseball writer, and what was that journey like for you toward focusing on some of the more advanced stuff derived from the game?
Eno Sarris: I’m actually an immigrant. My parents are German, and I came here to this country in 1986 and one of the things that I struggled with was meeting kids, meeting new people. One of the things I did was meet new people through baseball. I actually really had a connection through other kids. That was really important. I think for other people sometimes baseball (connects) kids, but for me, it was like a connection to the country and to the people and the culture – that was really important to me.
My stepdad also represented someone who was into baseball. Between him and our going to games together and then also baseball card trading, I started to really get into the evaluation of baseball players. It wasn’t just pretending to be Gary Sheffield with the waggle and playing wiffle ball and stuff. It was also: How can I best get the best baseball cards, how can I get the best baseball cards from my friends and give them the worst baseball cards? So from the very beginning, there was a little bit of analysis of baseball that was part of it too. I think I didn’t really expect that I would be a professional baseball player. So from the beginning, analysis was a little bit more important to me.
KC: Do you recall a particular point at which you really began to embrace baseball’s advanced analytics and made it a consistent part of your writing?
ES: I think the first time that analysis was congealed for me was this one baseball card trade I made. I’m talking I’m like eight years old and there’s a Barry Bonds rookie card that I really want. This other guy is a total Braves fan and we live in Atlanta, so I’m like, “Okay, what am I going to do here?” I gave him a Mark Lemke rookie card, a Jeff Blauser rookie card, and a Steve Avery, just a regular Steve Avery card. I was like, “Listen, man, I don’t really want to give these rookie cards up. Mark Lemke, I love the guy, but I know you don’t like Barry Bonds. He’s flashy and you hate him and I get it. Just give me that Barry Bonds rookie card. I’ll give you these three really nice cards for that one.”
I think that was maybe a moment when it clicked for me in terms of analysis being part of why I like baseball. That went into fantasy baseball, and I played fantasy baseball for a long time before I even wrote. One of the things I wanted to do in order to win my fantasy leagues was read all the analytics out there. In order to win all my fantasy leagues, I read Baseball Prospectus. I read FanGraphs, I read Rob Neyer, and even though I was reading about baseball, I was really trying to figure out how to just win my fantasy league. There’s a little thing that some people don’t know, which is that FanGraphs was actually created by David Appelman for him to win his fantasy leagues.
So I think there’s a legit driver under the scenes sometimes of what you read in baseball. The analysis you see, that’s just people trying to win their fantasy leagues.
KC: We occasionally hear all the data out there is ruining baseball. How do you react to that?
ES: I’m sympathetic to the idea that someone who doesn’t want to read about data feels like maybe there’s too much of it and I can see that it’s become more pervasive. There’s more and more people using it in their writing. For that person, I would say that there’s still great analysis, still great opinion writing. There are still great columnists that are out there doing good work and they can read those people.
For me, the way that I appreciate baseball, I think the numbers only add to it. (You could say) “Oh, Mike Trout, he’s an amazing player and he’s a great combination of all these things.” That’s great, but if I can put together a number and say he’s actually the best of all time through (age) 27, I feel like it’s more compelling. You could say, “Oh, last night in the game, someone made an amazing catch where they ran to the wall and it’s the best catch I’ve ever seen.” Well, that’s just your word for it. You’re just telling me, “Oh, that’s one of the best catches ever seen.”
What if you could say he ran faster than anybody has run on a catch this year and it had the lowest catch probability and he jumped the highest? We can do that now. We can tell you that he jumped the highest to get that ball. I find that more compelling. I think that just gives you a sense of context and a sense of where this belongs in history and I think that adds to storytelling. It doesn’t detract. It’s part of telling the whole story. I think numbers are part of telling the whole story right now.
KC: Sports journalism – and journalism in general – has for a while now been in a bit of a rediscovery period. How have you as an individual taken that on?
ES: I didn’t actually consider that I would be a sports journalist growing up. Therefore, I’m not sure that I can speak on all journalists and where journalism has been and where it’s going. But for me, journalism is finding questions and answering them. If you don’t use data, what are you doing? Then you’re only relying on word of mouth. That’s the nice thing about data. It gives you another voice. I think a compelling question is the main feature of any good piece. Why is this player so good? Is this player going to keep being good? Does this matter in this baseball game? Those questions, I think, are the key to a good story.
In order to answer those questions, I don’t actually think that data is the only thing. I think you have to talk to the players because the players have their own perspective on it and sometimes the data has to catch up to what the players see. I think you want to talk to management. I think you want to get this idea from 10,000 feet about what’s happening, but if you don’t ask the data, you’re just missing out on a more objective way of analyzing and answering the questions. Yes, I think that you still need characters and you still need to deep dive into what makes these people what they are.
KC: ‘Access’ in journalism has always referred to in-venue reporting, mostly for access to sources. Consistent access to data adds a layer to that. How do you weigh the importance of access?
ES: I think one thing that’s difficult for me is that I know that the teams that I talk to have access to data that I don’t have access to. That sometimes is a struggle for me because I know that they’re looking at the game slightly differently. They have more analysts on the subject and they have more data and different data than I have, so that sometimes is a part of the access that frustrates me.
KC: I think back to my pre-data query days in sports writing, and that writing process or level of detail seems so rudimentary now. It makes me appreciate how essential relevant data can be to the process. How would you describe the advantage of learning this stuff front and back the way you have?
ES: I got lucky. The way that I see baseball, the way that I digest baseball has become more popular over the years, I think. I was a numbers-first guy. I was a fantasy guy. Fantasy has gotten more popular, so the numbers have gotten more interesting. I think early on, the numbers were just back-of-the-baseball-card stuff that were maybe not as compelling. It was just RBI, runs – and how much more was out there? Now, we’ve opened the door with Statcast and with people being at the game and being able to convert what they see in the game into numbers as they do at Stats Perform. We can now answer more questions with it. We can now do so much more with the data and it’s so much more important to include that into your writing.
One of the first big advancements was Wins Above Replacement. That was an interesting thing because it allowed us to put together the disparate things that a player does on the field and put them into one number, but one of the things I do not like about that is it becomes a one-number thing.
I think that there’s not always a way to boil people down to one number, just as you can’t boil me down to just exactly how many clicks my last article got. I don’t want to be judged by one number either. I think that one-number stats have shown that they don’t always tell the whole story. That’s why we’ve gone away from WAR as much as an analytical community and started breaking it off into smaller questions with smaller answers and different stats that tell you something different. Now it’s a little bit more about: How fast is his sprint speed and how important is that? How fast is his jump on that catch and how important is that?
KC: Let’s discuss some of the more interesting stuff you’ve done more recently. What comes to mind for you, and what are some of the more meaningful metrics you’ve worked with?
ES: One of my favorite statistics I’ve worked with recently was one created by Stats Perform called command+. It answers a question that you would never have been able to answer with traditional metrics. It requires a wide variety of analysts watching the game. It requires an ability to code what’s happening in every game and turn basically a bunch of research behind the game to tell you, “Did that pitcher do exactly what he wanted to with that ball?” That’s an extremely difficult question to answer.
It’s something that mostly analysts have stepped away from because they can say, “I can’t be in the pitcher’s head.” I think one thing – kudos to Stats Perform – is that they took a question they thought nobody could answer and tried a different approach with it, and really tried to get into the pitcher’s head and tried to give them credit for shaping a curveball. It might be a ball, but it might be the shape that they wanted and in the general location they wanted – which I think is the true definition of command.
KC: You wrote a deep dive on the Marlins’ Zac Gallen. Also something on measuring the efficacy of a slider. Where do ideas like that come from?
ES: When I watch a game, I just have questions. I don’t know if it’s because of the way I came to this country and I had questions about this country. I was like: “What is this new place? What is this new game? This is an amazing new game. I love it.” So when I watch a game, often the announcer will say something like (this pitcher’s) changeup is good or it’s bad and I’ll say: “Why? What makes that changeup good? What makes this changeup bad? Can we put numbers on that? Can we observe some categories of what makes players good and bad?”
That’s, I think, what drives a lot of my writing. The way that I look at stats is trying to answer questions about the game. With Zac Gallen, I was just saying, he’s a player that’s coming over. He got traded for a top hitting prospect. He got traded away from a team that needed young players. Why did he get traded away, and what does Arizona see about this player that’s different than what maybe Miami sees about this player? I try to put together the different pieces and answer that question.
KC: Opinion and personality have been a part of baseball writing since the start. It may always have its place in sports writing, but how necessary is it for media to go a step further now and better support their work?
ES: I think that, generally, readers prefer to have a question answered in a way that they either recognize, either they’ve been in school and they’ve seen someone have to prove their answer in either a paper or with data in a similar situation, or in their workplace environment. If you give a presentation in a workplace environment and you don’t have data involved, what do you do? You’re not answering the question that you’re trying to answer. You’re not convincing anyone.
When I try to convince someone of something, I usually start with the data. If I’m going to write about how good or bad Manny Machado’s contract is and I just say that I think he’s generally a bad player and therefore the contract is bad, I think that leaves today’s readers short. They would say this is kind of opinion writing and it’s not as objective as we want. That’s one of the things that people want out of journalism that we’ve been chasing for a long time – objectivity. I think data really gives you an opportunity to be that objective journalist because you’re using data as an objective source in your pieces.
KC: What kind of impact would you say the proper use of data in a piece you write has on the level of engagement among readers? Would you say it also makes a difference with credibility among readers?
ES: It’s interesting. Sometimes I get yelled at for being a numbers writer. I think for the most part it allows them also to engage me sometimes. I get readers that will say, “Well, you’ve missed, you didn’t look at this number. You didn’t look at this number.” I find that to be a higher-level conversation than, “You just hate Manny Machado and he’s my favorite player, go away.” I think allowing us to have a conversation that’s more objective – that’s something that’s happening in baseball too.
You’ll see that today’s players are much more open to stats. Part of the reason is that they want to get better and they now can see that stats can make them get better. So now the conversation isn’t just, “Hey dude, you have a really bad slider, make it better.” Now the conversation is, “Hey, take that slider and can you change the spin axis on it? Can you put your fingers this way? Let’s read what the machine says. The machine is going to tell us if it’s good or bad and we don’t have to yell at each other.” There’s a little bit of that, I think, in sports writing and reading in general – the numbers allow you to give us something to argue about that’s more objective, but also not as subjective between us.
KC: We’ve talked off-camera about how we’re both, at least to an extent, into soccer. Soccer has traditionally been well behind American sports in terms of properly using data because it’s harder to meaningfully and objectively segment than a sport like baseball. That’s started to flip with machine learning and the ability to implement artificial intelligence to analyze massive data sets humans can’t structure on their own. It’s resulted in ‘expected metrics’ that are more predictive than anything that came before. Do you see AI having a similar impact on baseball in the future?
ES: Because baseball is such a segmented sport, you can stop it at any moment and say, “How many outs are there, where are the baserunners, are they on first, are they on second?” That allows for a lot of semi-pro research. That allows for people who just can download a data set to do some of their own research, and that’s actually fueled a lot of interest in baseball in terms of reading. Sabermetrics is based on this idea. SABR is the Society for American Baseball Research. These are semi-pro researchers that are just doing their research on their own, and a lot of those people have now stepped over into the private sphere and are working for baseball teams.
I think as the computing power goes past what you can do on your own, there’s going to be a shift. It might be a little bit more to what’s happening in soccer because there can’t be as many enthusiasts that are going to do the soccer analysis because they might not have the computing power. What you need is a provider that can step in for you and help you with that analysis and provide it to you and not just provide it to the front offices. We need these professional sources of data that are also available to readers. We need these professional sources of data that are not just behind and working for teams to step in for the semi-pro researchers that used to be there.
One of the things that’s very interesting right now in baseball is that we’re going away from using radar technology to using optical technology … to describe to you what’s happening with the limbs. Radar technology couldn’t tell the difference between a ball and a bat and an arm. Now we’re going to have optical technology; it can tell you what the arms are doing, what the legs are doing, what the body is doing. I think what we’re going to have to understand now is a little bit more biomechanics.
I’m trying to do this. Already trying to read about biomechanics, trying to learn about biomechanics because you’re going to learn about optimal uses of the body. Then we’re going to be able to say more definitively things about where an arm should be, where the bat should be in a certain moment in the swing, and we’re going to have more data that relates to that sort of thing in the public sphere and in the private sphere. I think we’re going to be talking more about how bodies move in space.