The Data Pitcher


It might seem that performance has only one measure: winning. But for a Major League Baseball team, performance is much more than that. Winning is just one part of it; marketing, advertising, and media revenues all contribute to a team’s idea of performance. And winning itself is not only about spectacular home runs but also about efficiency: converting more hits into runs, cutting strikeouts, and getting more hits.

Winning, and not the devil, is in the data (which goes far beyond the details). This is why the BIME team set out to uncover the secrets (secret no more) of success in Major League Baseball and, along the way, to reveal how the KC Royals made the playoffs. Our first hunch as fans was the play of our stars, Alex Gordon and Salvador Perez. But as data scientists, we discovered that there is much more to it than that: starting from their amazing batting averages, we dug into the analytics of the team’s performance and created… the ultimate MLB dashboard.


If you look at the geomap chart showing each team’s number of wins compared to the number of fans per game, you will notice that winning does not immediately translate into popularity. For example, KC (Kansas City) has a larger population within the city than STL (St. Louis), and the two teams have a similar number of wins, yet STL fans fill 99% of the stadium each game while KC fills only 64% of its smaller stadium (and yes, we geocoded the stadiums to their exact locations, so take a deep peek!). Beyond that, in what may be less a league rivalry than a marketing one, SF (San Francisco) and OAK (Oakland) have stadiums right across the bay from each other and exactly the same number of wins, yet SF fills a much higher percentage of its stadium each game!
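For readers who want to check a fill rate by hand, the percentage of a stadium filled is simply average attendance divided by seating capacity. Here is a minimal Python sketch of that arithmetic. The function name and the sample numbers are hypothetical, chosen only for illustration; they are not figures taken from the dashboard.

```python
def percent_filled(avg_attendance: float, capacity: int) -> float:
    """Return the share of a stadium filled per game, as a percentage."""
    if capacity <= 0:
        raise ValueError("capacity must be a positive number of seats")
    return 100.0 * avg_attendance / capacity


# Hypothetical example: 32,000 fans per game in a 50,000-seat stadium.
example_fill = percent_filled(32_000, 50_000)
print(f"{example_fill:.0f}% of the stadium filled")
```

Comparing this percentage rather than raw attendance is what lets a smaller stadium be judged fairly against a larger one.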

But these may well not be the only surprising discoveries. Have a look at the chart showing average attendance compared to the percentage of the stadium filled for each league and division. Click on the bars to dive into the data and see the attendance stats for each division. Why do divisions like the NL Central have the best average attendance and percentage filled? Answer these questions, and discover more, by interacting with the dashboard yourself.

Check out our ultimate MLB dashboard.


The odds were against Derek Jeter. It was his last season and his last chance to shine, and he delivered in the most beautiful fashion during his last at-bat in Yankee Stadium. If you look at the top-right chart depicting batting average and at-bats by team and click on NYY, the chart will show the team’s 10 best players and, surprisingly or not, Jeter is there with the highest number of at-bats, but not the highest batting average. Yet again, his legend sometimes goes beyond his stats. Click on any team to discover its top performers and see how a group of high-performing players can become a winning team.

In the same way, dive into the data to see how the hits-to-strikeouts ratio plays out from the division level down to the team and player levels. Will the fact that the AL Central had the most hits in the regular season count decisively in the postseason? Which AL Central team contributed the most to this dominant position?


One would think that the more hits a team gets, the more home runs it hits, but that is not the case (at least for this season!). The evidence is in the data: the division with the highest number of hits did not have the highest number of home runs. But do not stop at the division level. Discover trends as well as heroes who defy the status quo. If you click a specific division in the Top 10 Home Runs / Hits chart, you will see its top 10 home run hitters: click AL East and you will find Nelson Cruz, the man who had not only the most hits but also the most home runs in his division. Once again, data is for dashboards and players are for posters.

Some teams have this kind of star player. They seem to do less and still win more. How is that possible? Have a look at the bubble chart plotting wins against runs for each team. Teams like LAA (LA Angels), BAL (Baltimore), and WAS (Washington) all have similar numbers of wins but a pretty big difference in the number of runs they score. It shows that yes, wins trend upward with runs, but other factors (like pitching!) let teams win without scoring as many runs, because their defense keeps the opposing team from scoring as much.
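One way to put a number on the runs-versus-wins trend is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is only an illustration of that calculation in plain Python. We are not claiming this is how the dashboard computes anything, and the function name is our own; feed it per-team runs and wins exported from whatever data source you use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)
```

A value close to 1 would mean runs explain most of the variation in wins; a noticeably lower value leaves room for the other factors mentioned above, such as pitching and defense.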

It’s time for the data stream to stop for just a little while. We will be back tomorrow to analyze the stats of the amazing win that is going to happen today. Go KC Royals!

Check out our ultimate MLB dashboard.