I’m gonna do something different for this one. As I grow my Tableau skills, I’m going to start splitting my posts into two parts, with the actual viz in the middle. The top part will be the context, the meaning, and the thought behind why I created the viz. Then, after the viz, I’ll dive into some of the guts of Tableau and give you some technical nuggets about the viz.

So without further ado…

Basketball has the half-court buzzer beater, Football has the kickoff or punt return for a touchdown, NASCAR has the photo finish, while Soccer and Hockey have the shootout. Each of these events is the most exciting moment in its respective sport. And baseball, America’s pastime, has the home run.

A home run can be the energy jolt your team needs to make a losing situation an improbable win. And that’s what this dataviz is all about.

The Source

A fellow named Greg Rybarczyk started tracking home runs in 2005. What made him stand out was that he captured vital metrics for each home run, including true distance, max height, speed off bat, and elevation and horizontal angle. After a couple of years, ESPN took notice, and Greg ended up licensing his website to ESPN Stats & Info, which has maintained the site since 2010.

The Why

For as long as I can remember, I’ve been a baseball fan. A large amount of the credit has to go to my dad. You see, from the time I was in kindergarten and all throughout junior high school, he would get me out of school and take me to the Royals Home Opener. And it would just be the two of us. We’d always go to lunch beforehand, and I’m proud to say that I’ve been eating at Arthur Bryant’s on Brooklyn Ave. since I was a very young lad!

You see, I live in Kansas City and I am a Royals fan. That means that these last couple of years have been the most exciting years I can remember. When I was much younger, I used to go to games in the summer with my friend DJ, and we would get there early to get players’ autographs and be in the outfield during BP to shag home runs and/or foul balls.

Baseball is, arguably, the best sport there is. Sure, there’s controversy, but what sport hasn’t had that? Something that can’t be denied is the special bond between being a kid and watching and/or playing baseball. And the most spectacular moment in baseball is the home run.

When I found this data set, I knew instantly I had to do something with it. And here’s what I came up with. Enjoy!

HomeRun

The Tableau Guts

If you’re reading this, prepare for some technical Tableau aspects of how this viz was created. No harm, no foul if you stop reading now, but if you want to learn some Tableau logic, please, by all means, read on. I’m going to break this down by each story point.

Shaping the Data

Before getting started in Tableau, I knew I was going to want some additional fields:
  • League
  • Division
  • Home/Away
Once I had those three additional fields, I was able to create some IF statements to determine the game type:
  • Intraleague
  • Divisional Rival
  • Interleague
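For readers who think in code, the game type logic can be sketched roughly like this. It is a Python approximation, not the actual Tableau calculated field, and the parameter names (team_league, opp_division, and so on) are hypothetical stand-ins for the League and Division fields of the hitter's team and the opponent.

```python
def game_type(team_league, team_division, opp_league, opp_division):
    """Classify a game from the two teams' league and division.

    A sketch only: the argument names are assumptions, not the
    field names used in the original workbook.
    """
    if team_league != opp_league:
        # Different leagues (AL vs NL) means interleague play.
        return "Interleague"
    if team_division == opp_division:
        # Same league and same division means a divisional rival.
        return "Divisional Rival"
    # Same league, different division.
    return "Intraleague"


print(game_type("AL", "Central", "AL", "Central"))
print(game_type("AL", "Central", "AL", "East"))
print(game_type("AL", "Central", "NL", "Central"))
```

Checking the league first matters, because a division name like "Central" exists in both leagues.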
Because I had the date of each home run, I was able to break down the season by regular season vs. postseason. In fact, even though I didn’t end up using them, there are fields that exist for each round of the postseason for each league:
  • Wild Card
  • Division Series
  • Championship Series
  • World Series

The Ballpark Map

So, I felt like this was an obvious choice to start things off. What ballpark has had the most home runs hit? The first thing I needed to do was find the latitude and longitude of each MLB stadium. Luckily, thanks to Google, that dataset already existed.
So I downloaded it and joined it to my HR dataset on the Team field. Then I added those lat. & long. measure fields to the map view with Team on the Detail shelf and League on Color.
For the filters, I put each dimension on Rows, then I put the count of HRs measure on Size and set the chart type to bar. You will see this throughout the viz.
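Outside of Tableau, that join on Team could be sketched as below. This is a minimal illustration in plain Python: the field names other than Team are assumed, and the team names and coordinates are made-up placeholders rather than real stadium data.

```python
def join_on_team(home_runs, ballparks):
    """Inner-join home run rows to ballpark rows on the Team field.

    Field names "Latitude" and "Longitude" are assumptions.
    """
    parks_by_team = {park["Team"]: park for park in ballparks}
    joined = []
    for hr in home_runs:
        park = parks_by_team.get(hr["Team"])
        if park is None:
            continue  # inner join: drop home runs with no matching ballpark
        joined.append({**hr,
                       "Latitude": park["Latitude"],
                       "Longitude": park["Longitude"]})
    return joined


# Placeholder rows, for illustration only.
ballparks = [
    {"Team": "Team A", "Latitude": 10.0, "Longitude": 20.0},
    {"Team": "Team B", "Latitude": 30.0, "Longitude": 40.0},
]
home_runs = [
    {"Team": "Team A", "Hitter": "Player 1"},
    {"Team": "Team B", "Hitter": "Player 2"},
    {"Team": "Team A", "Hitter": "Player 3"},
]

joined = join_on_team(home_runs, ballparks)
print(len(joined))
```

Each home run row picks up the coordinates of its ballpark, which is what lets the map place a mark per team.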

Team Rank over the Years

The wonderful thing about Tableau and Tableau Public is the community. Without the community of fellow vizzers and bloggers, the learning curve would be significantly longer. Personally, I have learned so much by reading others’ blogs and downloading their visualizations to reverse engineer their concepts for my own ideas.
For this view, I got the idea from Matt Chambers’ viz that looks at the 2014 AP College Football Top 25. I downloaded his workbook, saw how he did it, and made mine look similar to his with a few changes. Namely, instead of boxes for each team, I downloaded each team’s logo and used them as custom shapes. I did this because I knew that more than one team could have the same rank for any year.
To add context, I mirrored that view but only showed the World Series champs for each season.
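The point about ties is easier to see with a small example. Here is a rough Python sketch of ranking teams by home run count within one season, where tied teams share a rank. It assumes "competition" style ranking (1, 1, 3); the post does not say which tie rule the workbook actually uses, and the team names and counts are placeholders.

```python
def rank_teams(hr_counts):
    """Rank teams by home runs, highest first; tied teams share a rank.

    Uses competition ranking (1, 1, 3), which is an assumption here.
    """
    ordered = sorted(hr_counts.items(), key=lambda item: -item[1])
    ranks = {}
    previous_count = None
    current_rank = 0
    for position, (team, count) in enumerate(ordered, start=1):
        if count != previous_count:
            # A new, lower count starts a new rank at this position.
            current_rank = position
            previous_count = count
        ranks[team] = current_rank
    return ranks


# Placeholder counts for one season, for illustration only.
season = {"Team A": 200, "Team B": 200, "Team C": 150}
print(rank_teams(season))
```

With boxes, two teams at the same rank would sit on top of each other and be hard to tell apart, which is where distinct logos help.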

The Scatter Plot

This was the most challenging dashboard of the four and also the most detailed. I wanted to find a way to show off and utilize each of the measures that ESPN captures for each HR:
  • True Distance
  • Apex Height
  • Elevation Angle
  • Speed off Bat (Exit Velocity)
  • Horizontal Angle (this is used to determine field position)
In data viz, a cardinal rule regarding measures is that if you want to compare two measures against each other, you should immediately think “scatter plot.”
When I was exploring the data, my initial question was how does height correlate with distance? Common sense might say that “the higher it goes the farther it goes.” But then physics might try and say “common sense is correct but there comes a point where you can either go high or go far but you can’t do both.” That led me to explore distance & elevation angle or elevation angle & exit velocity.
During my exploration, I was creating 4+ scatter plots, one for each new question. Then a light bulb went off: “PARAMETERS!” Surely, I thought, there’d be a way I could set up some parameters to let the user select their own scatter plot.
A quick Google search led me to Andy Kriebel’s AMAZING resource blog, where he did just that.
Next came the tooltip. This was the culmination of the relevant fields of the data in a cohesive message. And because the dataset provided a video link for each home run, I made a URL dashboard action.
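The general idea behind a parameter-driven scatter plot can be sketched in Python as follows. This assumes the common pattern of one parameter per axis plus a calculated field that returns whichever measure the parameter names; the lowercase field names and the sample values are made up for illustration.

```python
# Parameter choices mapped to hypothetical field names in each home run row.
MEASURES = {
    "True Distance": "true_distance",
    "Apex Height": "apex_height",
    "Elevation Angle": "elevation_angle",
    "Speed off Bat": "speed_off_bat",
    "Horizontal Angle": "horizontal_angle",
}


def pick_measure(row, choice):
    """Return the measure the user selected, like a CASE on a parameter."""
    return row[MEASURES[choice]]


def scatter_points(rows, x_choice, y_choice):
    """Build (x, y) pairs for whichever two measures the user picked."""
    return [(pick_measure(row, x_choice), pick_measure(row, y_choice))
            for row in rows]


# Made-up values, for illustration only.
sample = [
    {"true_distance": 400.0, "apex_height": 90.0, "elevation_angle": 28.0,
     "speed_off_bat": 105.0, "horizontal_angle": 95.0},
    {"true_distance": 380.0, "apex_height": 110.0, "elevation_angle": 35.0,
     "speed_off_bat": 101.0, "horizontal_angle": 70.0},
]

points = scatter_points(sample, "Elevation Angle", "True Distance")
print(points)
```

One view then answers every "measure vs. measure" question, instead of needing a separate scatter plot for each pair.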

The Holidays

Everyone always says that baseball season “is a grind.” It sure is. It’s a full 6 months and change, and there are several holidays sprinkled in. So I wanted to show how many home runs were hit on those holidays. Then I wanted to show which players are most successful at hitting home runs on holidays.
I googled the date of each holiday over the past 10 years and created an IF/OR calculated field for each.
Then I thought, “Who’s hit the most home runs on my birthday? It’s in the summer.” I wanted to make the dashboard as interactive and personal for the end user as possible, so I created two parameters and used the values to create custom dates in a calculated field.
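As a rough Python analogue of those holiday fields, the sketch below flags a date as a holiday. Two things are assumptions: the post does not list which holidays were used, so the three here are just examples that fall during the season, and the floating dates are computed from their rules rather than typed in by hand as in the IF/OR approach.

```python
from datetime import date, timedelta


def last_monday_of_may(year):
    """Memorial Day: walk back from May 31 to the nearest Monday."""
    d = date(year, 5, 31)
    while d.weekday() != 0:  # Monday is 0
        d -= timedelta(days=1)
    return d


def first_monday_of_september(year):
    """Labor Day: walk forward from September 1 to the nearest Monday."""
    d = date(year, 9, 1)
    while d.weekday() != 0:
        d += timedelta(days=1)
    return d


def holiday_name(d):
    """Return the holiday a date falls on, or None if it is not one."""
    if d.month == 7 and d.day == 4:
        return "Independence Day"
    if d == last_monday_of_may(d.year):
        return "Memorial Day"
    if d == first_monday_of_september(d.year):
        return "Labor Day"
    return None


print(holiday_name(date(2015, 7, 4)))
```

Computing the dates avoids maintaining a list per year, at the cost of being less obvious than a hand-entered list of dates.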
Then, to calculate the number of home runs, I ran an IF statement. (Yes, my birth date is the default date on the dashboard.)
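In Python terms, the birthday lookup might look something like the sketch below. It assumes the two parameters are a month and a day, which the post does not spell out, and the field names, players, and dates are placeholders.

```python
from collections import Counter
from datetime import date


def home_runs_on_day(home_runs, month, day):
    """Count home runs per player on a given month and day, across all years.

    month and day stand in for the two dashboard parameters (an assumption).
    """
    counts = Counter()
    for hr in home_runs:
        if hr["date"].month == month and hr["date"].day == day:
            counts[hr["player"]] += 1
    return counts


# Placeholder rows, for illustration only.
sample = [
    {"player": "Player A", "date": date(2014, 7, 15)},
    {"player": "Player A", "date": date(2015, 7, 15)},
    {"player": "Player B", "date": date(2015, 7, 15)},
    {"player": "Player B", "date": date(2015, 8, 1)},
]

counts = home_runs_on_day(sample, 7, 15)
print(counts.most_common(1))
```

Matching on month and day only, and ignoring the year, is what lets one birthday pick up home runs from every season in the data.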
So there you have it. I hope you enjoyed reading this post as well as playing with the viz. I certainly had a blast making it.
Until next time!