Where did I put that tool...

by Billy Caughey

Introduction

A previous blog post discussed the analytics tool belt: which tools are common, and why you should have them. What I have come to realize is that even if a tool is commonly used, there are times I still don't have it in my tool belt. I will be the first to admit I have had to go looking for a tool in the hopes of adding it to my tool belt.

In those cases where you don't have a common tool, what are you supposed to do? Where are you supposed to look to find it? When do you know you've added the tool to your tool belt? Let's explore these questions a little more together.

Where can you pick up the tool?

I wish there were a magic answer I could give you here, but there isn't. There are too many places to pick up a tool to list them all. This shouldn't give you any stress. In fact, it should excite you. There are plenty of analytics "tool stores" for you to shop at! Here are the two I typically use.

Books

While in grad school, a few classmates and I were struggling with a problem. We had searched the internet for several hours trying to find anything that could help us produce a solution. Around the three-hour mark of our search, my adviser came into the student lounge and asked what we were doing there. After we explained, he laughed and said, "There are these ancient things called books. They are found in the library. You know, the building right over there. Maybe you should go look at a few."

I can't overstate the value of a good book. A good book will present key learning points, support them with examples, and show how you can apply them. When I was trying to refine my analytics and R programming skills, I ran into the book "An Introduction to Statistical Learning" by James et al. This book helped me refine my R skills by reminding me of the math behind each tool and by giving me the R code to run the basics, labs that took me a bit deeper than the chapter examples, and problems that I could work through.

When you run into a situation where you need to pick up a tool, make sure to pick up a book! It can be a source for you to reference throughout your career. There are several books, like James et al.'s book, that I reference all the time.

Online Learning

For some, picking up a book is nice, but ultimately not the way they learn best. This is completely fine and normal. Some people don't learn new things unless they are in a classroom setting. Enter MOOCs: Massive Open Online Courses. You may not know what the acronym MOOC means, but I'm willing to bet you've heard of DataCamp, Coursera, Udemy, edX, and Udacity. These are all MOOC platforms. They have teamed with universities and domain experts to offer working professionals a place to pick up new skills and sharpen existing ones. If there is a skill you are trying to pick up or sharpen, I'm willing to bet there is a MOOC that covers it.

Before I move on, I want to beat a question to the punch: which platform is better? If you do some searching, you will find opinions, reviews, and lots of commentary on each one. What does this mean? It means find the one that works best for you! I have taken courses on a lot of MOOC platforms and been happy with all of them. Do I have my favorites? Sure. Am I going to share them right now? Nope! I want you to go explore which one is best for you.

When do you know the tool has been added to your belt?

My bar for knowing when a tool has been added to your belt is that you have developed muscle memory for using it. I'll illustrate this with an example in R using the faithful data set. This data set contains information on Old Faithful in Yellowstone National Park. There are two fields in the faithful data set: eruptions and waiting. The eruptions field contains the number of minutes an eruption lasted, and waiting is the number of minutes until the next eruption.
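If you want to get a feel for the data before plotting it, a quick look can be sketched like this. It assumes nothing beyond a standard R installation, since faithful ships with the built-in datasets package; the particular inspection functions chosen here are just one reasonable option.

```r
# faithful is part of base R's datasets package, so nothing needs installing.
data(faithful)

# Structure of the data frame: two numeric columns, eruptions and waiting.
str(faithful)

# First few rows and a basic numeric summary of each column.
head(faithful)
summary(faithful)
```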

When I was first learning R, I used what are called base plots. These are built-in plots that do not require any additional library. So, when I wanted to plot the data in the faithful data set, I'd use the base plot option and get something like this.

Plot 1: Base Plot of 'faithful' data set in R
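For readers following along, a base plot of this kind takes a single call. This is a minimal sketch and not necessarily the exact code behind Plot 1: which field goes on which axis, the title, and the axis labels are all assumptions made for illustration.

```r
# Base R scatter plot of the faithful data set; no extra libraries required.
# Assumption: eruption duration on the x-axis, waiting time on the y-axis.
plot(faithful$eruptions, faithful$waiting,
     main = "Old Faithful eruptions",
     xlab = "Eruption duration (minutes)",
     ylab = "Waiting time (minutes)")
```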



This plot looks nice, and when I need to look at something quickly, I use the base plots. When I am turning something over to a business leader, though, there are better visualization libraries to use. One of them is Hadley Wickham's "ggplot2". This library gives the user more visualization options, with more flexibility and creativity. Plus, it works with all sorts of other libraries that the base package struggles to work with. So, creating the same plot, but using "ggplot2", I get something like this.

Plot 2: ggplot plot of 'faithful' data set in R
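A comparable plot in "ggplot2" might look like the sketch below. Again, this is an illustration rather than the exact code behind Plot 2: the point colour, transparency, theme, title, and labels are assumptions, and it presumes the ggplot2 package is already installed.

```r
# Assumes ggplot2 is installed, e.g. via install.packages("ggplot2").
library(ggplot2)

# Same data and axes as the base plot sketch, built up in layers.
# The styling choices (colour, alpha, theme) are illustrative assumptions.
p <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point(colour = "steelblue", alpha = 0.7) +
  labs(title = "Old Faithful eruptions",
       x = "Eruption duration (minutes)",
       y = "Waiting time (minutes)") +
  theme_minimal()

# Printing the object draws the plot.
print(p)
```

Storing the plot in an object such as p makes the "few additional tweaks" easy: each tweak is one more layer added with +.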

With a few additional tweaks, I have a plot that I would be happy to turn over to a business leader. In both cases, the code came easily to me. It felt natural to produce these plots, with minimal friction from trying to remember how. So, by the definition I gave, I would say I have these tools in my tool belt.

I want to make something clear really quickly - I didn't just pick up the library "ggplot2" one day and know everything about it. It took time, effort, and continual practice. I had to download the ggplot2 cheat sheet and refer to it a lot. I took MOOCs and read books on visuals in R.

One final piece of advice...

Be bold enough to share your tool belt and the practice you are doing to improve it. If you learn how to do something cool, no matter the language or platform, share it! Put yourself out there to show the world what you are learning. This creates an environment for practicing and presenting what you are learning. There is even the potential that someone more sage than you will provide constructive criticism. Welcome that stuff with open arms! That sage person was just like you once. They are trying to pass on what they know to help you learn and develop.

Conclusion

Go learn and share what you learn about a new tool! As Ms. Frizzle from "The Magic School Bus" would say: Take chances, make mistakes, and get messy!


PS - for those wanting to see the R code I used for these plots, you can find it on our analytics fun GitHub.
