A Modeling Process
by Billy Caughey
Introduction
Have you ever been called a data wizard or witch doctor? I have! Let me give you an example of this. I used to work with a domain expert who had a jovial, joking side. On one occasion he asked for a very sophisticated data analysis. He jokingly said, "Billy, don't you just have a Stata button you can push and produce the results?" We both laughed because I knew he was joking; however, that moment has stayed with me. This domain expert had some knowledge of statistics and knew this wasn't how it worked. Unfortunately, I would argue that for every one domain expert or business leader who knows how difficult analytics can be, there are at least 10 who do not.
So, what in the name of wizardry is the secret? It's really quite simple. It's knowing a modeling process that can structure your efforts. A modeling process can help put everything into perspective, define timelines, and help understand how to present results back to leadership.
A modeling process
Now that I have destroyed the magic, let's decompose the trick. I'm going to be using the modeling process as presented by Albright, Winston, and Zappe in their text, "Data Analysis and Decision Making". Now, I'm using their 4th edition book, but the 7th edition is the most recent release. Albright et al. present a seven-step modeling process. I will give a short explanation of each step.
1. Define the problem
Makes sense, right? Chances are you aren't going to chase down answers unless there is a stated problem. Before you get excited, there are two key parts to this step: a business problem and an analytics question. When a business leader comes to you, they are concerned about a problem. It becomes your responsibility to understand the problem.
Once you understand the problem, you translate it into an analytics question. There are a lot of little details you have to manage in this translation. Now, with an analytics question in hand, you can move on to the next step.
2. Collect and summarize data
Get ready to dig in because this is the most tedious part. You have a question you can solve for - great. Now, you have to collect the data needed to solve the problem. This starts the intense process of pulling and cleaning data. That process might look like accessing a few tables in a database. It also might look like surveying a customer focus group. Any way you need to do it, you have to collect the data.
With data collected, the hard part is over, right? Nope! You have to summarize the data. This means you are breaking out summary statistics and visualizations. Understanding what you have in the data is important, and summaries are the most efficient way to do it. I will warn you at this point that summarizing data is as much an art as it is a science. This means using appropriate methods that your audience will understand.
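To make the idea of summary statistics concrete, here is a minimal sketch using only Python's standard library. The `summarize` helper and the `weekly_orders` numbers are made up for illustration; they are assumptions, not data from any real project.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        # The sample standard deviation needs at least two values.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "min": ordered[0],
        "max": ordered[-1],
    }


# Hypothetical weekly order counts, invented for this example.
weekly_orders = [120, 135, 128, 140, 131, 400, 126]
summary = summarize(weekly_orders)

for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up data, one unusual week (400 orders) pulls the mean well above the median. That is the kind of thing a summary should surface before any modeling starts, and it is also a reminder to pick the statistic your audience will read correctly.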
The data is collected and summarized - score! Now, let's start analyzing!
3. Develop a model
You've made it this far and now you have to figure out what it all means. You can pair your analytics question and data summaries with an appropriate model. I use "appropriate model" very loosely; it does not always point to a sophisticated statistical model. This is why your pairing has to be a thoughtful one.
This means your model might be something like a dashboard with a scheduled data refresh. It also means your model could be something more advanced that presents significant projections moving forward. Whatever the best model is, make sure it is the model your business leaders will respond best to.
4. Verify the model
What does this mean? It means you are going to use your model to test an assumption and see if the result jibes with what you expect. For instance, if your model is a dashboard, you will need to check your results against what the company believes is true. That check might be against a previous version of the dashboard or against business leaders' intuition. For the more advanced models, verifying the model means plugging in a value and seeing if the result makes sense.
In a later blog, we will discuss what comes next if the results don't make sense.
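The "plug in a value and see if it makes sense" check can be sketched in a few lines of Python. Everything here is hypothetical: the linear `predict_sales` model, its coefficients, and the expected range are assumptions chosen only to show the shape of the check, not a method from Albright et al.

```python
def predict_sales(ad_spend, intercept=50.0, slope=2.5):
    """Hypothetical fitted linear model: sales as a function of ad spend."""
    return intercept + slope * ad_spend


def sanity_check(prediction, low, high):
    """Return True when a prediction falls inside the range the business expects."""
    return low <= prediction <= high


# Plug in a value the business already has a feel for.
test_spend = 100.0
prediction = predict_sales(test_spend)

# Assumed range, e.g. taken from leaders' intuition or an older report.
expected_range = (250.0, 350.0)
looks_reasonable = sanity_check(prediction, *expected_range)

print(f"Predicted sales at spend {test_spend}: {prediction}")
print(f"Within expected range: {looks_reasonable}")
```

The expected range stands in for whatever you are verifying against, whether that is a previous dashboard or a business leader's gut feel. A prediction outside that range does not prove the model is wrong, but it does tell you to look closer before moving on.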
5. Select one or more suitable decisions
You haven't done all this work as an academic exercise in data analytics. You are here to make a decision. Based on your results from the model, answer the question! Maybe you have several answers based on various situations. In the end, make sure you are trying to answer the question.
6. Present the results to the organization
Wait... you didn't think you were going to put in all that work to not present on it? At this point, you know more about the problem, methods, and answers than anyone else. You have become the expert! Everyone in the company is looking to you to provide the answers they are looking for.
There is one drawback. Remember how I said business leaders don't always know data analytics? Yeah, this is where it can bite you in the butt. You have to translate your analytic answers into business solutions for business leaders to follow. As a colleague of mine once said, "no one cares about your p-value." With that said, they will care about your solutions!
7. Implement the model and update it over time
This is where you see the actual fruits of your work. The model is implemented and the solutions you discovered are applied. I have had several methods put into practice over the years and there really aren't many things more satisfying than that! It's essentially like getting brownie points. It's very nice.
Unfortunately, this doesn't mean the work is over. Models are like food on the shelf. You have to check whether they have expired. Pulling the model and testing it is critical to understanding whether the model still holds. I have heard of models which have shelf lives of years. I have also heard of models that have shelf lives of weeks, or even shorter - days! No matter what the shelf life, you have to make sure your model still holds. If it doesn't, it's time to build another one!
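One simple way to check a model's shelf life is to compare its error on recent data with the error it had when it was first deployed. The sketch below assumes a mean absolute error metric and a tolerance factor of 1.5; both are illustrative choices, and all the numbers are made up.

```python
def mean_absolute_error(actuals, predictions):
    """Average absolute difference between actual and predicted values."""
    errors = [abs(a - p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


def is_expired(baseline_error, recent_error, tolerance=1.5):
    """Flag the model when recent error exceeds the baseline by the tolerance factor."""
    return recent_error > baseline_error * tolerance


# Hypothetical error measured when the model was first implemented.
baseline_error = 4.0

# Hypothetical recent outcomes and what the model predicted for them.
recent_actuals = [100, 110, 95, 130]
recent_predictions = [98, 104, 101, 118]

recent_error = mean_absolute_error(recent_actuals, recent_predictions)
expired = is_expired(baseline_error, recent_error)

print(f"Recent error: {recent_error}, expired: {expired}")
```

How often you run a check like this depends on the shelf life you expect. A model that goes stale in days needs a much tighter schedule than one that holds for years.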
That was a lot...
If you are feeling like I was short on the details, you are right. I could spend a whole series on this process and share some thoughts and insights. In fact, that is what I am going to do. Over the next 8 blogs, I am going to dive into each step and break it down further.
This process can be a long magic trick, but with practice you will get it down. Once you get it down, I hope you get called a data wizard! It's quite the distinction to be called a data wizard.