Ensemble modeling is not new. Random forest modeling has long been a favorite methodology of Quaero consultants as the prediction method of choice. But as with anything, there is always room for improvement and ensemble modeling offers advantages over any single one model. Ensemble modeling is basically combining multiple modeling methodologies together. In fact, random forest models by definition are an ensemble of decision trees.
Companies are just now starting to realize the tremendous benefit of predictive analytics. They’re beginning to look into modeling and, more importantly, starting to figure out how best to model. Utilizing an ensemble type of methodology is as cutting-edge as it gets currently. While others are just beginning to use analytics, in this way Quaero is ahead of the game. By supercharging our predictions, we are setting a benchmark.
Developing unique modeling techniques is what the industry is moving towards. The basic idea is that if one model works well, several models will work even better. Let’s look at a fun example from our Quaero holiday party. We had a “guess how many bows” contest. A number of bows were placed in a large glass jar and employees guessed how many were in there. After everyone submitted their answers, we took the average of all the guesses. The correct answer was 47. Guesses ranged anywhere from 25 all the way up to 100. The average of everyone’s guesses was 48.
This is a very basic, but effective example of how an ensemble model is better than any one single model, or prediction in this case.
Predictive modeling can be used in a wide variety of scenarios and situations. At Quaero, one way we utilize predictive models is for targeted banner advertising online. Taking advantage of a model, we can more accurately serve ads to the intended audience while also maximizing the likelihood of a response, whether that be a click or another action. As shown in the Model Lift Charts plot, the random line (in blue) represents a pure guess with no outside information. We can expect to correctly guess 30% of successes given 30% of the population. Where the models provide their benefit is in the additional lift we can see across the targeted deciles. In the top 3 deciles of the ensemble model, we can expect to capture 81% of all successes compared to only 30% for a pure guess. This indicates a lift of 51%.
A real, applicable example is a marketing campaign. There will be those individuals who will buy your product regardless of whether or not you market it to them and there will be those who will never buy your product. Does it make sense to spend money on these two groups?
If you send out a campaign to people at random, you can expect to get a certain return on your investment. An uplift model helps you target the individuals who are most likely to be persuaded to buy your product. So instead of targeting 100%, you can target 30% and get a 60% return on investment.
The chart above indicates just a few examples of the models we have built. Stay tuned for the second part of this blog for more details about how we supercharge our models.
You can find the second part of this blog post here.View all Blog Posts