What’s interesting about ensemble modeling is that it combines several different models. Once built, each model pulls out different intricacies from the data. Some are better at non-linear trends, while others perform well with linear trends. Let’s use football as an example. If you look at pro football web pages, that’s highly indicative of you signing up for fantasy football. Age, on the other hand, is not a good linear indicator. Those on the far ends of the curve, either very young or very old, are much less likely to sign up than the age groups in between.
By using different types of algorithms, we pull out different types of complexity. By combining them, we get a more robust prediction that is usually more accurate than any single model.
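As a rough illustration of the combining step, here is a minimal Python sketch that averages the sign-up probabilities from two toy models. Everything in it is an assumption made for the example: the two model functions, their coefficients, the age cut-offs, and the customer records are invented, and simple unweighted averaging is only one of several ways to combine models.

```python
import math

def page_view_model(customer):
    """Hypothetical linear-style model: more football page views, higher score."""
    z = -2.0 + 0.4 * customer["football_page_views"]  # made-up coefficients
    return 1.0 / (1.0 + math.exp(-z))

def age_model(customer):
    """Hypothetical non-linear model: the far ends of the age curve score low."""
    age = customer["age"]
    if age < 18 or age > 65:  # made-up cut-offs
        return 0.05
    return 0.30

def ensemble_probability(customer, models):
    """Average the probabilities from every model (unweighted)."""
    scores = [model(customer) for model in models]
    return sum(scores) / len(scores)

models = [page_view_model, age_model]
fan = {"football_page_views": 10, "age": 30}
non_fan = {"football_page_views": 0, "age": 80}

print(ensemble_probability(fan, models))
print(ensemble_probability(non_fan, models))
```

Each model captures a different kind of pattern, and the average reflects both. In practice the models would be fitted to data rather than written by hand.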
Any kind of model can go into an ensemble model, but we will take a closer look at a few:
Logistic Regression – Logistic regression models the relationship between a dependent variable and one or more independent variables. In the case of binomial logistic regression, the dependent variable can have only two outcomes (a customer either signs up or doesn’t sign up for fantasy football). Logistic regression is used to estimate the probability of an outcome. A possible drawback is overlap between variables (multicollinearity), where two explanatory variables give much the same information. For example, the number of one-week football page views provides much the same information as the number of two-week football page views. Dealing with overlapping variables and satisfying the assumptions that keep the methodology valid takes time and effort if you want a quality model.
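To make this concrete, the sketch below shows how a logistic regression turns a weighted sum of inputs into a probability, plus a quick correlation check of the kind you might use to spot two variables that carry the same information. The coefficient, intercept, and page view counts are invented for illustration and are not fitted to any real data.

```python
import math

def signup_probability(features, coefficients, intercept):
    """Logistic regression score: sigmoid of a weighted sum of the inputs."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def pearson_correlation(xs, ys):
    """Pearson correlation, used here as a quick check for overlapping variables."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up coefficient and intercept, not fitted to any real data.
coefficients = {"one_week_page_views": 0.5}
p = signup_probability({"one_week_page_views": 6}, coefficients, intercept=-3.0)

# Made-up page view counts for five customers.
one_week = [0, 2, 3, 5, 8]
two_week = [1, 4, 5, 11, 15]
r = pearson_correlation(one_week, two_week)

print(p)  # probability of signing up for this customer
print(r)  # close to 1, so the two variables largely overlap
```

When two inputs are this strongly correlated, a common choice is to keep only one of them in the model.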
Random Forest – A random forest is a combination of many decision trees. The process works by taking a random sample of the data and building a decision tree on it. Then a new random sample is taken and another decision tree is built. This continues for as many iterations as specified. In the standard algorithm, each sample is a bootstrap sample (drawn with replacement, usually as large as the original data set), and each tree considers only a random subset of the variables at each split. The results of the trees are then averaged, or put to a vote, to get the final prediction. In the standard algorithm every tree is kept; none are dropped for being less accurate. This is a popular methodology among Quaero consultants due to its few assumptions, ease of use, and quick build time compared to other modeling techniques. It is also one of the more popular algorithms in online Kaggle competitions, along with various other machine learning techniques.
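The sample-build-average loop can be sketched in a few lines of Python. This is a simplified illustration, not a production random forest: each "tree" is a single split on one randomly chosen yes/no variable, and the eight customer rows, the variable names, and the choice of 25 trees are all made up for the example.

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(rows, feature):
    """One-split 'tree': the sign-up rate on each side of a yes/no feature."""
    rates = {}
    for value in (0, 1):
        labels = [label for features, label in rows if features[feature] == value]
        # Fall back to the overall rate if a branch is empty in this sample.
        pool = labels or [label for _, label in rows]
        rates[value] = sum(pool) / len(pool)
    return feature, rates

def fit_forest(rows, n_trees, seed=0):
    """Build each tree on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    feature_names = sorted(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        feature = rng.choice(feature_names)
        forest.append(fit_stump(sample, feature))
    return forest

def predict(forest, features):
    """Average the predictions of every tree in the forest."""
    votes = [rates[features[feature]] for feature, rates in forest]
    return sum(votes) / len(votes)

# Made-up customers: (yes/no variables, signed up for fantasy football?)
rows = [
    ({"opened_email": 1, "viewed_football_page": 1}, 1),
    ({"opened_email": 1, "viewed_football_page": 1}, 1),
    ({"opened_email": 0, "viewed_football_page": 1}, 1),
    ({"opened_email": 1, "viewed_football_page": 0}, 0),
    ({"opened_email": 0, "viewed_football_page": 0}, 0),
    ({"opened_email": 0, "viewed_football_page": 0}, 0),
    ({"opened_email": 1, "viewed_football_page": 1}, 1),
    ({"opened_email": 0, "viewed_football_page": 0}, 0),
]

forest = fit_forest(rows, n_trees=25)
fan = {"opened_email": 1, "viewed_football_page": 1}
non_fan = {"opened_email": 0, "viewed_football_page": 0}
print(predict(forest, fan), predict(forest, non_fan))
```

Because every tree sees a slightly different sample, their individual mistakes tend to differ, and averaging smooths them out.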
A decision tree is a series of if-then splits that can be drawn as a tree diagram. For example, if you start with 1,000 customers, you can split them into those who opened an email and those who didn’t, and see which group has the higher or lower proportion of successes. After a split occurs, as you can see in the visual, the variables are tested again to see which one provides the best next split on each subset.
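Here is one way the "which variable gives the best split" test could look in Python. It uses Gini impurity, a common splitting criterion, though the post does not say which criterion is used. The eight customers and the two variable names are invented for the example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes (0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def split_impurity(rows, feature):
    """Weighted Gini impurity of the two groups produced by a yes/no feature."""
    total = len(rows)
    impurity = 0.0
    for value in (0, 1):
        labels = [label for features, label in rows if features[feature] == value]
        impurity += len(labels) / total * gini(labels)
    return impurity

def best_split(rows):
    """Pick the feature whose split leaves the purest subsets."""
    features = rows[0][0].keys()
    return min(features, key=lambda f: split_impurity(rows, f))

# Made-up customers: (yes/no variables, success?)
customers = [
    ({"opened_email": 1, "visited_on_weekend": 1}, 1),
    ({"opened_email": 1, "visited_on_weekend": 0}, 1),
    ({"opened_email": 1, "visited_on_weekend": 1}, 1),
    ({"opened_email": 1, "visited_on_weekend": 0}, 0),
    ({"opened_email": 0, "visited_on_weekend": 1}, 0),
    ({"opened_email": 0, "visited_on_weekend": 0}, 0),
    ({"opened_email": 0, "visited_on_weekend": 1}, 0),
    ({"opened_email": 0, "visited_on_weekend": 0}, 1),
]

print(best_split(customers))  # opened_email
```

Splitting on opened_email gives groups with 75% and 25% successes, while splitting on visited_on_weekend gives two 50/50 groups, so the first split is chosen. A full tree would repeat the same test inside each subset.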
Gradient Boosting Machine (GBM) – This is a sequential model: it builds a series of weak predictors, usually small decision trees, one after another. Once a model is constructed, the algorithm scores the results and determines what it predicted wrong and what it predicted right. Based on those errors, adjustments are made and another model is created. So each iteration “learns” from the previous iteration to reduce error, and the final prediction combines all of the iterations.
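The "learn from the previous iteration" loop can be sketched as follows. This is a simplified version that assumes squared error, where each new one-split tree is fitted to the errors (residuals) left by the model so far; real GBM libraries offer other loss functions and many more settings. The page view counts, the outcomes, the number of rounds, and the learning rate are all made up.

```python
def fit_stump(xs, residuals):
    """Find the threshold on x that best fits the residuals with two constants."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the average, then repeatedly fit a stump to the current errors."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

def mean_squared_error(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up data: football page views and whether the customer signed up.
page_views = [0, 1, 2, 3, 4, 5, 6, 7]
signed_up = [0, 0, 0, 1, 0, 1, 1, 1]

one_round = boost(page_views, signed_up, n_rounds=1)
twenty_rounds = boost(page_views, signed_up, n_rounds=20)
print(mean_squared_error(one_round, page_views, signed_up))
print(mean_squared_error(twenty_rounds, page_views, signed_up))
```

The error on the training data shrinks as rounds are added. In practice you would watch error on held-out data as well, since too many rounds can overfit.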
Neural Network – A neural network is another regression-type model. As the name implies, it is loosely inspired by the brain. Neural networks are made up of layers of interconnected ‘nodes.’ Information and patterns are fed into an input layer, but the actual processing of the information is done through hidden layers. After that, the information passes to the output layer. Neural networks learn from examples: the weights on the connections between nodes are adjusted to reduce prediction error.
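The flow from input layer to hidden layer to output layer can be sketched like this. The example shows only the forward pass, not the learning step, and assumes a sigmoid activation at every node. The weights, biases, and input values are invented; in a trained network they would be learned from examples.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per node, then a sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
            for node_weights, b in zip(weights, biases)]

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Input layer -> one hidden layer -> a single output node."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, [output_weights], [output_bias])[0]

# Made-up weights for a network with 2 inputs and 2 hidden nodes.
hidden_weights = [[0.8, -0.2], [0.1, 0.5]]
hidden_biases = [-1.0, 0.0]
output_weights = [1.5, -0.7]
output_bias = 0.2

# Hypothetical inputs: scaled football page views and scaled age.
probability = forward([2.0, 0.4], hidden_weights, hidden_biases,
                      output_weights, output_bias)
print(probability)
```

Because each hidden node applies a non-linear function, stacking them lets the network represent patterns a single linear formula cannot.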
Another model worth mentioning is the Support Vector Machine (SVM). SVMs are models used for classification and regression that analyze data and recognize patterns. A drawback is that they are difficult to build and have many “knobs and levers” to tweak during the modeling process, which takes time and effort.
Ensemble modeling is time-consuming since several models are developed instead of only one. But imagine that out of 500,000 people, you accurately identified just an extra 1% who sign up. That’s 5,000 additional people who are going to your site and driving more revenue and ad views while you target the exact same number of customers. What do 5,000 extra customers mean to you?
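The arithmetic behind that figure is easy to check. The function name below is made up for the example.

```python
def additional_signups(audience_size, extra_percentage_points):
    """Extra sign-ups from lifting the sign-up rate by whole percentage points."""
    return audience_size * extra_percentage_points // 100

print(additional_signups(500_000, 1))  # 5000
```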
You can find the first part of this blog here.