SmartDrill

Creative Data Mining


Cross-sell modeling: identifying the next product to sell from a product array

The following example demonstrates that a simple or "low-tech" methodology can sometimes yield more useful results than a sophisticated or "high-tech" one. Some time ago, we were approached by a financial services provider that had a large array of products and services that could be cross-sold to existing customers. The in-house analytic staff wanted to use a sophisticated multivariate analytic technique called factor analysis to determine, for any given customer household, what product or products should be marketed next.

Factor analysis examines patterns of correlation among variables. It can be used to identify which variables tend to be associated with one another while remaining unassociated with other variables. Factor analysis has been used productively for many decades and has many applications.

One such application is the analysis of data from a market research study that includes a large battery of attitude items. By applying factor analysis to the set of items, one can go beyond the individual items to identify important conceptual dimensions underlying the item set, which may be more actionable than the individual items themselves. For example, a battery of 20 attitude items measuring the importance of various product or service benefits for a retail category might reveal, via factor analysis, that there are really four underlying key benefit dimensions, such as price, quality, service and convenience.

The factor analytic technique identifies patterns of correlation among the attitude items that would be difficult for an unaided human to discern by examining reams of crosstabulations or a large correlation matrix. In this example, it identifies groups of items that tend to be answered the same way: strong agreement with one attitude item tends to be associated with strong agreement with other items in the same factor, but not with items from other factors. Factor analysis generates the pattern of underlying factors and assigns a coefficient (a factor "loading") to each variable for each factor. These loadings can range from -1.00 to +1.00, just as simple correlation coefficients do. A variable with a high positive loading on a factor is strongly associated with the underlying factor; a high negative loading indicates that the variable is inversely correlated with the factor; and a loading near zero indicates that the variable is unassociated with the factor.
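To make the idea of "patterns of correlation" concrete, the following minimal sketch computes a small correlation matrix in plain Python. The four attitude items and the eight respondents' ratings are invented for illustration and do not come from any actual study. The sketch does not perform a factor analysis; it only shows the kind of block structure in the correlations that a factor analysis would summarize as two factors (here, roughly "price" and "convenience").

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical 1-5 agreement ratings from eight respondents (illustrative only).
ratings = {
    "low_prices":    [5, 5, 1, 1, 5, 4, 2, 1],
    "good_deals":    [5, 4, 1, 2, 5, 5, 1, 1],
    "fast_checkout": [5, 1, 5, 1, 4, 2, 5, 1],
    "easy_parking":  [4, 1, 5, 2, 5, 1, 5, 1],
}

items = list(ratings)
# Full correlation matrix, keyed by (item_a, item_b).
corr = {(a, b): pearson(ratings[a], ratings[b]) for a in items for b in items}

# Print one row per item so the two blocks of related items are visible.
for a in items:
    print(f"{a:>14}", "  ".join(f"{corr[(a, b)]:+.2f}" for b in items))
```

In this made-up data the two price items correlate strongly with each other, as do the two convenience items, while correlations across the two groups are close to zero. With 20 items instead of four, spotting such blocks by eye becomes impractical, which is where factor analysis helps.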

In addition, respondents to the market research survey can be assigned factor scores. In the simplest approach, a respondent's score on a factor is a weighted sum of that respondent's standardized scores on the attitude items, with each item weighted by its loading on the factor. Further analysis can then be used to identify groups of respondents that are relatively price-conscious, quality-conscious, convenience-oriented, or service-oriented. By relating these respondent groups to the demographic items and other survey items, a very rich picture can emerge, and this picture can be valuable to retail planning staff as well as advertising agency creative staff, media planners and others.
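A loading-weighted factor score can be sketched as follows. This is a simplified illustration under stated assumptions: the loadings, item names and respondent ratings are all hypothetical, and the scores are plain loading-weighted sums of standardized item scores. Statistical packages typically offer more refined scoring methods (for example, regression-based score coefficients), so this should be read as an approximation rather than a standard implementation.

```python
from statistics import mean, pstdev

# Hypothetical loadings of each item on two factors (illustrative values only).
loadings = {
    "price":       {"low_prices": 0.90, "good_deals": 0.85,
                    "fast_checkout": 0.05, "easy_parking": 0.00},
    "convenience": {"low_prices": 0.05, "good_deals": 0.00,
                    "fast_checkout": 0.88, "easy_parking": 0.90},
}

# Hypothetical 1-5 ratings for four respondents.
responses = {
    "r1": {"low_prices": 5, "good_deals": 5, "fast_checkout": 1, "easy_parking": 2},
    "r2": {"low_prices": 1, "good_deals": 2, "fast_checkout": 5, "easy_parking": 5},
    "r3": {"low_prices": 5, "good_deals": 4, "fast_checkout": 5, "easy_parking": 4},
    "r4": {"low_prices": 1, "good_deals": 1, "fast_checkout": 1, "easy_parking": 1},
}

def factor_scores(responses, loadings):
    """Loading-weighted sums of standardized item scores, per respondent and factor."""
    items = list(next(iter(responses.values())))
    z = {rid: {} for rid in responses}
    for item in items:
        values = [r[item] for r in responses.values()]
        mu, sigma = mean(values), pstdev(values)
        for rid, r in responses.items():
            z[rid][item] = (r[item] - mu) / sigma  # standardize within the sample
    return {
        rid: {factor: sum(w[i] * zi[i] for i in items)
              for factor, w in loadings.items()}
        for rid, zi in z.items()
    }

scores = factor_scores(responses, loadings)
for rid, s in scores.items():
    print(rid, {factor: round(value, 2) for factor, value in s.items()})
```

In this toy data, r1 scores high on the price factor and low on convenience, r2 shows the reverse, r3 is high on both, and r4 is low on both. Grouping respondents by such scores is the kind of "further analysis" the paragraph above refers to.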

Our client's analytic staff therefore felt that such an approach could be useful for trying to understand cross-sell opportunities: by examining the results of a factor analysis performed on product/service usage, one might be able to determine what product to sell next to any given customer household, based on what arrays of products tend to be found in the same households. While at first glance this exercise might seem worthwhile, the approach has some serious drawbacks.

One problem is that the factor analysis described above uses current product/service ownership as the starting point. What we really want to do is identify households for which a particular product or service is appropriate as the next one to sell. An effective cross-selling model should therefore use recent product or service purchase behavior, rather than current ownership, as the criterion behavior to be predicted. Factor analysis is not well geared to this task: it is good at describing which arrays of products or services tend to be owned together, but not at identifying which single product should be sold next.

Another drawback of this technique is that one cannot readily apply the results of the factor analysis to new or prospective customers. For example, if one bank acquires another bank and assimilates its customer base, the existing factor analysis cannot readily be used to "score" the new customers. The safest course would be to start over, either by factor analyzing the new customers separately (because they may well produce a factor structure different from the existing one) or by performing a re-analysis that includes both the existing and new customers. This process becomes cumbersome.

There are other problems associated with trying to use factor analysis for cross-selling, but we will not cover them here. Instead, we will describe a much simpler approach, which not only identifies cross-sell opportunities, but also allows new customers to be scored from the existing targeting model as they are acquired. Thus, once the model is created, it has broader application and a longer life than would be the case with factor analysis.

The approach we recommend is to develop a separate targeting model for each possible product or service type. This would be a predictive model because it would use recent product/service acquisition as the dependent variable. It would use other product usage and syndicated demographic and lifestyle data as predictors. Any of a number of analytic techniques could be used, including a regression-type model, a classification-tree approach, or a neural network analysis.
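One way to set up the dependent variable and the product-usage predictors for a single product's model is sketched below. The record layout, the function name `build_training_row`, the product names and the 183-day window are all hypothetical choices made for illustration. The sketch also assumes two design decisions that the text does not spell out: households that already held the target product before the window are excluded, and only products held before the window are used as predictors, so that the predictors precede the purchase being predicted.

```python
from datetime import date, timedelta

def build_training_row(accounts, product, as_of, window_days=183):
    """Return (prior_products, target) for one household, or None if excluded.

    accounts: list of {"product": str, "opened": date} records (hypothetical layout).
    target is 1 if `product` was acquired in the window ending at `as_of`, else 0.
    """
    window_start = as_of - timedelta(days=window_days)
    opened = [a["opened"] for a in accounts
              if a["product"] == product and a["opened"] <= as_of]
    if opened and min(opened) < window_start:
        # Assumption: long-standing owners are not cross-sell candidates.
        return None
    target = 1 if opened else 0
    # Predictors: other products already held before the window began.
    prior_products = sorted({a["product"] for a in accounts
                             if a["product"] != product
                             and a["opened"] < window_start})
    return prior_products, target

# Hypothetical households and account histories.
households = {
    "hh_a": [{"product": "savings", "opened": date(2005, 4, 1)},
             {"product": "mutual_fund", "opened": date(2008, 3, 1)}],
    "hh_b": [{"product": "checking", "opened": date(2003, 9, 15)},
             {"product": "credit_card", "opened": date(2008, 5, 1)}],
    "hh_c": [{"product": "checking", "opened": date(2004, 2, 20)},
             {"product": "mutual_fund", "opened": date(2006, 1, 10)}],
}

rows = {hh: build_training_row(accounts, "mutual_fund", date(2008, 6, 30))
        for hh, accounts in households.items()}
for hh, row in rows.items():
    print(hh, row)
```

In practice the syndicated demographic and lifestyle variables would be appended to each row as further predictors, and the same construction would be repeated for each product or service to be modeled.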

Here we might recommend a classification-tree technique, because it not only generates a predictive model, but also provides us with a clear picture of the market structure for targeting a particular product. For example, a nominal CHAID segmentation model could employ as a dichotomous dependent variable a categorization of households as either having purchased or not having purchased a particular product in the previous six months. By using pre-existing product ownership, demographics and lifestyle variables as predictors, we can generate not only a powerful scoring model for cross-selling, but also a richer understanding of why a household's characteristics indicate that the household is a good candidate for targeting.
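The core idea behind a CHAID split can be illustrated with a deliberately simplified sketch: for each candidate predictor, compute the chi-square statistic of its cross-tabulation with the dichotomous purchase variable, and split on the predictor with the strongest association. This is not a full CHAID implementation. It omits category merging, significance testing with adjusted p-values, and recursive splitting, and it compares raw chi-square statistics, which is only reasonable here because every predictor is binary (so all tables have the same degrees of freedom). The households and variable names are invented for illustration.

```python
from collections import Counter

def chi_square(rows, predictor, target):
    """Pearson chi-square statistic for the predictor-by-target contingency table."""
    n = len(rows)
    cell = Counter((r[predictor], r[target]) for r in rows)
    pred_total = Counter(r[predictor] for r in rows)
    targ_total = Counter(r[target] for r in rows)
    stat = 0.0
    for p, p_n in pred_total.items():
        for t, t_n in targ_total.items():
            expected = p_n * t_n / n
            stat += (cell[(p, t)] - expected) ** 2 / expected
    return stat

# Hypothetical households: (has_savings, has_mortgage, bought_fund in last 6 months).
raw = [(1, 1, 1), (1, 0, 1), (1, 1, 1), (1, 0, 1), (1, 1, 0), (1, 0, 1),
       (0, 1, 0), (0, 0, 0), (0, 1, 0), (0, 0, 0), (0, 1, 1), (0, 0, 0)]
rows = [dict(zip(("has_savings", "has_mortgage", "bought_fund"), r)) for r in raw]

predictors = ["has_savings", "has_mortgage"]
stats = {p: chi_square(rows, p, "bought_fund") for p in predictors}
best = max(stats, key=stats.get)  # predictor most strongly related to purchase

# Purchase rate within each segment produced by the chosen split.
rates = {}
for value in (0, 1):
    segment = [r for r in rows if r[best] == value]
    rates[value] = sum(r["bought_fund"] for r in segment) / len(segment)

print("split on:", best)
print("chi-square by predictor:", stats)
print("purchase rate by segment:", rates)
```

The segment purchase rates are what make a tree model easy to interpret: each terminal segment is described by a few household characteristics and carries its own observed response rate, which can serve as the household's score for that product.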

Each household is scored separately on each product's or service's model. The model scores are then ranked from high to low within each household, and the next product to sell to a particular household is the highest-scoring product among the candidate products that the household has not yet acquired.
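The ranking step can be sketched in a few lines. The household identifiers, product names and scores below are hypothetical, as is the function name `next_product`. The sketch also assumes that the scores from the separate models are on a comparable scale (for example, predicted purchase probabilities); if they are not, some form of rescaling would be needed before ranking across products.

```python
def next_product(scores, owned):
    """Return the highest-scoring product the household does not yet own, or None."""
    candidates = {product: score for product, score in scores.items()
                  if product not in owned}
    if not candidates:
        return None  # the household already holds every candidate product
    return max(candidates, key=candidates.get)

# Hypothetical per-product model scores for three households.
model_scores = {
    "hh_a": {"mutual_fund": 0.42, "home_equity": 0.17, "credit_card": 0.63},
    "hh_b": {"mutual_fund": 0.08, "home_equity": 0.31, "credit_card": 0.12},
    "hh_c": {"mutual_fund": 0.55, "home_equity": 0.40, "credit_card": 0.71},
}

# Products each household already holds (hypothetical).
owned = {
    "hh_a": {"credit_card"},
    "hh_b": set(),
    "hh_c": {"mutual_fund", "home_equity", "credit_card"},
}

recommendations = {hh: next_product(model_scores[hh], owned[hh])
                   for hh in model_scores}
print(recommendations)
```

Because each household is scored from the existing models, a newly acquired customer base can be run through the same function as soon as its predictor data is available, without rebuilding the models.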

As we said earlier, this is a relatively simple procedure, but it turns out to be more effective and easier to implement than some of the more sophisticated approaches one might try to use. The moral is that simplicity is sometimes better than sophistication.

Copyright © 1998-2008 SmartDrill. All rights reserved.