Creative Data Mining
Cross-sell modeling: identifying the next product to sell from a product array
The following example demonstrates that, sometimes, a simple or "low-tech"
methodology can yield more useful results than a sophisticated or "high-tech"
methodology. Some time ago, we were approached by a financial services
provider who had a large array of products and services that could be
cross-sold to existing customers. The in-house analytic staff wanted to
use a sophisticated multivariate analytic technique called factor analysis
to determine, for any given customer household, what product or products
should be marketed next.
Factor analysis examines patterns of correlation among variables,
and can be used to better understand which variables tend to be associated
with one another, while at the same time not being associated with
other variables. Factor analysis has been used productively for many decades.
Among its applications, factor analysis can be used
to analyze data from a market research study which may include a large
battery of attitude items. By applying factor analysis to the set of items,
one can go beyond the individual items to identify important conceptual
dimensions underlying the item set which may be more actionable than the
individual items themselves. For example, a battery of 20 attitude items
measuring the importance of various product or service benefits for a
retail category might reveal, via factor analysis, that there are really
four underlying key benefit dimensions, such as price, quality, service
and convenience.
The factor analytic technique identifies patterns of correlation among
the attitude items that would be difficult for an unaided human to discern
by examining reams of crosstabulations or a large correlation matrix.
In this example, it identifies groups of items that tend to be answered
the same way (strong agreement with one attitude item tends to be associated
with strong agreement with other items in the factor, but not with items
from other factors). Factor analysis generates the pattern of underlying
factors, and assigns coefficients (factor "loadings") to each
variable for each factor. These loadings may range from -1.00 to +1.00
(just as simple correlation coefficients do). A variable with a high positive
loading on a factor is strongly associated with the underlying factor;
a high negative loading indicates that the variable is inversely correlated
with the factor; and a loading near zero indicates that the variable is
unassociated with the factor.
In addition, respondents to the market research survey can be assigned
factor scores (in the simplest case, by weighting the respondent's scale score on each
attitude item by the item's factor loading and summing across items), and further analysis can
be used to identify groups of respondents that are relatively price-conscious,
quality-conscious, convenience-oriented, and service-oriented. By relating
these respondent groups to the demographic items and other survey items,
a very rich picture can emerge, and this picture can be valuable to retail
planning staff as well as advertising agency creative staff, media planners
and others.
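The scoring step just described can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure of any particular statistics package: the item names, the loadings, and the respondent's standardized scores are all hypothetical, and we assume the simple loading-weighted sum described above (packages typically derive score coefficients by regression instead).

```python
# Hypothetical loadings of four attitude items on two factors (illustrative only).
LOADINGS = {
    "low_prices":     {"price": 0.80, "service": 0.05},
    "frequent_sales": {"price": 0.70, "service": 0.10},
    "helpful_staff":  {"price": 0.05, "service": 0.85},
    "easy_returns":   {"price": 0.10, "service": 0.75},
}


def factor_scores(item_scores, loadings=LOADINGS):
    """Return, per factor, the loading-weighted sum of standardized item scores."""
    scores = {}
    for item, z in item_scores.items():
        for factor, loading in loadings[item].items():
            scores[factor] = scores.get(factor, 0.0) + loading * z
    return scores


def dominant_factor(item_scores, loadings=LOADINGS):
    """Return the factor on which the respondent scores highest."""
    scores = factor_scores(item_scores, loadings)
    return max(scores, key=scores.get)


# A hypothetical respondent, with item scores already standardized (z-scores).
respondent = {
    "low_prices": 1.2,
    "frequent_sales": 0.8,
    "helpful_staff": -0.5,
    "easy_returns": -0.3,
}

print(factor_scores(respondent))
print(dominant_factor(respondent))
```

A respondent whose highest score falls on the price factor would be grouped with the relatively price-conscious respondents; in practice the grouping would more likely come from a cluster analysis of the full set of factor scores.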
Our client's analytic staff therefore felt that such an approach could
be useful for trying to understand cross-sell opportunities: by examining
the results of a factor analysis performed on product/service usage, one
might be able to determine what product to sell next to any given customer
household, based on what arrays of products tend to be found in the same
households. And while at first glance this exercise might seem worthwhile,
there are some serious drawbacks to such an approach.
One problem is that the factor analysis as described above uses current
product/service ownership as the starting point. But what we really want
to do is identify households for which a particular product or service
is appropriate as the next product to sell to that household. Therefore,
an effective cross-selling model should use recent product or service
purchase behavior, instead of current product or service ownership, as
the criterion behavior to be predicted. But factor analysis is not well
geared to this task. It is good at identifying current ownership of product
or service arrays, but not at identifying which single product should
be sold next.
Another drawback of this technique is that one cannot readily apply
the results of the factor analysis to new or prospective customers. For
example, if one bank acquires another bank and assimilates the other bank's
customer base, the factor analysis cannot be applied readily to "score"
the new customer base. The safest way to apply factor analysis to the
new customers would be to start over again and either factor analyze them
separately, because they may well produce a factor structure that is different
from the existing factor structure, or perform a re-analysis that includes
both the existing and new customers. Either way, the process becomes cumbersome.
There are other problems associated with trying to use factor analysis
for cross-selling, but we will not cover them here. Instead, we will describe
a much simpler approach, which not only identifies cross-sell opportunities,
but also allows new customers to be scored from the existing targeting
model as they are acquired. Thus, once the model is created, it has broader
application and a longer life than would be the case with factor analysis.
The approach we recommend is to develop a separate targeting model
for each possible product or service type. This would be a predictive
model because it would use recent product/service acquisition as the dependent
variable. It would use other product usage and syndicated demographic
and lifestyle data as predictors. Any of a number of analytic techniques
could be used, including a regression-type model, a classification-tree
approach, or a neural network analysis.
Here we might recommend a classification-tree technique, because it
not only generates a predictive model, but also provides us with a clear
picture of the market structure for targeting a particular product. For
example, a nominal CHAID segmentation model could employ as a dichotomous
dependent variable a categorization of households as either having purchased
or not having purchased a particular product in the previous six months.
By using pre-existing product ownership, demographics and lifestyle variables
as predictors, we can generate not only a powerful scoring model for cross-selling,
but also a richer understanding of why a household's characteristics indicate
that the household is a good candidate for targeting.
Each household is scored on each model separately, using a separate
model for each product or service. Then the model scores are ranked from
high to low within each household. The next product to sell to a particular
household is then the highest-scoring product from among the
array of candidate products that the household has not yet acquired.
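The selection rule above is simple enough to sketch directly. The Python below is a minimal illustration with hypothetical household IDs, product names, and model scores. It also assumes that the scores from the separate models are comparable across products (for example, calibrated purchase probabilities), which would need to be checked in practice.

```python
def next_product(scores, owned):
    """Return the highest-scoring product the household does not yet own, or None."""
    candidates = {p: s for p, s in scores.items() if p not in owned}
    if not candidates:
        return None  # the household already holds every candidate product
    return max(candidates, key=candidates.get)


# Hypothetical output of the per-product targeting models, one score per product.
households = {
    "H001": {
        "scores": {"credit_card": 0.62, "mortgage": 0.15, "ira": 0.40},
        "owned": {"credit_card"},
    },
    "H002": {
        "scores": {"credit_card": 0.30, "mortgage": 0.45, "ira": 0.10},
        "owned": set(),
    },
    "H003": {
        "scores": {"credit_card": 0.55, "mortgage": 0.25, "ira": 0.35},
        "owned": {"credit_card", "mortgage", "ira"},
    },
}

recommendations = {
    household_id: next_product(record["scores"], record["owned"])
    for household_id, record in households.items()
}
print(recommendations)
```

Because each household is scored from its own characteristics, newly acquired customers can be passed through the same function as soon as their model scores are available.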
As we said earlier, it is a relatively simple procedure, but it turns
out to be much more effective and easier to implement than some of the more
sophisticated approaches one might try to use. The moral is that simplicity
is sometimes better than sophistication.