Tips and White Papers

Here we will be providing brief tips about data mining, as well as some white papers covering particular topics in more depth. Check back often, as we continue to add material to this new section.

White Paper:
Using multiple modeling techniques on the same data set

Tips

For a more detailed discussion of these and other related topics, please read our longer White Paper about Creative Data Mining.


Using Syndicated Data

Many people who buy syndicated data to overlay on their proprietary databases rotate among a variety of syndicators, in order to keep each syndicator "hungry." But is this always the best strategy? Some syndicators' data are actually better than others' (e.g., in terms of having less missing data), particularly for certain product or service categories. (In the interest of fairness, however, we will not single out any one syndicator here.)

And if you use the same syndicator over a period of time, you can often get discounts on overlays. The syndicator also gets to know you better, can be more helpful with advice, and tends to be quicker to fix errors and go the extra distance for a loyal customer.

Also, if you do not yet have a large marketing database up and running, but could benefit from data mining to help you with market segmentation, targeting or list selection, there are several creative alternatives you might want to investigate. For example, some of the same survey-based syndicated services that supply the data used by market researchers, media researchers and media planners will also sell you record-level data (i.e., household- or respondent-level data records) for limited research usage purposes.

You can buy record-level data from national surveys for just your industry, and these data often include a great deal of detail on your brand as well as competitors' brands. In addition to category and brand usage data, you get the demographic data that the syndicators routinely collect from the same respondents who provide the survey data on product usage.

This is almost as good as having your own in-house marketing database, and at a fraction of the cost. (And, in some respects, these data can actually be better than data in a proprietary marketing database, because you get information on competitors' customers as well as your own.) The data are available in formats that allow easy importing into statistical and data mining analytic packages, for analysis by either your own staff or outside consultants.

Other syndicators that run ongoing omnibus panel surveys, often used by market researchers as a cost-efficient alternative to customized tracking or market definition studies, are another option. As with the media data syndicators, you can get detailed demographic and lifestyle data, but you can also add some customized, proprietary questions to the mail or telephone panel survey to suit your needs.


Advanced Data Mining of Proprietary Market Research Data

If you have proprietary data from a large-scale market definition or tracking study, you can often use these data for advanced data mining. Many times, the research suppliers who conduct, tabulate and report the results of these customized, client-proprietary research studies base their research reports on simple banner-and-stub crosstabulations of the data. (If you have read the Analytic Techniques section of our web site, then you already know how limited, and even potentially misleading, such simple bivariate analyses of data can be.)

By re-analyzing these studies, usually for a fraction of their original cost, you can extract much richer and more actionable knowledge than you got with the original analysis and report. SmartDrill staff have performed many such re-analyses of data that were just collecting dust, and have wowed clients with the new understanding gleaned from these studies.


Using Advanced Data Mining Techniques to Create a Bridge Between Proprietary Market Research Data and Large-scale Geo-demographic Targeting Analysis

Did you know that with advanced data mining techniques your existing survey research data can often perform double duty as a bridge to larger-scale geo-demographic analysis? For example, a retailer often has attitude and usage data, as well as key demographic data, from a recently conducted market research study. The results of various advanced data mining analyses of these data can be meaningfully projected onto units of micro-geography, to assist management with retail site selection, promotion targeting, etc.

You don't have to pay a geo-demographic syndicator a large fee for geocoding the data, overlaying their proprietary clustering codes, and analyzing the enhanced data. Instead, you can use much less expensive census data (which many retailers already have in-house) in conjunction with your own market research data, to achieve powerful results.

Here's a simple example. Let's say that you have conducted a survey that includes items measuring customer loyalty, heaviness of spending, or usage of a particular retail department. If your survey also includes standard, key demographic classification questions, then you can use advanced data mining techniques to build a predictive model. The dependent variable could be any of the aforementioned measures (loyalty, spending or department usage), and the demographics are the predictors in the model. Once you have a satisfactory model, you can use the results of the model to score units of micro-geography, much the same as you would use modeling results to score households (or businesses) in a customer or prospect database.
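As a rough illustration of the modeling step, here is a minimal Python sketch. It assumes a single demographic predictor (age group) and a numeric loyalty score; the survey records, category labels and the function name fit_dummy_model are all hypothetical. With one dummy-coded categorical predictor, ordinary least squares reduces to group means, so no statistics package is needed here. A real model would normally include several demographics and would be fit with a proper statistical package.

```python
from collections import defaultdict

# Hypothetical survey records: (age_group, loyalty_score on a 0-10 scale).
survey = [
    ("18-34", 4.0), ("18-34", 6.0),
    ("35-54", 7.0), ("35-54", 9.0),
    ("55+", 5.0), ("55+", 7.0),
]

def fit_dummy_model(records, reference):
    """Least-squares fit with one dummy-coded categorical predictor.

    For this special case the solution is exact:
    intercept   = mean score of the reference group
    coefficient = group mean minus the reference-group mean
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, score in records:
        totals[group] += score
        counts[group] += 1
    means = {group: totals[group] / counts[group] for group in totals}
    intercept = means[reference]
    coefs = {g: m - intercept for g, m in means.items() if g != reference}
    return intercept, coefs

intercept, coefs = fit_dummy_model(survey, reference="18-34")
print(intercept, coefs)
```

In this made-up data the 35-54 group gets the largest coefficient, which is the kind of result that the scoring step then carries over to geography.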

The trick is to translate the demographics from the survey respondent level to the micro-geographic level. Again, to use a simple example, let's say that you have discovered from the survey-based model that particular age groups are more loyal (or heavier spenders, etc.) than other age groups. Instead of scoring a household-level or business-level file using the various categories of age, you can weight the model coefficients by the proportions of a micro-geographic unit's population falling into each age group.

For example, the age-group coefficients from a regression model can be multiplied by the proportions of a micro-geographic unit's population falling into the respective age groups, and the products summed. If a particular age group has a strong coefficient, and/or it represents a disproportionately large share of the unit's population, then that unit will achieve a higher model score.
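The weighting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the intercept, the age-group coefficients, the two census tracts and the function name score_unit are hypothetical, and the model is assumed to be a simple linear regression with the youngest group as the reference category (coefficient 0).

```python
# Hypothetical coefficients from a survey-based regression
# (reference group "18-34" has coefficient 0 by construction).
INTERCEPT = 5.0
AGE_COEFS = {"18-34": 0.0, "35-54": 3.0, "55+": 1.0}

def score_unit(age_shares, intercept=INTERCEPT, coefs=AGE_COEFS):
    """Score one micro-geographic unit.

    score = intercept + sum(coefficient * population share) over age groups
    """
    if abs(sum(age_shares.values()) - 1.0) > 1e-9:
        raise ValueError("age shares must sum to 1")
    return intercept + sum(coefs[g] * share for g, share in age_shares.items())

# Hypothetical census proportions for two units.
tract_a = {"18-34": 0.50, "35-54": 0.25, "55+": 0.25}
tract_b = {"18-34": 0.25, "35-54": 0.50, "55+": 0.25}

print(score_unit(tract_a))  # 6.0
print(score_unit(tract_b))  # 6.75
```

Tract B scores higher because a larger share of its population falls into the age group with the strongest coefficient.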

The scoring proceeds in the same way for the categories of the other demographic variables in the survey research-based model. After all micro-geographic units of interest have been scored with all model parameters, standard tabulation and mapping routines can be used to perform retail siting, promotion targeting, and even plan-o-gramming.
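Extending the idea to several demographic variables and ranking the units might look like the following Python sketch. Everything named here is hypothetical (the MODEL coefficients, the tract names, the census shares, and the functions score_unit_all and rank_units). The sketch also assumes an additive, main-effects-only linear model, so that each variable's marginal census distribution is enough to compute a score; a model with interactions or a non-linear link would need more care.

```python
# Hypothetical model: each variable's reference category has coefficient 0.
MODEL = {
    "intercept": 5.0,
    "age": {"18-34": 0.0, "35-54": 3.0, "55+": 1.0},
    "income": {"low": 0.0, "mid": 1.0, "high": 2.0},
}

# Hypothetical census proportions for three micro-geographic units.
UNITS = {
    "tract_101": {
        "age": {"18-34": 0.50, "35-54": 0.25, "55+": 0.25},
        "income": {"low": 0.50, "mid": 0.25, "high": 0.25},
    },
    "tract_102": {
        "age": {"18-34": 0.25, "35-54": 0.50, "55+": 0.25},
        "income": {"low": 0.25, "mid": 0.25, "high": 0.50},
    },
    "tract_103": {
        "age": {"18-34": 0.25, "35-54": 0.25, "55+": 0.50},
        "income": {"low": 0.75, "mid": 0.25, "high": 0.00},
    },
}

def score_unit_all(shares, model=MODEL):
    """Intercept plus share-weighted coefficients, summed over all variables."""
    score = model["intercept"]
    for variable, coefs in model.items():
        if variable == "intercept":
            continue
        score += sum(coefs[cat] * share for cat, share in shares[variable].items())
    return score

def rank_units(units, model=MODEL):
    """Return (unit name, score) pairs sorted from highest to lowest score."""
    scored = {name: score_unit_all(shares, model) for name, shares in units.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

ranking = rank_units(UNITS)
for name, score in ranking:
    print(name, score)
```

The ranked list is the kind of table that can then be joined to boundary files for mapping, or tabulated by trade area.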

It's a nifty trick, and it works even better if you plan ahead by designing your market research to include demographic items that have the same category breaks that standard census variables have. And if you already have a site license that allows you to use cluster codes from one or more of the popular syndicators such as Claritas (PRIZM), Donnelley, etc., then that's even better. The point is that advanced data mining techniques can significantly improve the knowledge discovery and application process, whether or not you have a site license for a proprietary geo-demographic and lifestyle targeting system.
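To illustrate the point about matching category breaks, here is a small Python sketch that codes respondents' raw ages into categories aligned with a census table. The break points and labels below are hypothetical placeholders; in practice you would copy them from the specific census table you intend to use. The function name census_age_group is likewise only illustrative.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical lower bounds of the age categories in the census table being used.
CENSUS_AGE_BREAKS = [18, 25, 35, 45, 55, 65]
CENSUS_AGE_LABELS = ["18-24", "25-34", "35-44", "45-54", "55-64", "65+"]

def census_age_group(age):
    """Map a respondent's age in years to the matching census-style category."""
    if age < CENSUS_AGE_BREAKS[0]:
        raise ValueError("age is below the youngest category")
    # bisect_right counts how many lower bounds are <= age.
    return CENSUS_AGE_LABELS[bisect_right(CENSUS_AGE_BREAKS, age) - 1]

# Hypothetical respondent ages from the survey file.
respondent_ages = [19, 24, 25, 40, 54, 55, 70]
counts = Counter(census_age_group(age) for age in respondent_ages)
print(counts)
```

If the questionnaire records age only in pre-coded brackets, the same alignment has to be built into the questionnaire itself, since brackets that straddle a census break cannot be split afterward.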

 


Copyright © 1998-2007 SmartDrill. All rights reserved.