# Enhance Lists With Overlay Data, Part 2

Part 1 of this article (DM News, April 7) showed how demographic overlay data can improve your company's top and bottom lines. However, the profile reports generated by the overlay process are tougher to interpret than one might think. Several analytical traps frequently snare the untrained. This month's article focuses on how to evaluate profile reports. It also provides insight into how to judge the quality of the reports supplied by your vendor. Many of the profile products on the market today have significant shortcomings.

The article is based on a real analysis that was run for a specific automobile model. Because the analysis was done more than a decade ago, we are able to share one of the profile tables. In this table, the buyers of coupes were profiled on age of head of household, then compared with buyers of sedans as well as with a sample of the entire U.S. population.

For each of 14 age ranges, the quantity of coupe buyers is provided, followed by each range's percentage of the 62,492 coupe buyers for which the age variable was available. A critical component is the breakout of coupe buyers for which age was unavailable. This group numbers 50,230, or 44.6 percent of the 112,722 overall coupe buyers.
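The arithmetic behind these counts is simple enough to check by hand, and it is worth doing for any profile report you receive. The short Python sketch below reproduces it from the figures quoted above. The variable names are illustrative only and do not come from the original report.

```python
# Counts quoted in the article for coupe buyers.
coded_buyers = 62_492      # buyers for whom age was available
uncoded_buyers = 50_230    # buyers for whom age was unavailable

# The overall buyer count is the sum of the two groups.
total_buyers = coded_buyers + uncoded_buyers

# Share of all coupe buyers who could not be coded with age.
uncoded_share_pct = 100 * uncoded_buyers / total_buyers

print(f"Total coupe buyers: {total_buyers:,}")
print(f"Share without age data: {uncoded_share_pct:.1f}%")
```

If a vendor's report does not let you perform this check, because the count of noncoded records is not shown, that is a warning sign in itself.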

As was explained in the previous article, individuals who cannot be coded with a given data element almost always differ demographically from those who can. This is because representation on major overlay databases is skewed toward older individuals who live in more geographically stable households. Noncoded individuals tend to be younger and more mobile. If this skew is not accounted for, the resulting profiles will be misleading. The next two columns - under Coupe Estimate - illustrate why this is true.

In the first of these columns, an approximation algorithm was used to allocate, across the 14 age ranges, the 44.6 percent of buyers for whom no age information exists. This is a sophisticated process whose mechanics are beyond the scope of this article. The resulting allocations can be remarkably accurate, and are applicable to a number of overlay data elements. However, many commercially available profile products do not include such estimates.

Compare the estimated Percent of Total column with the actual Percent of Total. For example, 18-24-year-olds account for an estimated 24.6 percent of the 112,722 overall coupe buyers. This is drastically higher than the "actual" 7.3 percent of the 62,492 age-coded buyers. Clearly, focusing on the 7.3 percent would be very misleading from a marketing perspective. The "actuals" for the 18-24 and 25-29 age ranges reflect significant under-representation. All other ranges reflect over-representation.
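One way to appreciate the size of this gap is to back out what the estimate implies about the buyers who could not be coded. The sketch below does this for the 18-24 range, using only the percentages and counts quoted in this article. It is a rough illustration and not the approximation algorithm itself. Because the inputs are rounded percentages, the result is approximate, and the function name is hypothetical.

```python
def implied_uncoded_share_pct(actual_pct, estimated_pct, coded, total):
    """Back out the share of noncoded buyers that falls in one age range.

    actual_pct    -- the range's percent of the age-coded buyers
    estimated_pct -- the range's estimated percent of all buyers
    coded, total  -- counts of age-coded buyers and of all buyers
    """
    uncoded = total - coded
    coded_in_range = actual_pct / 100 * coded        # buyers observed in the range
    all_in_range = estimated_pct / 100 * total       # buyers estimated in the range
    uncoded_in_range = all_in_range - coded_in_range # remainder allocated to noncoded
    return 100 * uncoded_in_range / uncoded


# Figures quoted in the article for the 18-24 age range.
share_18_24 = implied_uncoded_share_pct(
    actual_pct=7.3, estimated_pct=24.6, coded=62_492, total=112_722
)
print(f"Implied 18-24 share of noncoded buyers: {share_18_24:.1f}%")
```

In other words, the estimate implies that a little under half of the noncoded coupe buyers are 18-24, which is consistent with the point that noncoded individuals tend to be younger and more mobile.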

The next two columns compare coupe vs. sedan buyers. The ratio of 205, for example, means that the estimated 24.6 percent of coupe buyers who are 18-24 is more than twice that of sedan buyers. Interest in coupes vs. sedans is highest at 18-24, then consistently declines to a ratio of 77 at 35-39. Then there is rising interest to 45-49, where a secondary peak ratio of 123 occurs. Next, there is a long decline to ages 75-79. Finally, there appears to be a small uptick beginning with 80-84-year-olds, though the corresponding confidence statistics suggest that this is not clear-cut.
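For readers unfamiliar with index-style ratios, the sketch below assumes the conventional definition: the target group's percent of total divided by the comparison group's percent of total, multiplied by 100. The article does not spell out the formula, so treat this as an assumption. The sedan percentage shown is back-derived from the published ratio of 205 and is not a figure taken from the table.

```python
def index_ratio(target_pct, base_pct):
    """Index of a target group against a base group; 100 means parity."""
    return 100 * target_pct / base_pct


coupe_pct_18_24 = 24.6   # estimated percent of coupe buyers, from the article
published_ratio = 205    # coupe vs. sedan ratio, from the article

# Back-derived assumption: the sedan percent that would produce the ratio.
implied_sedan_pct = 100 * coupe_pct_18_24 / published_ratio

print(f"Implied sedan percent, 18-24: {implied_sedan_pct:.1f}%")
print(f"Recomputed ratio: {index_ratio(coupe_pct_18_24, implied_sedan_pct):.0f}")
```

A ratio above 100 means the age range is more concentrated among coupe buyers than among sedan buyers, and a ratio below 100 means the opposite.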

From a lifestyle perspective, this makes sense. The young, who often are single and have no children, find two-door vehicles more appealing than four doors. So, too, do 45-54-year-olds, who buy the coupes as secondary vehicles for themselves or their teen-age children. There is lower interest in coupes during the prime child-rearing years, when a premium is placed on four-door practicality. Interest also declines in the senior years, because the two-door design can be difficult to enter and exit.

Ratios are particularly important in the absence of an approximation algorithm to adjust for missing data and generate estimated "Percent of Totals." This is because unadjusted "actuals," though suspect from an absolute perspective, can be compared with each other to create valid relative metrics. For example, consider again that it is misleading to conclude that 7.3 percent of overall coupe buyers are 18-24. Yet even in the absence of the adjustment to 24.6 percent, we can be certain that the concentration of coupe buyers in the 18-24 age range is twice that of sedan buyers.

The second column in this "comparison against sedan" section is a confidence statistic, which is based on a Z-score. The confidence statistic is frequently misunderstood. Consider, for example, the 99 percent confidence that is associated with the 18-24 age range. It does not translate to 99 percent confidence that the coupe-to-sedan ratio is exactly 205. Instead, it quantifies the likelihood that the percent of total for coupe buyers is statistically different from the percent of total for sedan buyers.

Confidence is a function of the similarity between the percent of total coupe and percent of total sedan buyers, and of their corresponding sample sizes. By definition, a ratio of 100 will have a 0 percent confidence, regardless of the sample size. Notice that the 40-44 age range has a confidence of only 45 percent. The main reason is that its ratio of 98 is very close to 100. Conversely, the primary driver for the low confidences associated with the 80-84 and 85+ ranges is small sample size. Therefore, the apparent uptick in their ratios might be merely a statistical mirage.
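Both effects can be seen in a small sketch. The article does not give the exact formula behind its confidence statistic, so the code below assumes a standard pooled two-proportion Z-test and reports confidence as one minus the two-sided p-value. The function name and all of the counts in the usage lines are hypothetical and chosen only for illustration.

```python
from math import erf, sqrt


def confidence_pct(x1, n1, x2, n2):
    """Confidence (0-100) that two proportions differ, via a pooled Z-test.

    x1, n1 -- buyers in the age range and total buyers, first group
    x2, n2 -- buyers in the age range and total buyers, second group
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if std_err == 0:
        return 0.0  # no variation at all, so no evidence of a difference
    z = (p1 - p2) / std_err
    # For a standard normal, 1 minus the two-sided p-value is erf(|z| / sqrt(2)).
    return 100 * erf(abs(z) / sqrt(2))


# Hypothetical counts. Identical proportions give a ratio of 100.
same_ratio = confidence_pct(100, 1_000, 200, 2_000)

# Hypothetical counts. The same 12 percent vs. 10 percent gap at two sample sizes.
small_sample = confidence_pct(12, 100, 10, 100)
large_sample = confidence_pct(1_200, 10_000, 1_000, 10_000)

print(f"Ratio of 100: {same_ratio:.0f}% confidence")
print(f"Small samples: {small_sample:.0f}% confidence")
print(f"Large samples: {large_sample:.0f}% confidence")
```

The same gap between two percentages yields low confidence when the counts are small and very high confidence when they are large, which is why sparsely populated ranges deserve extra caution.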

The final column compares the age distributions of coupe buyers versus a sample of the U.S. population. Coupe buyers display higher ratios among 18-29 and 40-59-year-olds. The peaks and valleys for the U.S.-derived ratios are often more extreme than for the sedan-derived.

Demographic overlay data can improve your company's top and bottom lines. But without robust profile reports, and their knowledgeable interpretation, it is easy to draw inaccurate marketing conclusions. Be particularly mindful of the effects of missing data, and the corresponding need to accurately adjust profile reports.

(For additional reading on this topic, see "Individual/Household Demographics & Psychographics: Applications in Descriptive & Predictive Research," the Direct Marketing Association's 1997 Research Council Journal, www.wheatongroup.com.)