Enhance Lists With Overlay Data, Part 3
This article continues the discussion of several analytical traps that frequently snare the untrained when evaluating profile reports. Only by avoiding these traps can demographic overlay data be leveraged properly to improve your company's top and bottom lines. (The first two articles appeared in DM News on April 7 and Sept. 8, 2003 and are accessible at www.DMNews.com.)
Confidence levels and ratios are the foundation of any profile report. To illustrate, we will use an expanded version of the age of head of household chart that appeared in the Sept. 8 article. We will focus on these age ranges: 18-24, 40-44 and 85+.
Confidence level is the degree of certainty that a result did not occur because of chance variations in the corresponding samples. For example, we are 99 percent confident that there really is a higher penetration rate of 18- to 24-year-olds among coupe buyers than among sedan buyers. However, we are only 45 percent and 24 percent confident, respectively, that the penetration rates for the 40-44 and 85+ age ranges truly differ between coupe buyers and sedan buyers.
The confidence statistic is frequently misunderstood. It does not, for example, translate to 99 percent confidence that the penetration rate for ages 18-24 is exactly 7.3 percent among coupe buyers and exactly 3.6 percent among sedan buyers. Instead, it means we can be 99 percent certain that the penetration rate among coupe buyers (regardless of the specific amount) is higher than among sedan buyers (regardless of the specific amount).
Direct marketers often employ confidence levels of 90 percent or even 95 percent as the dividing line between "statistically significant" and "statistically insignificant" results. However, hazards are associated with this approach. Just about any DMer would dismiss a 67 percent confidence level as statistically insignificant. But this translates to 2-to-1 odds that a difference really exists. Often, odds such as this are worth further investigation.
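The odds arithmetic behind that 2-to-1 figure is simple enough to sketch. The Python snippet below is only an illustration; the function name confidence_to_odds is a hypothetical helper, not something taken from a profile report or statistical package.

```python
def confidence_to_odds(confidence):
    """Convert a confidence level (a fraction between 0 and 1) into odds
    that the observed difference is real rather than a chance result."""
    return confidence / (1 - confidence)


# Compare a "dismissed" confidence level with the usual cutoffs.
for level in (0.67, 0.90, 0.95):
    odds = confidence_to_odds(level)
    print(f"{level:.0%} confidence is roughly {odds:.1f}-to-1 odds")
```

Seen this way, the gap between a 67 percent result and a 90 percent result is a matter of degree, which is why a "maybe" verdict can be reasonable.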
Therefore, the confidence statistic should be considered an aid to decision making, not a rigid rule that offers no option but acceptance or rejection of an observed result. It is important to allow yourself the option of a "maybe" conclusion, where further sampling is used to reach a definitive finding.
The confidence statistic is sensitive to sample sizes. Generally, very small samples correspond to a low level of confidence. This is consistent with the expectation that the results of small samples often are merely chance occurrences. This sensitivity is a strength as well as a weakness in real-world decision making. If the sample size is extremely large, then the confidence level often is very high, even though the difference between the two percentages is too small to have any practical application.
Given extremely large sample sizes, almost any non-zero difference between two percentages will display a high enough level of confidence to be deemed statistically significant.
Consider, for example, the respective penetration rates of 13.6 percent and 13.8 percent for the 40-44 age group within coupe and sedan buyers. With the current sample sizes, our confidence that the two rates are different is only 45 percent. If we raised the sample sizes to 160,000, we would achieve a confidence of 95 percent. Nevertheless, there still would be no practical difference between 13.6 percent and 13.8 percent.
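The effect of sample size can be sketched with a standard two-proportion z-test. This is an assumption: the article does not say how its confidence figures are calculated, and a profile report may use a different convention. The sketch below uses a one-sided test with pooled variance, treats each sample size as a per-group count, and the function name confidence_first_lower is hypothetical.

```python
import math


def confidence_first_lower(p1, p2, n1, n2):
    """One-sided confidence that rate p1 is truly lower than rate p2,
    using a pooled two-proportion z-test (an assumed method)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / std_error
    # Standard normal cumulative distribution, via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Same 13.6 vs. 13.8 percent rates, with growing per-group sample sizes.
for n in (1_000, 10_000, 160_000):
    conf = confidence_first_lower(0.136, 0.138, n, n)
    print(f"n per group = {n:>7,}: confidence = {conf:.0%}")
```

Under these assumptions, 160,000 buyers per group yields a confidence of about 95 percent, while the rates themselves remain only 0.2 percentage points apart.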
A ratio is a measure of the magnitude of one value relative to a second, or "base," value. Typically, a ratio is obtained by dividing the first value by the second, multiplying the result by 100, then rounding to a whole number.
If the ratio equals 100, then the two values are identical. If it is greater than 100, then the first value is higher than the second. If it is less than 100, then the first value is lower. Therefore, for the 18-24 age range, the ratio of 205 means that the penetration rate among coupe buyers is 2.05 times (or 105 percent higher than) the rate among sedan buyers. (Note: If you do the math using the chart's penetration rates of 7.3 percent and 3.6 percent, you will arrive at a ratio of 203. The discrepancy is because of rounding.)
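The ratio calculation described above can be written out in a few lines of Python. This is a minimal sketch; the function name ratio_index is a hypothetical label, and the inputs are the rounded penetration rates quoted from the chart.

```python
def ratio_index(value, base):
    """Divide the first value by the base value, multiply by 100,
    and round to a whole number."""
    return round(value / base * 100)


# Rounded 18-24 penetration rates: coupe buyers vs. sedan buyers.
print(ratio_index(7.3, 3.6))  # 203, not 205, because the inputs are rounded
```

The unrounded percentages behind the chart are not given here, so the reported 205 cannot be reproduced exactly from the published figures.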
Unlike a confidence statistic, a ratio does not take into account chance differences, nor is it sensitive to the sample sizes upon which the underlying percentages are based. This is apparent within the 85+ age range, where the coupe-to-sedan ratio is 77. However, the sample size is so low that the confidence is only 24 percent.
Another limitation of the ratio is that it can be impressively small or large even though the two percentages being compared are inconsequentially small. For example, if the penetration rate among coupe buyers in the 85+ age range were 0.1 percent, then the ratio versus the 0.7 percent for sedan buyers would be an extremely low 14. But because the corresponding percentages for both coupe and sedan buyers would be less than 1 percent, the practical marketing applications would be inconsequential.
Finally, a ratio that compares two percentages is mathematically constrained because the percentages themselves have a ceiling of 100. Therefore, as the "baseline" percentage approaches this upper limit, the maximum possible value of the ratio gets reduced. For example, given the age 40-44 penetration of 13.8 percent within sedan buyers, the theoretical maximum ratio for coupe buyers is 725. However, if the sedan penetration were 80 percent, the maximum ratio would be just 125.
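The ceiling on the ratio follows directly from the 100 percent cap on the first percentage. The short Python sketch below, with the hypothetical function name max_ratio, reproduces the two figures cited above.

```python
def max_ratio(base_percent):
    """Largest possible ratio for a given base percentage, reached when
    the first percentage is at its ceiling of 100."""
    return round(100 / base_percent * 100)


# Sedan penetration of 13.8 percent vs. a hypothetical 80 percent.
for base in (13.8, 80.0):
    print(f"base = {base} percent: maximum ratio = {max_ratio(base)}")
```

A ratio of 125 against an 80 percent base is therefore as extreme as a ratio can get, even though it would look modest next to the 205 seen for ages 18-24.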
Technically, no direct relationship exists between the confidence statistic and the ratio. In other words, a ratio does not have a confidence statistic attached to it. Instead, the confidence statistic is nothing more than a number that indicates the likelihood that two percentages are different.
However, when the percentages and associated sample sizes are not extremely large or small, the confidence statistic and ratio tend to "line up" and tell the same story about comparative sizes of the two percentages.
Confidence statistics and ratios should be interpreted holistically, and with great care. When reviewing a profile report, it is important to focus first on the sample sizes on which the percentages are based. Also, think about the magnitudes of the percentages.
Finally, overlay your own judgment of the real-world importance of the percentages, and their associated universe counts. The combination of statistics with human judgment is the best recipe for improved business clarity and better decisions.
For more reading on this topic, see "Individual/Household Demographics & Psychographics: Applications in Descriptive & Predictive Research," The Direct Marketing Association's 1997 Research Council Journal, www.wheatongroup.com.