Enhance Lists With Overlay Data
However, all of this is contingent on properly interpreting the output reports from a demographic overlay. Most direct marketers assume this to be a straightforward process. They reason that anyone can understand the averages and frequency distributions that make up the variable-by-variable profiles. Therefore, they do not hesitate to draw sweeping conclusions that can have profound strategic and tactical ramifications.
The problem of missing data. It can be more difficult than one might think to interpret the profile reports that are generated by the demographic overlay process. Several analytical traps frequently snare the untrained.
We will focus here on missing data, which corresponds to that proportion of customers, inquirers and/or prospects for which overlay information does not exist. We will illustrate how it complicates the interpretation of profile reports and can cause significant distortions.
A dramatic example occurred when a well-known data compiler performed a demographic overlay on the subscriber files of several magazines owned by a publishing conglomerate. Results were organized into a formal presentation and delivered to the client's CEO. One finding was that active subscribers to one of the magazines had an average age of 44.
Effectively, that was the end of the presentation. Because of this assertion, the compiler immediately lost credibility with the CEO. Here is what had happened:
To assist in selling ad pages, the publisher had done extensive survey research on its subscribers. Many of the hundreds of data elements on the compiler's overlay file did not intersect with the publisher's research. But the age element did. As a result, the CEO knew the average age of subscribers to the title in question to the nearest tenth of a year, and it was just a hair over 30.
Individual and household overlay data generally cannot be applied to a significant portion of a given file. The magnitude generally ranges from 15 percent to 95 percent for specific data elements. In this case, about 80 percent of the magazine's active subscriber file could not be overlaid with age data, which is extreme for this element.
When calculating the average age, the compiler used only the portion of the subscriber file for which age had been successfully appended. By doing this, the compiler implicitly assumed that the individuals whose age was not known had the same age profile as those whose age was known.
Individuals who can be coded with a given data element are almost always demographically different from those who cannot. This is because representation on major overlay databases is skewed toward older, more stable individuals. The explanation lies with the two reasons for not being codeable.
One, there has been a change of address that is not reflected on the database. Technical reasons contribute to this effect, some of which are related to the National Change of Address process. Sometimes, it is just that an NCOA form has not been filled out.
Two, no data exists for the individual. The more an individual has a home, automobile, credit cards, children and the like, the more likely he or she is to be represented on a given overlay database. Those who cannot be matched to an overlay database tend to be young renters who move frequently. They also tend to be unmarried and less affluent.
Generally, the swing between reported and actual age grows in proportion to how far the percentage of records with missing data deviates from the norm.
Returning to our publishing example, the 20 percent of the file that was appended with age did, indeed, have an average age of 44. The other 80 percent, however, averaged 26.5, which puts the average for the whole file at about 30 (0.20 x 44 + 0.80 x 26.5 = 30.0). In other words, the vast majority of the magazine's active subscribers were in their 20s, and the overlay had missed this core target market. Imagine the problems that would have resulted had the publisher begun to focus on acquiring 44-year-olds.
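For readers who want to check the arithmetic, the short Python sketch below reproduces the figures from the publishing example. It is illustrative only: the function name and parameters are hypothetical, and it assumes the average age of the unmatched records is already known from another source, which an overlay by itself cannot supply.

```python
def blended_mean(matched_share, matched_mean, unmatched_mean):
    """Average for the whole file, weighting each segment by its share.

    matched_share  -- fraction of the file that received the overlay element
    matched_mean   -- average of the element among matched records
    unmatched_mean -- average among unmatched records (assumed known here)
    """
    if not 0.0 <= matched_share <= 1.0:
        raise ValueError("matched_share must be between 0 and 1")
    return matched_share * matched_mean + (1.0 - matched_share) * unmatched_mean


# Figures from the publishing example in this article.
reported_age = 44.0  # what the overlay report showed (matched records only)
actual_age = blended_mean(matched_share=0.20, matched_mean=44.0, unmatched_mean=26.5)
swing = reported_age - actual_age

print(f"Reported average age: {reported_age:.1f}")
print(f"Whole-file average age: {actual_age:.1f}")
print(f"Swing: {swing:.1f} years")
```

The whole-file figure comes out at about 30, which matches the publisher's own research and accounts for the 14-year swing. This is not one of the adjustment techniques used in practice, only a check on the example's numbers.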
Admittedly, a 14-year swing between an adjusted and unadjusted data element is the exception, not the rule. However, swings of four to six years are common.
Fortunately, techniques exist to adjust demographic and lifestyle profiles for the systematic bias inherent in missing data. Though space does not permit a discussion of these computationally intensive processes in this article, it is important for direct marketers to know that they exist.