True Modeling Needs to Start With Penetration Analysis
I would like to add to the discussion of CareerTrack's use of regression for modeling its list ("Seeing the Extremes in Regression Modeling," April 20). It is essential to note that the analysis was based on three types of data: one exceedingly important for a business-to-business list test, one of considerable significance, and one (demographics along the lines of consumer attributes) insignificant, because such demographics have little relevance to a business list.
Let me explain: Every record of the CareerTrack customer list used had a six-digit SIC and one of nine codes for number of employees. At the heart of the study was a penetration analysis (read this as share of the total market) by each SIC within each of the nine employee-size codes. This provided a substantial underpinning by establishing a penetration percentage for each SIC. Some SICs already showed penetration of 10 percent or more; that is, CareerTrack already had as customers one in 10 of all businesses in those SICs. Penetration ranged from that level down to virtually zero.
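The penetration calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure DBA Analytical actually ran: the function name, the record layout (one `(sic, employee_code)` pair per business) and the sample figures are all hypothetical.

```python
from collections import Counter

def penetration_by_cell(customers, universe):
    """Return customer share of market for each (SIC, employee-size code) cell.

    Both arguments are iterables of (sic, employee_code) pairs, one pair per
    business. This record layout is an assumption made for illustration.
    """
    customer_counts = Counter(customers)
    universe_counts = Counter(universe)
    return {
        cell: customer_counts.get(cell, 0) / total
        for cell, total in universe_counts.items()
    }

# Hypothetical data: 10 businesses in one cell, 20 in another.
universe = [("874201", 3)] * 10 + [("581208", 1)] * 20
customers = [("874201", 3)]  # one customer among the 10 businesses in its cell

penetration = penetration_by_cell(customers, universe)
```

In this made-up example the first cell shows 10 percent penetration and the second shows none, which is the kind of spread the study found across SICs.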
Penetration alone is a remarkable measure of potential. For Database America, such an analysis effectively stipulates that of 10 million businesses, no more than 2 million can be economically mailed for new prospects. In other words, if all were mailed, 80 percent of the mailing would be wasted.
The second contribution to the CareerTrack model came from a superb database providing transaction data by individual SIC. This permitted DBA Analytical to incorporate recency, frequency and dollars for all transactions over a given period of time.
Note that transactions by themselves need to be organized in some pattern that will enhance predictive value. For this purpose, the key is the SIC. Weighting each SIC by its transactions provides a means to adjust each penetration figure up or down. And, of course, the augmented SICs then can be arranged in descending order to pull samples from the prospect list, which inevitably will out-pull samples simply pulled by SIC.
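One way to picture the weighting and ranking step is the Python sketch below. The letter does not give the actual weighting formula, so the one used here (penetration multiplied by each SIC's average transaction dollars relative to the overall average) is purely an assumption, as are the function name and the sample figures.

```python
def rank_sics(penetration, avg_dollars):
    """Rank SICs by transaction-weighted penetration, highest first.

    penetration: dict of SIC -> share of market (0.0 to 1.0).
    avg_dollars: dict of SIC -> average transaction dollars per customer.
    The weighting formula is a hypothetical stand-in, not the study's own.
    """
    overall = sum(avg_dollars.values()) / len(avg_dollars)
    scores = {
        # SICs with no transaction data keep their raw penetration (weight 1).
        sic: share * (avg_dollars.get(sic, overall) / overall)
        for sic, share in penetration.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical figures for three SICs, labeled A, B and C.
penetration = {"A": 0.10, "B": 0.08, "C": 0.01}
avg_dollars = {"A": 100.0, "B": 300.0, "C": 200.0}

ranked = rank_sics(penetration, avg_dollars)
```

With these made-up numbers, SIC B moves ahead of SIC A despite its lower raw penetration, because its customers spend more. That is the sense in which transactions vary each penetration up or down before samples are pulled from the top of the list.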
The third type of input -- demographics -- has little weight. In fact, if only demographics had been analyzed, the improvement over straight SIC selection would be, for all practical purposes, negligible.
What this leads to is that the true modeling of a BTB customer file needs to start with a penetration analysis -- and, interestingly enough, the great majority of BTB operatives can stop there economically and efficiently. It is only for very large customer files, like CareerTrack's, that adding transactions will prove fruitful.
So, while this does not get into the argument over the validity of regression, the essence is clear -- it is share of market that should be discussed, not the statistical means used to present it meaningfully for its predictive power.
Promotional Consultants Inc.
(formerly with Database America)