Value-add of list compilers

I seem to have a propensity for gravitating toward lines of work that are difficult to explain to friends and family, let alone to my actual customers.

For many years, I was an industrial hygiene consultant and was concerned with the health and safety of factory workers. The most dreaded question at a cocktail party was “So what do you do for a living?” after which I would have to endure yet another joke about teeth cleaning.

In time, my expertise led to the compilation of OSHA violation data used in part for direct marketing of safety products. This, in turn, led to the compilation of what is now over 100 different federal, state and local public records databases containing 60 million records and a multitude of selects. Now, when asked what I do for a living, I simply say “I do database stuff” and leave it at that.

But sometimes, in spite of my best efforts at avoidance, someone presses me for details and invariably I am asked why, if the data is public record, don’t my customers simply get the data themselves?

As in any other business, the answer boils down to efficient use of time. Let me highlight some of the issues.

First, knowing where to get the information is a tremendous task in and of itself. Many government bureaucrats are loath to share data even though freedom of information laws mandate public disclosure.

I have sued the federal government on four separate occasions under the Freedom of Information Act, at considerable time and expense, to force the disclosure of data that was in the public domain. It has taken me years to identify, locate and obtain these records.

Second, government-held records are never in a consistent format. They come in every conceivable flavor, such as delimited or flat-file ASCII text, database tables, mainframe EBCDIC or Web-based HTML.
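To give a feel for what reconciling those formats involves, here is a minimal Python sketch, not my production code. It assumes the mainframe file uses the common EBCDIC code page 037 (Python's `cp037` codec), and the field names and column positions in the layout are invented for illustration; real layouts come from each agency's record description.

```python
import csv
import io

def decode_ebcdic(raw: bytes, codec: str = "cp037") -> str:
    """Convert mainframe EBCDIC bytes to text (code page 037 is an assumption)."""
    return raw.decode(codec)

def parse_fixed_width(line: str, layout):
    """Slice a flat-file record using (field, start, end) column positions."""
    return {name: line[start:end].strip() for name, start, end in layout}

def parse_delimited(text: str, fieldnames, delimiter: str = "|"):
    """Read delimited text into the same dictionary shape as the flat file."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(fieldnames, (v.strip() for v in row))) for row in rows if row]

# Hypothetical layout: 10 columns of company name followed by a 5-digit ZIP.
LAYOUT = [("name", 0, 10), ("zip", 10, 15)]

mainframe_bytes = "ACME CORP 12345".encode("cp037")  # stand-in for a tape record
flat_record = parse_fixed_width(decode_ebcdic(mainframe_bytes), LAYOUT)
delimited_records = parse_delimited("ACME CORP | 12345\n", ["name", "zip"])
```

Both paths end in the same record shape, which is the whole point: every source needs its own reader before any real processing can begin.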

Name fields are never consistent and may take the form {FIRST NAME, LAST NAME, MIDDLE INITIAL}, {LAST NAME, FIRST NAME, MIDDLE NAME} or any conceivable combination of these elements either concatenated, with or without commas and with or without spaces (or double spaces).

Prefixes and suffixes like Mr., Ms., Mrs., Dr., Jr., Sr., III and IV all present their own sets of problems. And I haven’t even begun to discuss the problems with street addresses, city names, ZIP codes and other data elements.
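The name problem alone can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the queries I actually run: the prefix and suffix lists contain only the examples mentioned above, a comma is assumed to mean either "LAST, FIRST" or a trailing suffix, and the function name `normalize_name` is made up for the example.

```python
PREFIXES = {"MR", "MS", "MRS", "DR"}
SUFFIXES = {"JR", "SR", "III", "IV"}

def _tokens(text: str):
    """Uppercase, drop periods, and collapse single or double spaces."""
    return [t.strip(".").upper() for t in text.split() if t.strip(".")]

def normalize_name(raw: str) -> dict:
    """Split a raw name into prefix, first, middle, last and suffix."""
    segments = [seg for seg in (_tokens(s) for s in raw.split(",")) if seg]
    suffix = ""
    kept = []
    for seg in segments:
        # A lone suffix after a comma ("Smith, Jr.") is not a name part.
        if kept and len(seg) == 1 and seg[0] in SUFFIXES:
            suffix = seg[0]
        else:
            kept.append(seg)
    if len(kept) >= 2:
        # "LAST, FIRST MIDDLE": the last name is everything before the comma.
        last_tokens = list(kept[0])
        given = [t for seg in kept[1:] for t in seg]
    else:
        last_tokens = []
        given = list(kept[0]) if kept else []
    prefix = ""
    if given and given[0] in PREFIXES:
        prefix = given.pop(0)
    for part in (last_tokens, given):
        if len(part) > 1 and part[-1] in SUFFIXES:
            suffix = part.pop()
    if not last_tokens and len(given) > 1:
        # "FIRST MIDDLE LAST": assume the final word is the last name.
        last_tokens = [given.pop()]
    return {
        "prefix": prefix,
        "first": given[0] if given else "",
        "middle": " ".join(given[1:]),
        "last": " ".join(last_tokens),
        "suffix": suffix,
    }
```

Even this handles only the easy cases. Without a comma, a two-word surname such as "Van Dyke" is split incorrectly, and a company name sitting in a person field defeats it entirely.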

This entire mess of raw data has to be processed with each update, sometimes quarterly, using customized and highly complex SQL, FoxPro or Access queries, which cumulatively cost a small fortune to develop. And naturally, the government frequently changes its record layouts, necessitating reprogramming.

Third, data is provided on a host of different electronic media such as CD-ROM, floppy disk, 3480/3490 tape cartridge, 9-track mainframe tape or 8mm data cartridge, each having its own specific hardware requirements. Some government data is so antiquated that I must use a legacy MS-DOS program, which had me searching the Internet for obsolete SCSI controller cards and motherboards on which to run it. Data storage, processing and backup require a highly sophisticated data center with racks of hardware.

Last but not least, making all the above hang together requires partnership with a reliable and honest list manager, which I have in the form of Mal Dunn Associates. Mal Dunn is responsible for sales, data enhancements (such as NCOA, deduping and appends), order fulfillment, customer relations, billing and treating me to a nice lunch during the Direct Marketing Association’s annual conference.

So the answer to the question is yes, customers could “simply” obtain the data themselves. But considering that time is a non-recoverable, non-renewable resource, eliminating the list compiler would be like growing your own food instead of “simply” buying it at the supermarket.
