Intarka's ProspectMiner Does One Thing, But Remarkably Well

In recent years, the deep thinkers of marketing have focused almost exclusively on managing relationships with existing customers. Since those relationships are affected by just about everything a company does, the result has been a steady inflation of marketing systems’ size and scope.

Today’s grandest visions involve all-encompassing product suites that merge marketing and traditional operations into a seamless, if somewhat Orwellian, whole.

One byproduct of this shift toward customer management has been the atrophy of features associated with prospecting. These features haven’t vanished entirely, but they have not matured either, and in some cases they have even regressed. In fact, many of today’s marketing systems treat prospects as if they were just customers without a purchase history.

New products have been developed to fill this vacant niche. Since prospecting is one of the few areas outside the range of the ever-broader marketing suites, it is particularly attractive to small companies that cannot compete head-on with the behemoths.

ProspectMiner — made by Intarka Inc., 408/232-1000, www.intarka.com — is one of the most specialized competitors vying for a place in the world of prospecting systems. It does just one thing: build business-to-business prospect lists from the Web. And it seems to do it remarkably well.

Of course, there is no shortage of either business prospect lists or tools to search the Internet. What sets Intarka apart is its ability to generate highly targeted lists that contain relevant details extracted from live Web pages. As anyone who has ever attempted a manual Web search knows, this is quite a feat.

Intarka works its magic by splitting the process into three steps. The first is identifying appropriate companies: the system uses existing search engines to find potential matches, then applies text-based filters to eliminate inappropriate entries. The second step is extracting information from both structured and unstructured data sources and putting it into a standard format. The third is distributing and presenting the information to users.

The technology underlying these processes is impressive but well-hidden from the casual user. To build a list, the user specifies three things: conventional keywords; exclusion filters based on geography, business types and specific companies or terms; and up to three Web sites that match the desired profile.

The system then automatically analyzes these Web sites using proprietary methods that look at the frequency, prominence and context of different words and phrases. One output of this analysis is a list of additional keywords that users can add to the search list. This setup takes an experienced user from five to 15 minutes to finish.
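Intarka's analysis methods are proprietary, so the following is only a rough sketch of the general idea, not the vendor's algorithm. It assumes that "frequency" can be approximated by word counts and "prominence" by giving extra weight to words in a page title; the function names, the stopword list and the title_boost value are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on",
             "our", "that", "the", "this", "to", "we", "with", "your"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def suggest_keywords(pages, known_keywords, top_n=5, title_boost=3):
    """Suggest extra keywords from the user's example sites.

    pages is a list of (title, body) tuples. Each body word scores 1 and
    each title word scores title_boost, a crude stand-in for prominence.
    Words the user already entered are left out of the suggestions.
    """
    scores = Counter()
    for title, body in pages:
        for word in tokenize(title):
            scores[word] += title_boost
        for word in tokenize(body):
            scores[word] += 1
    known = {k.lower() for k in known_keywords}
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked if word not in known][:top_n]
```

Called with the text of two or three example sites and the keywords already entered, suggest_keywords returns the highest-scoring new words, which the user could then accept or discard.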

Once the initial settings are complete, the system uses the specified keywords to query 19 standard search engines such as Yahoo, AltaVista and Google. It builds a master list of all search engine hits, then eliminates duplicates, dead links, sites that are not businesses and sites that match the exclusion filters. It ranks the remaining sites based on their similarity to the original user-specified Web sites, again using the keywords and phrases it identified during the automated analysis. The system also notes any additional relevant keywords or phrases it finds in the new sites.
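The filter-and-rank stage described above might look something like the sketch below. It is an illustration under stated assumptions, not Intarka's method: similarity is approximated with cosine similarity over word counts, duplicates are detected by host name, and the dead-link check is represented by a precomputed "alive" flag rather than a real network request. The check for non-business sites is omitted, and all names are hypothetical.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to its lowercase host, without a leading 'www.'."""
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_hits(hits, seed_text, excluded_terms):
    """Drop duplicates, dead links and excluded sites, then rank the rest.

    hits is a list of dicts with 'url', 'text' and 'alive' keys (the
    'alive' flag stands in for a real dead-link check). Returns a list of
    (similarity, host) pairs, most similar to the seed text first.
    """
    seed = Counter(seed_text.lower().split())
    seen = set()
    ranked = []
    for hit in hits:
        host = normalize(hit["url"])
        words = hit["text"].lower().split()
        if host in seen or not hit["alive"]:
            continue
        if any(term in words for term in excluded_terms):
            continue
        seen.add(host)
        ranked.append((cosine(seed, Counter(words)), host))
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked
```

Here seed_text would hold the combined text of the user-specified example sites, so a hit that shares more of their vocabulary rises toward the top of the list.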

At this point, the user again steps in to review the rankings assigned to individual sites and to assess the additional keywords or phrases. The user can increase or decrease the ranking of a site and determine whether a particular word or phrase adds to or detracts from a site’s score. (A common word might detract from a score if it distinguishes sites that are easily confused. For example, a search for members of the American Marketing Association would probably also find members of the American Medical Association; penalizing the word “doctor” would help eliminate some of the latter.) The system can then rerun the search and ranking processes using the adjusted criteria.
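The effect of penalizing a word can be shown with a toy scoring scheme. This is a simplified sketch that assumes a site's score is just the sum of per-word weights; the article does not say how ProspectMiner actually computes scores, and the weights and sample texts here are invented for illustration.

```python
def score_site(text, weights):
    """Sum the weight of every word in the text; unlisted words count as zero."""
    return sum(weights.get(word, 0.0) for word in text.lower().split())


def adjust(weights, word, delta):
    """Return a copy of the weights with one word's weight shifted by delta."""
    updated = dict(weights)
    updated[word] = updated.get(word, 0.0) + delta
    return updated


# Hypothetical starting weights and page texts.
weights = {"american": 1.0, "association": 1.0, "members": 1.0}
marketing = "american marketing association members directory"
medical = "american medical association members doctor directory"

# Before the adjustment the two sites are indistinguishable.
before = (score_site(marketing, weights), score_site(medical, weights))

# The user marks "doctor" as a word that detracts from the score.
penalized = adjust(weights, "doctor", -2.0)
after = (score_site(marketing, penalized), score_site(medical, penalized))
```

With these made-up weights both sites start out tied, and the penalty on "doctor" drops only the medical site, which is the behavior the example in the text describes.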

Intarka reports that a search usually becomes acceptably accurate after one or two iterations, each of which involves ranking five to 10 sites. To speed this portion of the process, the system can run in a test mode that returns about 30 sites in a half hour.

Once the user is satisfied with the quality of the list being generated, a full search usually runs overnight. During this process, the system will also complete its second task: gathering specific information about the selected companies.

ProspectMiner draws from online sources including the site itself; corporate directories such as Hoover’s; news sources such as C-Net and Marketwatch; and SEC filings. It again uses proprietary text-analysis methods to identify and extract the company address and phone number, names of corporate officials, financial data, a company description and a list of recent news stories, complete with links to the stories themselves.
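To give a flavor of what pulling one field out of unstructured text involves, here is a minimal sketch that finds U.S.-style phone numbers with a regular expression and puts them into a single standard format. It is an assumption-laden simplification: the vendor's extraction methods are proprietary and certainly more elaborate, and the pattern and function name below are hypothetical.

```python
import re

# Matches forms such as 408/232-1000, (408) 232-1000 and 408.232.1000.
PHONE_PATTERN = re.compile(r"\(?\b(\d{3})\)?[\s./-]?(\d{3})[\s.-]?(\d{4})\b")


def extract_phones(text):
    """Return the distinct phone numbers in text as 'NNN-NNN-NNNN' strings.

    Numbers are returned in order of first appearance.
    """
    found = []
    for area, exchange, line in PHONE_PATTERN.findall(text):
        number = f"{area}-{exchange}-{line}"
        if number not in found:
            found.append(number)
    return found
```

Run against a page that lists the same number in two styles, such as "408/232-1000" and "(408) 232-1000", the function reports it once in the normalized form.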

Although results vary depending on what is actually available on the Web, the system can usually glean at least basic contact information and sometimes a remarkably complete dossier. Since the gathering process takes five to 10 minutes per site, it is limited to sites that rank above a user-specified cutoff.
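The cutoff matters because gathering time adds up quickly. The sketch below selects the sites that score above a cutoff and estimates the total gathering time from the five-to-10-minute figure. It assumes sites are processed one after another, which the article does not state, and the function and parameter names are hypothetical.

```python
def gather_plan(scored_sites, cutoff, min_per_site=5, max_per_site=10):
    """Pick sites scoring above the cutoff and estimate gathering time.

    scored_sites is a list of (name, score) pairs. Returns the selected
    names plus a (low, high) estimate in minutes, assuming the sites are
    processed sequentially at five to 10 minutes each.
    """
    selected = [name for name, score in scored_sites if score > cutoff]
    count = len(selected)
    return selected, (count * min_per_site, count * max_per_site)
```

Under that sequential assumption, 60 selected sites would take between 300 and 600 minutes, or five to 10 hours, which is consistent with a full search running overnight.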

Once the data are assembled, they can be reviewed online, e-mailed to appropriate individuals or exported to a file.

ProspectMiner was released in December 1998 and has been sold to about 40 companies. The system is currently offered as standalone software with a single-user license starting at $15,000 for six months or $25,000 for one year. It runs on a Windows 95 or later workstation with a reasonably fast Internet connection. The vendor plans to switch to a transaction-based model by the end of the year, allowing users to access the system over the Web and to pay based on the number of items found.