The new wave in mining data

On May 12, 2009, the New York Times described an approach in which a person’s words reveal his or her thoughts. Marketers have been testing this process, known as text mining, for years. The notion of basing critical decisions on words and phrases, and on the relationships between them, is rousing interest among marketers and analysts faced with more words and documents than they know what to do with.

Text mining is a first cousin to the more established data mining. Both techniques attempt to discern patterns and trends in huge data repositories. The difference is that text mining extracts its patterns from natural language text – documents, for example – rather than from structured databases of transactions and other information.

Databases are designed for computer systems to manage automatically. In contrast, documents are written for people to read. It is no easy matter to devise computer systems that can “read” text. However, a discipline known as natural language processing has produced some early success stories as it grapples with “reading” and summarizing large volumes of text.

Text Mining: Two Functions

Analysts generally agree that the two primary functions of text mining are to make predictions and to classify data into segments. One use that has received increased attention is detecting insurance fraud. While data mining has proved to be a valuable tool for detecting fraudulent behavior, it has difficulties. Although the dollar loss from insurance fraud is substantial, fraudulent transactions make up an exceedingly small percentage of all transactions. With so few fraud cases to learn from, the data mining exercise becomes more problematic. As a result, many claims that data mining algorithms flag as suspicious are, in fact, authentic. Enter text mining. Analysts may find that they can learn much from the notes entered by customer service personnel.
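
A rough calculation shows why a low fraud rate leads to so many false alarms. The sketch below is illustrative only: the claim counts, the 90 percent hit rate and the 5 percent false alarm rate are assumptions chosen for the example, not figures from any insurer, and the function name flag_breakdown is hypothetical.

```python
def flag_breakdown(fraud_claims, legit_claims, hit_rate, false_alarm_rate):
    """Return (true flags, false flags, share of flags that are real fraud).

    hit_rate: fraction of fraudulent claims the model flags (assumed).
    false_alarm_rate: fraction of authentic claims the model flags (assumed).
    """
    true_flags = fraud_claims * hit_rate
    false_flags = legit_claims * false_alarm_rate
    share_real = true_flags / (true_flags + false_flags)
    return true_flags, false_flags, share_real


# Assumed scenario: 100 fraudulent claims among 100,000 in total.
true_flags, false_flags, share_real = flag_breakdown(
    fraud_claims=100, legit_claims=99_900, hit_rate=0.9, false_alarm_rate=0.05
)
print(f"Flagged and fraudulent: {true_flags:.0f}")
print(f"Flagged but authentic:  {false_flags:.0f}")
print(f"Share of flags that are real fraud: {share_real:.1%}")
```

Under these assumptions, fewer than 2 percent of the flagged claims are actually fraudulent, even though the model catches most of the fraud. Any extra signal, such as the wording of customer service notes, that lowers the false alarm rate can shrink the pile of authentic claims an investigator has to review.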

A growing number of areas are profiting from text mining. Market researchers can finally analyze responses to open-ended questions. Product managers may be interested in mining Web logs and social media sites to learn about consumer opinions and identify communities of interest. A computer manufacturer may want to study e-mail messages to determine how to respond to a customer and to whom to route each e-mail. A call center can better cross-sell a new product or service based on what the caller says. The point is that text mining adds another tool to the marketer’s toolbox.
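
The e-mail routing idea can be sketched in a few lines. This is a deliberately simple keyword-counting approach, not a description of how any particular manufacturer does it. The department names, the keyword lists and the function route_email are all hypothetical, and a production system would more likely use a trained classifier.

```python
import re

# Hypothetical routing table: department -> keywords (illustrative only).
ROUTES = {
    "billing": {"invoice", "refund", "charge", "payment"},
    "technical_support": {"crash", "error", "battery", "screen"},
    "sales": {"upgrade", "price", "quote", "order"},
}


def route_email(text, default="general"):
    """Send a message to the department whose keywords appear most often."""
    words = re.findall(r"[a-z']+", text.lower())
    best_dept, best_hits = default, 0
    for dept, keywords in ROUTES.items():
        hits = sum(1 for word in words if word in keywords)
        if hits > best_hits:  # ties keep the department listed first
            best_dept, best_hits = dept, hits
    return best_dept


print(route_email("My laptop screen went black after a crash"))
print(route_email("Please refund the duplicate payment on my account"))
```

Messages that match no keyword fall back to a general queue, where a person can read them and, over time, suggest new keywords for the table.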
