It’s too soon to describe analysis of text and other unstructured data as a mature technology. Too many people still discover clever new things on a regular basis. But unstructured data analysis is not so new that any system with related capabilities is automatically of interest.
This means the market has reached the stage where good technology by itself no longer ensures a company’s success. Companies also must find a business strategy that generates enough profitable revenue to fund continued product development and enough marketing to stand out in a crowded competitive environment.
Companies in this situation have two basic choices. One is to focus on the technology and try to be the world’s dominant supplier of something used by a lot of other people: think relational databases or search engines or PC operating systems. This is a high-stakes bet because in most technology markets only a few companies can sustain a major presence over time. The reason is “network effects” – the strong benefits that users gain from having the same systems as everyone else, in terms of compatibility and widespread technical knowledge.
The other choice is to develop applications that use the core technology to deliver a more or less complete solution to a specific customer problem. Customers have many different problems, so many different vendors can apply this strategy and coexist.
You might think that technical excellence would be a third viable strategy: be the best at a given function and charge premium prices to the few customers whose needs are so great that they will pay them. This works in many industries but rarely in software. It seems that the large research budgets of the dominant vendors let them quickly copy important innovations, and the pressure of network effects makes it ever more difficult for users to justify buying a non-standard product.
Predigy Platform (Intelligent Results, 425/455-5100, www.intelligentresults.com) is a product of this dynamic. Intelligent Results’ original product was an engine to build clusters of related texts by identifying similarities in their contents. It is deployed worldwide by the U.S. Army to monitor communications and by commercial users for purposes such as uncovering common issues in customer service messages. The software provides a range of services, not only identifying the relationships among documents but also illustrating them graphically, naming the clusters and letting users drill down to the individual documents to see the actual words that generated the links.
Such functions are useful but far from unique. Last summer, Intelligent Results extended its product line with IR Modeler, a predictive modeling tool that can incorporate text clustering results along with conventional structured data as inputs. Intelligent Results has shown that adding unstructured data, such as the contents of call center notes when predicting attrition, yields more accurate models than those built with structured data alone. Modeler is a complete, automated system that can deliver predictive models in minutes, avoiding most of the manual data analysis and preparation traditionally performed by skilled statisticians (though it would be foolish to run a system like this without expert supervision).
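One simple way to picture how a text cluster can sit beside structured data as a model input is to turn the cluster assignment into indicator columns. The Python sketch below is an illustration under that assumption only; the function name, the one-hot encoding and the sample values are hypothetical and are not taken from IR Modeler.

```python
def build_features(structured, cluster_id, n_clusters):
    """Append a one-hot encoding of a text cluster to structured inputs.

    structured  -- numeric fields such as tenure or number of calls (hypothetical)
    cluster_id  -- index of the cluster assigned to the customer's call notes
    n_clusters  -- total number of clusters produced by the text engine
    """
    if not 0 <= cluster_id < n_clusters:
        raise ValueError("cluster_id out of range")
    # One indicator column per cluster; exactly one of them is set to 1.0.
    one_hot = [1.0 if i == cluster_id else 0.0 for i in range(n_clusters)]
    return list(structured) + one_hot


# A customer with 24 months of tenure and 3 calls whose notes fell in cluster 1 of 3.
row = build_features([24.0, 3.0], cluster_id=1, n_clusters=3)
```

Any modeling technique that accepts numeric columns could then treat the cluster indicators like any other predictor.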
The step after building a model is deploying it. This is a traditional roadblock to gaining value from modeling systems because the scoring formulas and data preparation steps often must be recreated in other technologies to embed them within production systems. Intelligent Results has built IR Prediction Engine to address this need. Prediction Engine accepts model formulas and data rules from Modeler and converts them into a Java file that can be connected with customer contact systems. It can respond in real time to queries from those systems or append treatments to input files for batch execution.
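The deployment idea can be sketched as one packaged unit that holds both the data preparation rules and the scoring formula, so the same logic serves real-time queries and batch files. The sketch below is in Python rather than the Java that Prediction Engine generates, and everything in it is assumed for illustration: the logistic formula, the coefficients, the field names and the treatment labels.

```python
import csv
import io
import math

# Hypothetical exported model; the coefficients are invented for illustration.
MODEL = {"intercept": -2.0,
         "weights": {"late_payments": 0.8, "complaint_cluster": 1.2}}


def prepare(record):
    """Data preparation rules: coerce raw values to numbers, missing becomes 0."""
    return {name: float(record.get(name) or 0) for name in MODEL["weights"]}


def score(record):
    """Logistic score between 0 and 1 computed from the prepared inputs."""
    x = prepare(record)
    z = MODEL["intercept"] + sum(w * x[n] for n, w in MODEL["weights"].items())
    return 1.0 / (1.0 + math.exp(-z))


def treatment(record, cutoff=0.5):
    """Real-time mode: answer a single query from a contact system."""
    return "retention_offer" if score(record) >= cutoff else "no_action"


def score_batch(csv_text):
    """Batch mode: append a treatment column to every row of an input file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames + ["treatment"],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["treatment"] = treatment(row)
        writer.writerow(row)
    return out.getvalue()
```

Because both entry points call the same `prepare` and `score` functions, the formulas do not have to be recreated separately for each production system.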
What next? Intelligent Results could extend its solution to provide the customer contact systems themselves. But that would be a bad decision. Contact systems like call centers and billing engines are major production applications that firms rarely replace, let alone buy from small vendors whose core expertise lies elsewhere. And because Intelligent Results usually will need to integrate with customer contact systems from other vendors, it would be unwise to antagonize them by setting up as a competitor.
Intelligent Results chose instead to address the decision-making process. Model scores are one component of this process but are embedded in a larger set of rules that define a customer strategy. IR Strategy, due for release this month, gives Predigy Platform a way to manage the strategies themselves.
Strategies in IR Strategy are based on decision trees. These start with a customer group and split it into segments that will receive different treatments. The splitting rules can incorporate scores created by Modeler or imported from other systems, as well as any other types of data. Data analysis and visualization functions help users specify break-points in decision rules.
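A decision-tree strategy of this kind can be sketched in a few lines of Python. This is a generic illustration, not IR Strategy’s own representation; the class names, the field names (`risk_score`, `balance`), the break-points and the treatments are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    treatment: str  # what the segment at the end of this branch receives


@dataclass
class Split:
    test: Callable[[dict], bool]       # splitting rule applied to one customer
    if_true: Union["Split", Leaf]
    if_false: Union["Split", Leaf]


def assign(node, customer):
    """Walk the tree from the root until a leaf is reached; return its treatment."""
    while isinstance(node, Split):
        node = node.if_true if node.test(customer) else node.if_false
    return node.treatment


# Hypothetical collections strategy: split first on a model score, then on balance.
strategy = Split(
    test=lambda c: c["risk_score"] >= 0.7,
    if_true=Split(
        test=lambda c: c["balance"] > 1000,
        if_true=Leaf("call"),
        if_false=Leaf("letter"),
    ),
    if_false=Leaf("no_action"),
)

print(assign(strategy, {"risk_score": 0.9, "balance": 5000}))
```

The score used in the first split could come from any model, and the break-points (0.7 and 1000 here) are the values a user would tune with the help of data analysis.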
Users also can build champion-challenger strategies and assign target metrics, costs and expected values to each tree branch, giving the system an ability to simulate strategy results and to recommend decision rules based on statistical optimization. IR Strategy trees can include an unlimited number of decision layers, though a typical implementation manages a specific task such as assigning credit limits or handling debt collection.
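The arithmetic behind such a simulation can be sketched simply: give each branch a volume, a cost and an expected payoff, then total the results for the champion and the challenger. The formula and all figures below are assumptions made for illustration; the article does not describe how IR Strategy actually computes or optimizes these values.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    customers: int            # number of customers routed to this branch
    success_rate: float       # assumed chance the treatment meets its target
    value_per_success: float  # assumed payoff when it does
    cost_per_customer: float  # assumed cost of applying the treatment


def expected_value(branch):
    """Expected net value of one branch: volume times (payoff minus cost)."""
    per_customer = (branch.success_rate * branch.value_per_success
                    - branch.cost_per_customer)
    return branch.customers * per_customer


def simulate(branches):
    """Total expected net value of a whole strategy."""
    return sum(expected_value(b) for b in branches)


# Invented figures: the challenger calls more customers and mails fewer.
champion = [Branch("call", 1000, 0.20, 50.0, 4.0),
            Branch("letter", 4000, 0.05, 50.0, 1.0)]
challenger = [Branch("call", 2000, 0.15, 50.0, 4.0),
              Branch("letter", 3000, 0.05, 50.0, 1.0)]

results = {"champion": simulate(champion), "challenger": simulate(challenger)}
best = max(results, key=results.get)
```

In practice a challenger would be run against a small share of real customers to check whether the simulated advantage holds up.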
Integration of IR Strategy with operational systems runs through the IR Prediction Engine, which can encapsulate Strategy rules and data inputs in addition to scoring models. Results are presented by IR Reports, which builds a data mart to measure strategy outcomes, model performance and operational statistics such as response time. Though information generated by Predigy Platform components themselves is posted directly to the data mart, the IR Prediction Engine does not capture feedback from customer contact systems regarding actual outcomes of decisions it recommends.
Pricing of Predigy depends on details of the implementation but typically starts around $100,000 for a single server license. This includes all of the system modules and allows several users to coordinate their efforts. Intelligent Results released its first product in 2002 and has more than two dozen installations in total.