Achieving Advanced Data Integration

Not long ago, enterprise software suites seemed poised to take over the world. The vision was irresistible: one product to buy and maintain; one place to store data; and, best of all, no need to integrate multiple systems. The cost savings alone were worth the price, though the real value would come from faster, more efficient operations.

Reality can be so annoying.

It wasn’t merely that enterprise software implementation turned out to be famously difficult, time-consuming and expensive. Nor that even the finest suite always had a few components that didn’t quite meet all business requirements. The problem was much simpler, and will be familiar to anyone who has tried to rid a city apartment of cockroaches. There are just so many of the darned things – in this case, existing computer applications – that it’s nearly impossible to find and remove them all. (Apologies to any insects insulted by the comparison.)

Sadder but wiser, technology managers have accepted that they will be dealing with multiple systems for a long time. Still, technophiles that they are, they continue to seek an automated alternative to the traditional approach of custom-building connections between each system.

One set of relevant products, enterprise application integration (EAI) software, has been available for some time from vendors such as Tibco and WebMethods. These products let transactions in one system trigger related transactions in other systems, transferring required data as needed. Prebuilt connectors make it relatively easy to integrate common software applications without custom programming.

But from a marketer’s view, such products are incomplete. They assume that relations between data in different systems are established precisely: for example, through shared keys such as part numbers. But though consistent identifiers are available for many kinds of business data, it’s a rare firm that has created a single set of customer IDs. Much more common is that customers are assigned IDs independently as they are added to each operational system.

Sharing information across such systems requires some way to determine which IDs refer to the same customer. Technology for this has long been available from vendors like Trillium, Acxiom, AbiliTec, Firstlogic, Group 1, Innovative Systems, Search Software America and D&B. Their products compare names and addresses from different sources either directly against each other or against a comprehensive reference database. Either way, the result is a cross-reference table that links multiple IDs to the same customer.
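To make the idea of a cross-reference table concrete, here is a minimal sketch in Python. It assumes a deliberately naive matching rule (exact match on a normalized name and address); commercial matching products use far more sophisticated fuzzy logic and reference databases. The system names, IDs, sample records and the `normalize` and `build_cross_reference` helpers are all hypothetical.

```python
import re
from collections import defaultdict

def normalize(text):
    """Lowercase, drop punctuation, and shorten a few common address words."""
    abbreviations = {"street": "st", "avenue": "ave", "road": "rd"}
    words = re.sub(r"[^a-z0-9 ]", "", text.lower()).split()
    return " ".join(abbreviations.get(word, word) for word in words)

def build_cross_reference(records):
    """Group (system, local_id) pairs that share a normalized name and address."""
    groups = defaultdict(list)
    for system, local_id, name, address in records:
        key = (normalize(name), normalize(address))
        groups[key].append((system, local_id))
    # Assign an arbitrary master ID to each matched group (an assumption,
    # not any vendor's numbering scheme).
    return {f"C{n:04d}": ids for n, ids in enumerate(groups.values(), start=1)}

# Made-up records: the same customer was given different IDs in two systems.
records = [
    ("billing", "B-17", "Jane Q. Smith", "12 Main Street"),
    ("web", "W-903", "jane q smith", "12 Main St."),
    ("callcenter", "K-5", "Robert Jones", "4 Elm Avenue"),
]
xref = build_cross_reference(records)
for master_id, local_ids in xref.items():
    print(master_id, local_ids)
```

The resulting table maps one master ID to every local ID believed to belong to the same customer, which is the link the rest of the integration depends on.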

But a cross-reference table is not enough. Companies also must access the detailed information buried within their operational systems. In cases such as call center or Web site interactions, this access must be near-instantaneous or an opportunity may be lost. In addition, the information must be the most current, accurate data. This is a somewhat contradictory requirement, as it implies both that information from each source system should be made available immediately and that new information should be evaluated carefully before it is used. Quality becomes an even more pressing issue when integration extends to changing files in one system based on information captured elsewhere.

Several products help firms meet these requirements. Vendors include Chordiant, DWL, eConvergent and Xoriant. These systems present a unified view of customer information stored in different company systems, essentially simulating the single customer record that would exist if a firm were running all operations on one integrated software suite.

These products contain three main layers: connections to company source systems, a cross-reference table for customer IDs, and presentation services to let other systems access the customer data. The connection and cross-reference layers use largely conventional technology. In fact, several vendors build their cross-reference tables with matching software from the specialized vendors already mentioned. What sets these systems apart is the presentation layer. This is designed to simplify access to the underlying customer data so that client systems need not deal with the true complexities.

Capabilities include consolidation of data from multiple systems, business rules to identify and resolve conflicts among sources, security to control which users and client systems can view which data elements, and formatting to present data in appropriate ways for different clients. Analytical functions may also generate metrics or track behaviors that require combining data from multiple source systems.
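Two of these capabilities, conflict resolution and field-level security, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the business rule here is "trust the most recently updated source," which is one plausible rule among many, and the system names, field names, dates and permission sets are invented for the example.

```python
from datetime import date

def consolidate(source_records):
    """Merge fields from several systems; on conflict, the newest source wins."""
    merged = {}  # field -> (value, date of the source that supplied it)
    for record in source_records:
        for field, value in record["fields"].items():
            current = merged.get(field)
            if current is None or record["updated"] > current[1]:
                merged[field] = (value, record["updated"])
    return {field: value for field, (value, _) in merged.items()}

def visible_fields(view, client, permissions):
    """Return only the fields a given client system is allowed to see."""
    allowed = permissions.get(client, set())
    return {field: value for field, value in view.items() if field in allowed}

# Hypothetical source records that disagree about the customer's address.
sources = [
    {"system": "billing", "updated": date(2003, 1, 10),
     "fields": {"address": "12 Main St", "phone": "555-0100"}},
    {"system": "web", "updated": date(2003, 6, 2),
     "fields": {"address": "98 Oak Ave", "email": "jane@example.com"}},
]
permissions = {"call_center": {"address", "phone"}, "web_site": {"email"}}

view = consolidate(sources)
call_center_view = visible_fields(view, "call_center", permissions)
print(call_center_view)
```

A real product would let administrators configure such rules per field rather than hard-coding a single policy, but the shape of the work is the same: merge, resolve, then filter by who is asking.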

The need for near-instant response means that these systems cannot rely solely on queries against the underlying source systems. At least some customer information must be pre-assembled in customer profiles that are already available when a request is made. This saves the time needed to run complex queries, consolidations and analytical processes before returning a response. The customer profile also may be a place to post operational transactions as soon as they occur, making changes visible to client systems even before the data is copied to source systems. This is important because some source files are updated periodically rather than in real time.
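The role of the pre-assembled profile can be sketched as a small in-memory store. This is an assumption-laden illustration, not a description of any vendor's design: the `ProfileStore` class, its method names and the sample data are hypothetical, and a production system would use a persistent, concurrent store rather than a Python dictionary.

```python
class ProfileStore:
    """Holds pre-assembled customer profiles and not-yet-propagated changes."""

    def __init__(self):
        self.profiles = {}  # master customer ID -> profile fields
        self.pending = []   # transactions not yet copied to source systems

    def load(self, customer_id, profile):
        """Store a profile assembled ahead of time, e.g. by a batch job."""
        self.profiles[customer_id] = dict(profile)

    def post_transaction(self, customer_id, field, value):
        """Make a change visible immediately and queue it for the sources."""
        self.profiles.setdefault(customer_id, {})[field] = value
        self.pending.append((customer_id, field, value))

    def get(self, customer_id):
        """Answer a client request without querying any source system."""
        return dict(self.profiles.get(customer_id, {}))

    def flush(self, write_to_source):
        """Copy queued changes to source systems, oldest first."""
        while self.pending:
            write_to_source(*self.pending.pop(0))

store = ProfileStore()
store.load("C0001", {"name": "Jane Q. Smith", "open_orders": 1})
store.post_transaction("C0001", "open_orders", 2)  # visible at once
print(store.get("C0001"))

written = []
store.flush(lambda *change: written.append(change))  # later, periodic update
print(written)
```

The point of the design is that reads are served from the profile, while writes to slower, periodically updated source files are deferred without hiding the change from client systems.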

The profile table also provides a place to store information that does not reside in any existing source system, such as metrics, model scores or hierarchies that link related customer records. It may similarly act as a repository for the "best" version of information that exists in conflicting forms in different source systems. Some of these conflicts reflect errors that could be corrected, such as an out-of-date address. But other differences will be legitimate variations that cannot be changed for legal or operational reasons.

Though profiles are important, they hold only a subset of the customer information contained in each source system. The presentation layer therefore also includes functions to help client systems access the underlying source data directly. Again, these functions simplify access by managing connections, consolidation, security and formats. Some integration products include an actual user interface for directly viewing such data, complete with data search and drill-down capabilities. Others merely present the data for the client system to display itself.
