QD Technology's database aims for a quick response
There are many ways to organize data: flat files, XML tags, networks, hierarchies, cubes, columns, objects and others still more exotic. But by far the dominant database management systems today are relational databases like Oracle, DB2 and SQL Server. These products are designed primarily for transaction processing: that is, to add, change and remove individual records.
The features needed for transaction processing sometimes conflict with the features needed to analyze records in large groups. But relational databases can be used for analysis through a combination of feature extensions, clever database design and powerful hardware.
Although this approach adds cost, many companies prefer it to the alternative of making their technical environment more complicated by bringing in another database engine designed specifically for analytics.
Such analytical databases do exist. Marketers, in particular, have frequently chosen to use them because they wanted the speed, flexibility and low cost that they provide. The leading products in this group have changed over the years, but the dominant products for marketing applications are currently Alterian and SmartFocus.
Both organize data into columns (for example, all last names or all ZIP codes), so only the items needed in a particular query can be loaded to resolve it. This reduces the total amount of data to be retrieved from storage, which is usually the major determinant of query response time.
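To make the column idea concrete, here is a minimal Python sketch. The customer records, field names and helper functions are hypothetical and do not reflect how Alterian or SmartFocus store data internally. It only shows why a query that needs one field can skip the rest once the data is pivoted into columns.

```python
# Hypothetical customer records; the field names are illustrative only.
rows = [
    {"last_name": "Smith", "zip": "10001", "balance": 250.0},
    {"last_name": "Jones", "zip": "94105", "balance": 75.5},
    {"last_name": "Lee", "zip": "10001", "balance": 410.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per field."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def count_in_zip(cols, zip_code):
    """Count customers in a ZIP code by reading only the 'zip' column."""
    return sum(1 for z in cols["zip"] if z == zip_code)


columns = to_columns(rows)
matches = count_in_zip(columns, "10001")
```

In a row-oriented layout the same count would touch every field of every record; here the names and balances are never read.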
Both products also use compression and indexes to further reduce data volumes and increase speed. In addition, they provide specialized query languages that simplify tasks that are difficult in a conventional relational database. These languages are embedded in the systems' own query tools.
Quick Response Database (QRD), a product of QD Technology (www.qdtechnology.com), is another competitor in the analytical database category. Like other analytical systems, QRD discards the update management features needed for transaction processing. Users load data from existing sources through a batch process that compresses and indexes the inputs before storing them in the QRD format.
The system automatically analyzes the inputs and applies different compression and indexing methods based on what it finds. Once the data is loaded, it cannot be changed directly, although incremental files can be added with new and changed (but not deleted) records. These incremental files remain physically separate from the original but are automatically merged by the system during query processing.
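The merge-at-query-time behavior can be pictured with a small Python sketch. This is an illustration under stated assumptions, not QRD's actual file format or algorithm: records are assumed to carry a unique key, and each file is modeled as a plain dictionary. The names `merged_view` and `query` are hypothetical.

```python
# The original load, keyed by a hypothetical record ID.
base = {
    1: {"name": "Smith", "city": "Boston"},
    2: {"name": "Jones", "city": "Denver"},
}

# Incremental files hold new and changed records only; deletions are absent.
increments = [
    {2: {"name": "Jones", "city": "Austin"}},  # changed record
    {3: {"name": "Lee", "city": "Miami"}},     # new record
]


def merged_view(base_file, incremental_files):
    """Combine the files for one query; later files override earlier ones."""
    latest = dict(base_file)  # copy, so the stored base file is not modified
    for incremental in incremental_files:
        latest.update(incremental)
    return latest


def query(base_file, incremental_files, predicate):
    """Return the keys of records in the merged view that match the predicate."""
    view = merged_view(base_file, incremental_files)
    return sorted(key for key, record in view.items() if predicate(record))


in_austin = query(base, increments, lambda r: r["city"] == "Austin")
```

The stored files stay separate and unchanged; only the view built for the query reflects the update. A record deleted at the source would still appear in this view, because no incremental file can say it is gone.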
QRD's compression and indexing yield a file that takes an eighth to a tenth as much space as the original input. The actual amount of compression depends on the input: large blocks of text compress less than numbers or coded values.
In addition to the compression itself, the system gains speed by using indexes to resolve queries when possible, by storing data in large blocks to reduce retrieval times, and by decompressing only the records needed to display query results. QD Technology claims queries run 10 times faster than on a conventional relational database. The actual improvement depends on the details.
Unlike systems that convert the inputs into columns, QRD retains the original data structures of its inputs. The system accepts queries in SQL - the language used by nearly all relational database systems - through a standard open database connectivity (ODBC) connection.
Because it uses both standard SQL and the existing data structures, queries built to run against the original data source will typically run against QRD with little or no change. This is a major advantage for companies with extensive libraries of existing queries and with large investments in standard query tools such as Business Objects or Cognos.
QD Technology is selling QRD as a tool for desktop analysis, not a replacement for a primary marketing database. One application provides regional analysts with subsets of an enterprise marketing database, so they can run their own selections rather than waiting for the work to be done at headquarters.
Another example is providing fraud analysts with desktop copies of detailed transaction histories, so they can easily research large amounts of data.
Such applications require frequent updates so the users are working with fresh information. Database compression in QRD runs at 5 to 10 gigabytes per hour on a Windows server, placing significant limits on the amount of data that can be processed overnight or over a weekend.
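A rough calculation shows what those rates imply. The 5 and 10 gigabytes per hour figures come from the vendor's stated range; the window lengths (8 hours overnight, 48 hours for a weekend) are assumptions chosen for illustration, and real windows will vary by site.

```python
def window_capacity_gb(rate_gb_per_hour, window_hours):
    """Gigabytes of input that can be compressed within one batch window."""
    return rate_gb_per_hour * window_hours


RATES_GB_PER_HOUR = (5, 10)                  # low and high ends of the stated range
WINDOW_HOURS = {"overnight": 8, "weekend": 48}  # assumed window lengths

# For each window, a (low, high) pair of input volumes in gigabytes.
capacity = {
    name: tuple(window_capacity_gb(rate, hours) for rate in RATES_GB_PER_HOUR)
    for name, hours in WINDOW_HOURS.items()
}
```

Under these assumptions an overnight run handles roughly 40 to 80 gigabytes of input and a weekend run roughly 240 to 480 gigabytes.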
The system has been tested with 20 to 100 gigabytes of input data - fairly small amounts by today's standards - although these can be extracts from much larger databases. Because the incremental files do not include deleted records, a full rebuild is needed periodically to keep the information accurate.
In a typical configuration, compression runs on a central server and the compressed files are then distributed to analysts, who run them on their personal workstations. The system accepts relational database tables and delimited files as inputs. Relational databases must have both ODBC and JDBC connections available for the system to read the source data structures automatically.
Since QRD loads each source table independently, users define relationships among the tables when they set up individual queries. This allows the same flexibility as any standard SQL environment. Queries can create calculations and temporary data tables, but cannot write back to a database.
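Defining relationships at query time simply means writing the join in the SQL itself. The sketch below uses Python's built-in `sqlite3` module as a stand-in for an ODBC connection to QRD, purely so the example is runnable; the `customers` and `orders` tables and their columns are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the ODBC data source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 15.0);
""")

# The tables were loaded independently; the relationship between them
# exists only in the JOIN clause of this particular query.
sql = """
    SELECT c.region, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""
totals = conn.execute(sql).fetchall()
conn.close()
```

A different query could relate the same tables on other columns without any reload, which is the flexibility the paragraph above describes.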
The system stores the decompression rules within each QRD file it distributes. This allows query results to display the data in its original, uncompressed form. It also lets users recreate the original input tables without referring to any external documentation.
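The article does not say which compression methods QRD uses, so the following Python sketch substitutes simple dictionary encoding as a generic example. It shows only the principle of a self-describing file: the lookup table needed to decode the values travels with the encoded data. The function names and the packed layout are hypothetical.

```python
def encode_column(values):
    """Replace each value with a small integer code and keep the lookup table."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return {
        "dictionary": dictionary,                 # the decoding rules
        "codes": [code_of[v] for v in values],    # the compressed data
    }


def decode_column(packed):
    """Rebuild the original values using only what is inside the packed column."""
    dictionary = packed["dictionary"]
    return [dictionary[code] for code in packed["codes"]]


states = ["NY", "CA", "NY", "TX", "CA", "NY"]
packed = encode_column(states)
restored = decode_column(packed)
```

Because `decode_column` needs nothing beyond the packed structure, a recipient can recover the original column without any outside documentation.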
QRD runs on Windows XP or Server 2003 servers and desktops. The system includes several server components to manage compression and distribution of the QRD files. A smaller set of desktop components receives the QRD files and provides the ODBC connection to third-party query tools.
QRD has been under development since 2004 and has been tested at several large financial services companies. The first commercial release came last fall, and the product has been sold to about a half-dozen buyers. Pricing is based on an annual subscription and ranges from $100,000 to $250,000 based on the number of users. A short-term trial license is available for less.