
Fastest index and query processing technologies on the planet!  Here's why...

OTHER APPROACHES
Data warehouse, federated database and (to a lesser extent) enterprise search systems are all conventional approaches to the same challenge of dealing with data: where it resides; how to access it; how to cope with "dirty data" and typos; and how to handle different types, standards, formats, security, locations, owners and systems. Each conventional approach has its advantages and disadvantages. WhamTech combined technologies from these three approaches to provide dataless hybrid products that retain the advantages and overcome the disadvantages of each approach.

For structured database-type queries, query success is the same as, if not better than, that of data warehouses, and a lot better than that of conventional adapters in federated database systems.

For unstructured text search, query success is similar to that of search engines.

The reason for the high query success is that WhamTech products clean up and standardize data as it is read from data source logs, or by some other means, to build indexes; the data itself is then discarded. Queries submitted to these indexes are therefore highly successful. Further, WhamTech products enable advanced indexing, including text search, fuzzy, aggregation, calculation and compound indexes.

WhamTech products retain the advantages of data warehouse systems:

- Clean data
- Indexes: multiple types
- Query processing: multiple options
- Security: data and access

...while at the same time retaining the primary advantage of federated database systems:

- Data remains at source

EIQ Products derive their unique index and query processing technologies from previously marketed database and search products.

WhamTech and its predecessors developed a relational very large database (VLDB) technology, first called D the Data Language and later Thunderbolt, that was extremely fast compared to other similarly configured database systems.  This appealed to a niche, data-processing-intensive market; however, the real technology differentiator was the unique index and query processing technologies used in the current product, EIQ Server®, which indexes and processes queries against almost any data source.  EIQ stands for External Index and Query.

From an operational and structural point of view, the relational index and query management system (RIQMS) embedded in EIQ Server is a conventional relational database technology, with tables (virtual, in the case of EIQ Server), indexes, and typical database operation commands and queries.  It is NOT a memory-resident system; NOT a read-only system; NOT a fully inverted database system; and NOT a retrieval-based storage system.  There are, however, significant performance and capability differences that distance EIQ Server's RIQMS from other database and related technologies:

Unique Method of Query Execution

WhamTech's RIQMS has a unique method of isolating, connecting, arranging (sorting), processing (updating), and presenting (displaying) data.  This unique technology enables real-time data isolation and access, no matter the database size, number of concurrent users, or query complexity.

Unique Combination of Technologies

The speed and advanced capabilities of WhamTech's RIQMS arise from a unique combination of three binary-level methodologies, the three "Bs" of computing.  These are normally associated with static data warehousing rather than with the live data sources that WhamTech supports, and would be very difficult to improve on:

- Balanced binary trees that scale to billions of records, terabytes of data
- Virtual bitmap representations of intermediate and final query result-sets (known as Collections) - these can be integer lists or actual bitmaps, depending on field data node-level "data density"
- Boolean operations on Collections

Balanced binary trees are a technology from the 1960s, and the attraction then, as now, is that binary searches are considered to be the fastest method of searching ordered lists[1].  However, there are a number of practical problems associated with balanced binary trees, all of which WhamTech has solved (and that is the main secret to our success):

- PROBLEM: Levels tend to get very deep, whereby a binary tree consisting of a billion nodes, for example, needs 30 levels; this translates into time to traverse.

  WHAMTECH SOLUTION: WhamTech's implementation does not conform to the conventional n = log2(x+1) balanced binary tree rule, where n = number of levels and x = number of nodes.  Instead, WhamTech's implementation "leaps" levels to make binary tree traverse-time measurable in microseconds rather than milliseconds or seconds.

- PROBLEM: Rebalancing and rotation after an insert or delete can take considerable time, and a very large number of nodes can be affected.

  WHAMTECH SOLUTION: The maximum number of nodes that need to be rotated is just over 100; however, typically only 10 to 15 nodes are rotated after an insert or delete.

- PROBLEM: A worst-case scenario is deletion of a top node, which is faced by almost all tree structures.

  WHAMTECH SOLUTION: This is one of the easiest operations for WhamTech's RIQMS to deal with, as it fits the technology well.
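
The 30-level figure in the first problem follows from the conventional n = log2(x+1) rule.  A minimal Python sketch of that arithmetic is given here, assuming a conventional balanced binary tree and rounding up when the bottom level is only partly filled.  The function name `levels_needed` is made up for illustration; the sketch covers only the conventional rule and says nothing about how WhamTech's level-"leaping" implementation works.

```python
def levels_needed(nodes: int) -> int:
    """Levels in a conventional balanced binary tree holding `nodes` nodes.

    Equivalent to ceil(log2(nodes + 1)); int.bit_length() gives the same
    value exactly, without floating-point rounding.
    """
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes.bit_length()


if __name__ == "__main__":
    # Depth grows slowly, but every level is one more comparison per lookup.
    for x in (7, 1_000, 1_000_000, 1_000_000_000):
        print(f"{x:>13,} nodes -> {levels_needed(x)} levels")
```

For a billion nodes this gives 30 levels, matching the figure quoted above.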

With WhamTech's RIQMS, when a balanced binary tree is subjected to a query, the result is a set of pointers, held as an integer list, to data in the data source.  For subsequent query-related operations, this set is either used directly or is converted to a Collection; an example is shown in the following diagram:
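
The idea of an index query returning record pointers can also be sketched in code.  The Python example below is only an illustration under stated assumptions: a sorted list searched with the standard `bisect` module stands in for a balanced binary tree, and the class name `FieldIndex` and its methods are hypothetical, not part of EIQ Server.

```python
from bisect import bisect_left, bisect_right


class FieldIndex:
    """Single-field index: sorted (value, record_number) pairs.

    Binary search over the sorted keys stands in for walking a balanced
    binary tree; only record numbers (pointers) are returned, never data.
    """

    def __init__(self, values):
        # values[i] is the field value of record number i in the data source.
        self._entries = sorted((v, rec) for rec, v in enumerate(values))
        self._keys = [v for v, _ in self._entries]

    def equals(self, key):
        """Record numbers whose field value equals `key`."""
        return self.between(key, key)

    def between(self, low, high):
        """Record numbers whose field value lies in [low, high]."""
        lo = bisect_left(self._keys, low)
        hi = bisect_right(self._keys, high)
        return sorted(rec for _, rec in self._entries[lo:hi])


if __name__ == "__main__":
    cities = ["Dallas", "Austin", "Dallas", "Houston", "Austin", "Dallas"]
    index = FieldIndex(cities)
    print(index.equals("Dallas"))             # integer list of record pointers
    print(index.between("Austin", "Dallas"))
```

The integer list returned by each lookup plays the role of the pointer set described above; it can be used directly or converted into a bitmap for later operations.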

Complex queries are rendered simple by treating them as combinations of lightning-fast queries on multiple balanced binary tree indexes.  Once single-field Collections are isolated, they are combined using the full range of Boolean arithmetic operations to provide a complex query result set, as shown in the following example:

The bits in the final complex query result set represent the final result-set record numbers in the data source or multiple data sources.  Remember that Collections need not be bitmaps and are in most cases integer lists.  Boolean operations can be performed on actual bitmaps, on virtual bitmaps (integer lists), or on both in combination.
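
A minimal Python sketch of combining single-field Collections with Boolean operations follows.  Everything in it is an assumption made for illustration: Python integers serve as bitmaps, the field names and record numbers are invented, and the 1/32 density threshold (one bit per record versus a 32-bit integer per hit) is a hypothetical rule of thumb, not WhamTech's actual criterion.

```python
def to_bitmap(record_numbers):
    """Pack record numbers into a Python int used as a bitmap (bit i = record i)."""
    bitmap = 0
    for rec in record_numbers:
        bitmap |= 1 << rec
    return bitmap


def to_int_list(bitmap):
    """Unpack a non-negative bitmap into an ascending list of record numbers."""
    records, rec = [], 0
    while bitmap:
        if bitmap & 1:
            records.append(rec)
        bitmap >>= 1
        rec += 1
    return records


def choose_representation(record_numbers, total_records, threshold=1 / 32):
    """Pick a representation by data density (hypothetical threshold)."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    density = len(record_numbers) / total_records
    return "bitmap" if density >= threshold else "integer list"


# Single-field Collections, e.g. from three separate index lookups (invented data).
state_tx = to_bitmap([0, 1, 2, 4, 5])
age_over_40 = to_bitmap([1, 2, 3, 5])
vip = to_bitmap([2, 3])

# Complex query: state = TX AND age > 40 AND NOT vip.
# ANDing with the non-negative left-hand side keeps the result non-negative.
result = state_tx & age_over_40 & ~vip
result_records = to_int_list(result)

if __name__ == "__main__":
    print(result_records)
    print(choose_representation([1, 2], 1000))
```

Each set bit in `result` is a record number in the data source, so the final step of a query is simply fetching those records.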

WhamTech's RIQMS allows extremely fast queries and updates involving a minimal number of nodes, regardless of the size of the data source or the cardinality of the data.  WhamTech solved problems that have perplexed balanced binary tree researchers for decades and forced most companies towards non-binary index tree structures, such as B+ trees, or other forms of indexes in which each node has more than two branches.  These non-binary tree structures do not allow for a simple 0 or 1 decision, but impose more complex decision-making algorithms on query processing at every node encountered, which increases traverse-time.

The core indexing technology code has remained untouched for over 15 years, as the algorithms and code are stable and bug-free.  As an example, WhamTech's legacy database product, Thunderbolt, was used at ACS (www.acs-inc.com), an NYSE-listed IT services company with revenue of over $2 billion per year, for mission-critical 24/7 operations support for over 12 years, generating considerable revenue for ACS.

The IP associated with WhamTech's RIQMS is indirectly protected through awarded patents.

Real-time Indexes

WhamTech's real-time indexes achieve rates ranging from hundreds or thousands to tens of thousands of records inserted or updated per second on low-end servers; a high-end example achieved a query and insert rate of 80,000 records per second on a dual 933 MHz server.  Real-time indexes establish a new method for dealing with large-scale data and information issues, from active (or real-time) data warehousing to near real-time database performance thought only possible with memory-resident databases.  Many applications are tending towards real-time, e.g., interactive customer relationship management (iCRM), inventory management, supply chain management (SCM), and decision support systems (DSS).

The largest problem faced by database vendors, in general, is enabling simultaneous queries (simple and complex) and data changes (inserts, deletes, and updates) on data and indexes for VLDBs such that they remain synchronized.

Most database systems are designed for transactions, a large number of users, and simple queries; for such systems, updates are mainly insertions and sequential.  Most data warehouses are designed to be normally static, with reduced subsets of data, a small number of users, and complex queries; for such systems, updates are typically performed in regular batches.  Complex queries on most database systems can cripple performance, particularly on VLDBs, but not WhamTech's...

... WhamTech's unique RIQMS allows for an entirely new approach to near real-time data and information integration and sharing systems.

Complementary versus replacement technology

WhamTech's EIQ Server is intended to complement, not replace, existing operational database systems, which are usually transaction-oriented, to enable:

- Growth to accommodate a larger number of users
- Additional capabilities for existing systems
- Advanced capabilities not available with existing systems

EIQ Server is the ultimate non-intrusive system, as it externally indexes and processes queries against existing data sources, including most databases.  EIQ Server also allows unstructured text search on databases as well as on unstructured data sources such as files, documents and e-mail, and on semi-structured data sources such as spreadsheets and XML.  EIQ Server has a unique approach to querying semi-structured data sources using structured queries.  EIQ Server can also execute structured queries on unstructured data if entity extraction tools are used to build indexes.

A great combination - database and search technologies

Many professionals in the database industry are predicting the convergence of database and search technologies.  Some database vendors already bundle limited search capabilities with their products, and some search engine companies bundle limited database capabilities with theirs.  EIQ Server is a much more seamless integration of database and search technologies, because the search technology itself is based on the same RIQMS as the database query technology.

It is estimated that 85% of all enterprise data exists outside structured databases, and with company mergers and internal reorganizations, the remaining 15% could be in separate databases, with different structures and field names.  EIQ Server potentially offers one-stop access to all corporate data, regardless of where it resides, through structured queries, unstructured text searches, or BOTH on the same data sources.  EIQ Server is truly a great combination of database and search technologies.

Reference: 1. C. William Gear, Applications and Algorithms in Computer Science, Science Research Associates, Inc, 1978, p. A107.


Copyright © 1998 - 2008 WhamTech, Inc www.whamtech.com 972-380-4645 info@whamtech.com
U.S. Patents Pending