Introduction
WhamTech’s legacy database technology
uses physical row numbers internally, a.k.a. record numbers or ROWIDs, which act as pointers from indexes to data in a database;
these pointers are represented by integers in WhamTech’s indexes. When WhamTech’s indexes are used for other databases, these
internal ROWIDs are mapped to external ROWIDs or other means of
direct access to source data.
Each database vendor
has their own form of ROWIDs or unique primary keys that can be
used for identifying rows within tables; WhamTech makes use of
each database’s form of ROWIDs. ROWIDs or primary keys are
acknowledged to be the fastest route to data in a database.
Indexing
EIQ Server indexes
data as it receives it in batch or incremental load using ETL, or
near real-time using replication from database transaction logs,
change logs or triggers. One of EIQ Server’s underlying strengths is the capability to
index large volumes of data and make it immediately available to
queries in near real-time. Multiple types of indexes can be
generated and maintained, including
cleansed, fuzzy, SOUNDEX, metaphone, stemming, synonyms, and context
indexes. There is almost total freedom in applying index
algorithms.
Query Processing
EIQ Server processes
almost all queries 100% in the indexes, including table-joins
and range queries. Only when a final query result-set is isolated,
is the actual raw data in the database retrieved. This has many
benefits, including minimizing contact between EIQ Server and the
source database, resource usage, performance, and multi-user
support.
Metadata
Dictionaries
Any data and
information integration and sharing system needs to have an agreed
metadata dictionary, comprising standard field names and
attributes for an organization or cooperative venture; this is
critical to success. Fortunately, EIQ Server can cope with
multiple metadata dictionaries. Internally, EIQ Server
builds indexes using proprietary names and then maps them to a
common standard. Other non-standard metadata can be mapped
to the common standard, thus from many one-to-one mappings, this
enables many-to-many mappings between multiple metadata
dictionaries. See the table below.
|
DATABASE |
INDEX BUILD
(Usually at EIQ Server level) |
COMMON STANDARD METADATA MAPPING
(Query execution EIQ Server level) |
NON-STANDARD METADATA MAPPING
(Usually at query generation EIQ Server
level) |
|
Database Field |
Data Transform |
Index Field |
Query Transform |
Domain Value
Mapping |
Metadata Field |
Query Transform |
Domain Value
Mapping |
Metadata Field |
The capability to
manage multiple metadata dictionaries allows different
organizations, with their own metadata dictionaries, access to
same databases used by others. Metadata dictionaries can be
defined by organizations, industry workgroups, government
agencies, or groups of countries, e.g., the European Union (EU).
SuperSchemas®
WhamTech introduced a
concept called SuperSchemas, as a substitute for a conventional
query schema. A SuperSchema is either a flat table or a
relational set of tables containing standard metadata that is
desired as the final result-set from queries. As mentioned
in the Metadata Dictionaries section, metadata is ultimately
mapped to actual database fields. Applications, portals and users use SuperSchemas to
query multiple, disparate databases. Actual queries made to
individual databases are executed on an EIQ Server index, NOT
submitted to the source database, as is the case in a federated database
approach.
For SuperSchemas to
work, each database has a simple local mapping file created and
maintained by the local database administrator (DBA), which maps
standard metadata field names to the field names used by the local database and
defines the primary and foreign keys that are used to access one
table from another; these are bi-directional pairs. There may be
differences in attributes between database fields and standard
fields; however, much of these transforms can be taken care of in
the ETL or replication process, and the same rules applied when raw
source data is retrieved. Ideally, these transforms should take
place at the server level, but in some cases, such as dealing with
a database remotely,
mapping and transforms may have to be performed at a higher EIQ
Server level.
SuperSchemas allow
applications to work with data and information regardless of where
it is and the format it is in. Also, this approach really
simplifies application development, as there is no need to conform
to an underlying data schema at the adapter or database level.
EIQ Server Index Update Options
EIQ Server indexes
can be updated as follows:
-
Batch or
incremental
-
Near real time
using replication from transaction logs or database triggers
-
Real time using
two-phase commit or an application-aware API
-
Remotely through a
standard user-level access route and updating indexes at the
results-level
Compatible Databases
WhamTech is
concentrating on developing EIQ Components to common databases and
has initially identified NCR Teradata, Oracle, DB2, ADABAS, SQL Server,
and MySQL. Other databases and flat-file systems will be included
over time. |