
Glossary of terms used in the WhamTech Web sites

This list not only defines basic elements of database technology, but it also illustrates how WhamTech database and search technology is redefining them.

5GL Stands for fifth-generation language; a category of programming languages that use concise English-like statements to generate complex, detailed code. SQL is an example of 4GL - fourth generation language.

Active Data Warehousing is a relatively new term used to describe a new and rare breed of data warehousing that allows simultaneous real-time/continuous updating and querying.

aggregation  Performance of a DBMS operation on an entire set of data records at the same time.
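
As a rough illustration only, the sketch below uses Python's built-in sqlite3 module and a hypothetical sales table (the table and column names are assumptions, not part of any WhamTech product) to show aggregate functions reducing a whole set of records to one value per group:

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (state TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("TX", 10.0), ("TX", 30.0), ("CA", 20.0)],
)

# COUNT and SUM each operate on the entire set of rows in a group at once.
totals = conn.execute(
    "SELECT state, COUNT(*), SUM(amount) FROM sales "
    "GROUP BY state ORDER BY state"
).fetchall()
print(totals)
```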

API  Stands for Application Programming Interface; the set of functions and commands through which application programs make requests to an operating system, library, or other software.

balanced binary tree  A binary tree is a data structure composed of branches, beginning at a single point, or root. Each point of branching is called a node, or parent. In a binary tree, each node can have at most two children; a node with no children is called a leaf.
   
A binary tree index can be used for extremely fast searches. To construct such a tree, beginning with the root at the top and growing downward, at each branch the lower-valued record number (or other key value) goes on the left, the higher-valued on the right. New branches are added to the bottommost nodes.
    A binary tree index is said to be balanced when, even after repeated insertions and deletions of nodes, the tree remains symmetrical (the same number of nodes on the left as on the right). WhamTech considers a binary tree to be balanced when the number of nodes is the same on each side of the root or top node, or one side has one extra node. Two or more extra nodes is considered out of balance. The better balanced the tree, the faster and more efficient the search.
    Asymmetrical, or unbalanced, binary trees can be made symmetrical (i.e. more efficient for searching) by rearranging the nodes in a process called rotation, which changes parent-child links without changing the sorted order of the keys. However, the more processor cycles are used for performing rotation, the less computational capacity will be available for searching. That is, the more rotation required to keep a binary tree in balance, the lower the performance of the overall search scheme. The node rotation problem increases geometrically with the number of nodes, which in traditional database index systems is directly related to the size of the database. WhamTech has solved the problem that faces core database technology developers, in two ways, as follows:

1. Thunderbolt VLDB's balanced binary trees depend on the cardinality of the data rather than the size of the database.

As database sizes increase, high cardinality data tends to increase the number of nodes, e.g., Last Name.  Low cardinality data tends to remain unaffected, since cardinality is finite, e.g., Gender (2 possible values), State (52 possible values, including DC and "unknown").

2. WhamTech has a proprietary method of building and maintaining balanced binary tree indexes that minimizes the number of rotations required REGARDLESS OF THE SIZE OF THE TREE. Query, update, addition and deletion operations are not significantly affected.

WhamTech has the capability to simultaneously query, update (including additions and deletions), and maintain balance of a binary tree in real-time. As far as WhamTech is aware, no one else has solved this problem, making this technology uniquely able to manage massive volumes of data - a true real-time VLDB. See Benefits.
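
The general ideas in this entry (ordered insertion, the balance test at the root, and a single rotation) can be sketched as follows. This is a minimal textbook-style illustration, not WhamTech's proprietary method; the names Node, insert, count, is_balanced_at_root, rotate_left and search are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Plain binary-search-tree insert: lower keys go left, higher go right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def count(node):
    """Number of nodes in the subtree rooted at node."""
    return 0 if node is None else 1 + count(node.left) + count(node.right)


def is_balanced_at_root(root):
    """The glossary's test: the two sides of the root differ by at most one node."""
    if root is None:
        return True
    return abs(count(root.left) - count(root.right)) <= 1


def rotate_left(node):
    """Promote the right child; the old parent becomes its left child."""
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot


def search(root, key):
    """Walk down from the root, going left or right at each node."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None


# Inserting keys in ascending order produces a right-leaning chain.
root = None
for k in (1, 2, 3):
    root = insert(root, k)
unbalanced = is_balanced_at_root(root)  # zero nodes on the left, two on the right

root = rotate_left(root)                # key 2 becomes the new root
balanced = is_balanced_at_root(root)    # one node on each side
```

The rotation changes only parent-child links; the left-to-right order of the keys stays the same, so searches still work after rebalancing.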

bitmap Collection  A WhamTech VLDB technology result set consisting of record numbers in the database, represented as an array of binary ones and zeros. See Technologies.

cardinality  In database terminology, cardinality is the number of distinct values that occur in a field; in data modeling, it also refers to the number of occurrences of an entity (e.g. person, organization, or transaction) that can participate in a relationship. An extremely high cardinality field would be a Web site address, e-mail address or phone number; in fact, these should be unique. A high cardinality field would be street address - not unique, but close - there may be more than one "1234 Morningside Drive" in a particular State or in the US. A low cardinality field would be State - only 52 variations, including DC and "unknown". An extremely low cardinality field would be a check box - only two variations.
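
As a small sketch, assuming a handful of made-up records, the cardinality of a field can be measured by counting its distinct values. The field names and the cardinality helper are hypothetical.

```python
# Made-up records, for illustration only.
records = [
    {"email": "a@example.com", "state": "TX", "opted_in": True},
    {"email": "b@example.com", "state": "TX", "opted_in": False},
    {"email": "c@example.com", "state": "CA", "opted_in": True},
    {"email": "d@example.com", "state": "TX", "opted_in": True},
]


def cardinality(rows, field):
    """Number of distinct values that appear in one field."""
    return len({row[field] for row in rows})


cardinalities = {f: cardinality(records, f) for f in ("email", "state", "opted_in")}
print(cardinalities)
```

Here the e-mail field is unique per record (highest cardinality), while the check-box field can never exceed two values no matter how many records are added.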

Collection  Proprietary WhamTech term for a set of record numbers that result from a WhamTech VLDB technology query. Record numbers may be represented either in binary or in integer form. See Technologies.
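
The two representations of a result set can be illustrated with a short sketch. It assumes zero-based record numbers and plain Python lists; it does not describe WhamTech's internal format, and the function names are hypothetical.

```python
def to_bitmap(record_numbers, total_records):
    """Integer form -> bitmap form: position i holds 1 when record i is in the set."""
    bits = [0] * total_records
    for n in record_numbers:
        bits[n] = 1
    return bits


def to_integers(bitmap):
    """Bitmap form -> integer form: list the positions that hold a 1."""
    return [i for i, bit in enumerate(bitmap) if bit]


def bitmap_and(a, b):
    """Records present in both result sets (a logical AND of two queries)."""
    return [x & y for x, y in zip(a, b)]


# Two hypothetical query results over a database of 8 records.
first = to_bitmap([0, 2, 3, 7], 8)
second = to_bitmap([2, 5, 7], 8)

both = bitmap_and(first, second)
matching_records = to_integers(both)
print(both, matching_records)
```

A bitmap has a fixed size equal to the number of records, whereas an integer list grows with the number of matches, so which form is more compact depends on how many records a query returns.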

COM  Stands for Component Object Model; Microsoft's binary-interface standard for building and connecting software components.

crawler  See spider.

CRC  Stands for Cyclic Redundancy Check; a type of algorithm used to verify that no errors have occurred in copying or transferring blocks of data. WhamTech uses its own CRC algorithm to reduce storage and accelerate queries, and to provide extremely fast, basic 64-bit one-way encryption.
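
The error-checking use of a CRC can be sketched with the standard CRC-32 function from Python's zlib module. This is the common 32-bit CRC, not WhamTech's own algorithm, and the sample data is made up.

```python
import zlib

# Checksum computed by the sender before the block is copied or transferred.
original = b"WhamTech glossary block"
checksum = zlib.crc32(original)

# The receiver recomputes the CRC and compares it with the sender's value.
received_ok = b"WhamTech glossary block"
received_bad = b"WhamTech glossary blocc"  # last byte corrupted in transit

ok_matches = zlib.crc32(received_ok) == checksum
bad_matches = zlib.crc32(received_bad) == checksum
print(ok_matches, bad_matches)
```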

data mining  Category of DBMS applications that seek to find new information and relationships within multiple, often heterogeneous, legacy data stores; for example, searching and analyzing customer sales transaction detail to determine buying habits by ZIP code or other demographic criteria.

database  An organized collection of data together with the file management system that stores it; in this glossary, usually a relational one. See also relational.

DBMS  Database Management System. See also RDBMS. See Technologies page.

DNS  Stands for Domain Name System (a server that provides the service is called a domain name server); the Internet service that translates host names into specific numeric IP (Internet Protocol) addresses for purposes of routing access requests. Translation of a Web site's host name (e.g., www.whamtech.com) into a numeric IP address is called DNS resolution.

ETL  Acronym for database operations of Extract, Transform, and Load, representing processing overhead required to copy data from an external DBMS or file. Operations performed entirely within a given DBMS require no ETL and therefore are more efficient.

field  Labeled storage location for data values within a database. A group of different fields that all describe a single entity - such as a person, company, or transaction - constitutes a data record. See schema.

GIS  Stands for Geographic Information System; a system for capturing, storing, analyzing, and displaying geographically referenced data, used, for example, to support seismic exploration applications.

hypercube functionality  Ability to perform operations in four dimensions (4D); typically, x, y, z and t (time).

integer Collection A WhamTech VLDB technology result set consisting of record numbers in the database, represented as a list of integers, or whole numbers. See Technologies.

IP  Stands for Internet Protocol. See TCP/IP and DNS.

join  Operation within a RDBMS by which data from two tables are combined to form a third, virtual table upon which further operations can be performed; one category of SQL commands.
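
A minimal sketch of a join, using Python's built-in sqlite3 module. The customers and sales tables, their columns, and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO sales VALUES (10, 1, 'disk'), (11, 1, 'tape'), (12, 2, 'modem');
""")

# The join pairs each sale with the customer that shares its customer_id,
# producing a virtual table that is never stored in the database.
rows = conn.execute("""
    SELECT c.name, s.item
    FROM customers AS c
    JOIN sales AS s ON s.customer_id = c.customer_id
    ORDER BY s.sale_id
""").fetchall()
print(rows)
```

The shared customer_id column is what makes the two tables relational: one customer row links to many sales rows.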

load balancing  Can mean two things:

1. Distributing load across multiple servers.

2. Within a DBMS, the process of optimizing the ratio of queries (user-initiated requests for data) to operations needed to maintain and update the database. In the case of 2, WhamTech VLDB products require almost zero load balancing.

node  Within a tree data structure, the point at which a branching occurs to form multiple subordinate nodes, or leaves. The single node from which the branches of a tree grow is called its root. A node is also called a parent, and the nodes at the ends of its branches are its children. See also balanced binary tree.

ODBC  Stands for Open Database Connectivity, a multi-platform DBMS interface built to execute SQL.

OLAP  Acronym for Online Analytical Processing; interactive, multidimensional analysis of data held in a DBMS, closely related to data mining.

profiling  Real-time tailoring of displays, particularly Web pages, to an identified set of customer characteristics, such as probable preferences based on demographics.

RAD  Stands for Rapid Application Development; programming tools, including 5GLs, that greatly reduce the amount of work effort required to generate new program code.

RDBMS  Relational Database Management System. See also DBMS.

relational  Describing tables or files linked to one another through a similarity relationship, which could be one-to-one, one-to-many, many-to-one, or many-to-many. For example, Customer_ID could be used to describe a customer and link to a sales item that the same customer bought.

relevancy  In DBMS terminology, the degree to which search results meet the requirements or expectations implicit in the query.

rid list  Post-search process which rejects certain records (rid list) and/or includes others (result set) according to specific criteria.

rotation  Database tree index maintenance task requiring computational overhead, to keep tree indexes in balance; rebalancing. See also balanced binary tree.

schema  Within a RDBMS, the structure of tables and fields.

SDAT  Stands for Seismic Data Access Technology.

spider  Or robot, a program that traverses networks, such as the Web, and triggers document downloads to parse and index content for searching.

SQL  Structured Query Language is a set of commands used to conduct searches and perform operations on tables within relational databases. SQL exists in many implementations for many different DBMS platforms, including WhamTech's products.

table  Data structure composed of columns and rows. When tables are used to represent data records, typically, the first row of column headings contains field names and each row holds one data record. See schema.
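
As a brief illustration, assuming a hypothetical customers table, the sketch below reads the field names (the column headings) and the data records (the rows) of a table through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT, state TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'TX')")

cursor = conn.execute("SELECT * FROM customers")
# cursor.description holds one entry per column; item 0 of each is the field name.
field_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()  # each row is one data record
print(field_names, rows)
```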

TCO  Stands for Total Cost of Ownership. WhamTech products enable the lowest TCO of available alternatives for high-performance VLDB applications, up to 90% less than other RDBMSs. See Benefits.

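TCP/IP  Stands for Transmission Control Protocol / Internet Protocol; the suite of communication protocols on which the Internet is built. IP handles addressing and routing of data packets; TCP provides reliable, ordered delivery on top of IP.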

URL  Stands for Uniform Resource Locator; a standardized name addressing scheme and syntax for locating files and pages on the Internet. Example: http://www.whamtech.com/.
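
The parts of the example address can be pulled apart with Python's standard urllib.parse module, as a quick sketch of the syntax:

```python
from urllib.parse import urlparse

parts = urlparse("http://www.whamtech.com/")

scheme = parts.scheme   # the protocol used to fetch the resource
host = parts.hostname   # the host name, which DNS resolves to an IP address
path = parts.path       # the location of the file or page on that host
print(scheme, host, path)
```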

VLDB  Stands for Very Large Database, typically referring to databases in the hundreds of gigabytes to terabyte range and many hundreds of millions to billions of records. See Technologies.

XML  Stands for eXtensible Markup Language; a general-purpose markup language for structured data. Like Hypertext Markup Language (HTML), the coding scheme used to create Web pages, it derives from SGML, but it lets users define their own tags. XML-based formats are used for connecting pages to databases, as well as exchanging and presenting data between systems and across networks, among other uses.
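
A minimal sketch of exchanging data records as XML, using Python's standard xml.etree.ElementTree module. The customers document and its tag names are made up for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document carrying two data records.
document = """
<customers>
  <customer id="1"><name>Ada</name><state>TX</state></customer>
  <customer id="2"><name>Grace</name><state>CA</state></customer>
</customers>
""".strip()

root = ET.fromstring(document)

# Turn each <customer> element back into a record of field/value pairs.
records = [
    {"id": c.get("id"), "name": c.findtext("name"), "state": c.findtext("state")}
    for c in root.findall("customer")
]
print(records)
```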


Copyright © 1998 - 2008 WhamTech, Inc www.whamtech.com 972-380-4645 info@whamtech.com
U.S. Patents Pending