Solutions

WhamTech EIQ Products™ provide the virtual data access, integration, sharing and interoperability capabilities for a number of solutions.  In many cases, EIQ Products enable solutions that would be either too difficult or too expensive using conventional approaches.  Organizations are awash with data management challenges, and almost all of them stand to benefit from what EIQ Products offer.  Most solutions combine EIQ Product capabilities and can be broadly categorized as follows:

  - Federated Data Access, Information Sharing, Virtual Operational Data Stores and Cloud Data Services
  - Virtual Data Warehouses and Virtual Data Marts
  - Virtual or Hybrid Master Data Management
  - Virtual Access to Mainframe Data Files
  - Virtual Link Mapping and Virtual Link Analysis
  - Living Networks
  - eDiscovery

The above solutions are expanded on below:

Federated Data Access, Information Sharing, Virtual Operational Data Stores and Cloud Data Services

A basic EIQ Products configuration provides cleansed, transformed and standardized indexes, and high quality query processing, without additional loads on data source systems.  EIQ Products overcome one of the major disadvantages of conventional federated adapters: the lack of access to data when data source systems are unavailable.  EIQ Products can optionally retrieve and assemble query results data by inverting indexes.
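The idea in the paragraph above can be illustrated with a minimal Python sketch.  It is not WhamTech's implementation: the sample rows, the `standardize` rule and all function names are hypothetical, and plain dictionaries stand in for real index structures.  It shows standardized indexes answering a query, and rows being reassembled from the indexes alone, after the source has been made unavailable.

```python
from collections import defaultdict

# Hypothetical source rows; in practice these live in the data source system.
SOURCE_ROWS = {
    1: {"name": " alice SMITH ", "city": "dallas"},
    2: {"name": "Bob Jones", "city": "Dallas "},
    3: {"name": "Carol White", "city": "Austin"},
}

def standardize(value):
    """Cleanse a value before indexing: collapse whitespace, normalize case."""
    return " ".join(value.split()).title()

def build_indexes(rows):
    """Build one value -> set(row id) index per column from standardized values."""
    indexes = defaultdict(lambda: defaultdict(set))
    for row_id, row in rows.items():
        for column, value in row.items():
            indexes[column][standardize(value)].add(row_id)
    return indexes

def query(indexes, column, value):
    """Resolve an equality query against the index only."""
    return set(indexes[column].get(standardize(value), set()))

def assemble_row(indexes, row_id):
    """Rebuild a standardized row by inverting the indexes (row id -> values)."""
    return {
        column: value
        for column, postings in indexes.items()
        for value, row_ids in postings.items()
        if row_id in row_ids
    }

indexes = build_indexes(SOURCE_ROWS)
SOURCE_ROWS = None  # simulate the data source system being unavailable

matches = query(indexes, "city", "DALLAS")
results = [assemble_row(indexes, row_id) for row_id in sorted(matches)]
```

Note that the original rows are never modified; only the indexed values are cleansed, which mirrors the separation between source data and indexes described above.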

Typically, middleware or other applications manage the interaction with adapter products through schema mapping to reference data models.  EIQ Products are no different in that respect.  WhamTech also offers a sub-middleware/simple middleware product called EIQ Federated Server to manage the interaction with, and among, EIQ Products, including other EIQ Federated Servers. 

Physical operational data stores (ODSs) are typically copies of operational/transactional data in original data formats and schemas, stored separately from operational/transactional systems.  ODSs are typically used to archive data for regulatory purposes and as a source for data warehouses, which in turn are sources for data marts.  The two major advantages that EIQ Product-based virtual ODSs have over physical ODSs are:

  1. DATA QUALITY - EIQ Products can clean, transform and standardize data in indexes and result sets - the original data remains untouched, thus allowing regulatory compliance, but also achieving high quality queries and results.

  2. UNIVERSAL ACCESS SCHEMA - EIQ Products can present a reference data model to an application regardless of how the indexes (or data sources) are physically configured.  Indexes can be mapped to more than one reference data model.
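As a rough illustration of the universal access schema, the following Python sketch maps two reference data models onto the same physical indexes.  All source names, column names, model names and the `lookup` function are hypothetical and chosen only for the example.

```python
# Hypothetical physical indexes, keyed by (source, column).
PHYSICAL_INDEXES = {
    ("crm", "cust_nm"): {"Alice Smith": {101}, "Bob Jones": {102}},
    ("billing", "customer_name"): {"Alice Smith": {9001}},
}

# Two hypothetical reference data models mapped onto the same indexes.
REFERENCE_MODELS = {
    "sales": {"Customer.Name": [("crm", "cust_nm"), ("billing", "customer_name")]},
    "support": {"Client.FullName": [("crm", "cust_nm")]},
}

def lookup(model, field, value):
    """Resolve a reference-model field to its physical indexes and query each."""
    hits = []
    for source, column in REFERENCE_MODELS[model][field]:
        row_ids = PHYSICAL_INDEXES[(source, column)].get(value, set())
        for row_id in sorted(row_ids):
            hits.append((source, row_id))
    return hits

sales_hits = lookup("sales", "Customer.Name", "Alice Smith")
support_hits = lookup("support", "Client.FullName", "Alice Smith")
```

The application only names a model and a field; which sources and columns are consulted is decided by the mapping, so the physical configuration can change without changing the application's queries.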

Cloud data services make data available regardless of its location.  EIQ Products have a universal access schema that removes the need to directly specify individual data sources or their location.  Data access occurs in the background or in the cloud.  Various rules reflecting preferences can be imposed on results data, including de-duplication, ranking, and last-in-first-out (LIFO) or first-in-first-out (FIFO) handling when data sources receive the same queries from a different EIQ Product.
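One possible reading of these result rules is sketched below in Python.  The function `merge_results`, its `policy` parameter and the sample records are assumptions made for illustration; here "rank" is interpreted as a preference order among sources, which may differ from how EIQ Products actually rank results.

```python
def merge_results(batches, key, source_rank, policy="rank"):
    """Merge result batches, keeping one record per key.

    batches: list of (source_name, records) in arrival order.
    policy:  "rank" keeps the record from the most preferred source,
             "fifo" keeps the first record to arrive,
             "lifo" keeps the last record to arrive.
    """
    if policy not in ("rank", "fifo", "lifo"):
        raise ValueError(f"unknown policy: {policy}")
    kept = {}
    for source, records in batches:
        for record in records:
            candidate = dict(record, source=source)
            current = kept.get(record[key])
            if current is None:
                kept[record[key]] = candidate
            elif policy == "lifo":
                kept[record[key]] = candidate
            elif policy == "rank" and source_rank[source] < source_rank[current["source"]]:
                kept[record[key]] = candidate
            # "fifo": the record that arrived first stays in place
    return list(kept.values())

batches = [
    ("archive", [{"id": 1, "email": "a@old.example"}]),
    ("crm", [{"id": 1, "email": "a@new.example"}, {"id": 2, "email": "b@new.example"}]),
]
source_rank = {"crm": 0, "archive": 1}  # lower number = more preferred

by_rank = merge_results(batches, "id", source_rank, policy="rank")
by_fifo = merge_results(batches, "id", source_rank, policy="fifo")
```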

Virtual Data Warehouses and Virtual Data Marts

Physical data warehouses (DWs) and data marts (DMs) have two additional major advantages over ODSs in general, whether virtual or physical (numbering continues from the list above):

  3. DERIVED DATA - Pre-aggregated, pre-calculated, fuzzy, text and other indexes in addition to indexes that mirror data sources.

  4. ARCHIVED DATA - Historic data can be retained along with current data.

EIQ Products also have virtual (index) versions of the above two capabilities (3 and 4), enabling virtual DW and DM solutions that directly compare with physical DWs and DMs.  WhamTech virtual DW and DM solutions enable applications such as business intelligence, data mining and logistics that were thought to be the sole domain of physical DW and DM solutions.
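To make the notion of a pre-aggregated index concrete, here is a small Python sketch.  The class `PreAggregatedIndex` and the sample regions and amounts are hypothetical; the point is only that totals are maintained as rows are indexed, so an aggregate query does not need to rescan the data source.

```python
from collections import defaultdict

class PreAggregatedIndex:
    """Keeps a running count and sum per group as rows are indexed."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def add(self, group, amount):
        """Update the aggregates for one newly indexed row."""
        self.count[group] += 1
        self.total[group] += amount

    def average(self, group):
        """Answer an AVG-style query from the stored aggregates only."""
        if self.count.get(group, 0) == 0:
            raise KeyError(group)
        return self.total[group] / self.count[group]

index = PreAggregatedIndex()
for region, amount in [("West", 100.0), ("East", 30.0), ("West", 50.0)]:
    index.add(region, amount)

west_total = index.total["West"]
west_average = index.average("West")
```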

For an interesting and controversial discussion on virtual data warehouses, read an article published by Information Management (also linked in the Articles section).

Virtual or Hybrid Master Data Management

Conventional federated adapters for master data management (MDM) are difficult to implement either partially or in full because of three main problems that centralized databases/data warehouses solve:

  1. Data in data sources is unclean and in many cases, unusable

  2. Data in data sources may not have indexes available for querying, or data sources cannot accommodate improved indexes such as fuzzy matching, address clean-up, name variation, and pre-aggregated and pre-calculated fields

  3. Query performance, where data sources are not capable of executing more advanced queries, queries are slow, or queries are blocked due to load considerations and operational/transactional activity on data sources

However, copying large amounts of data into centralized databases/data warehouses can be expensive and time-consuming to establish and maintain, and raises data-related responsibility, accountability, security, privacy and legal issues.  Perhaps the most difficult problem, however, is keeping everything up to date.

WhamTech has worked with several MDM vendors to design a hybrid solution whereby more frequently accessed master data is copied to a centralized database or MDM hub, and less frequently accessed data stays in data sources.  EIQ Products cleanse, transform and standardize data used to build indexes, and provide pointers as well as cross-references to data based on global IDs.  Identification of the same customer, vendor or product, for example, in multiple data sources can be performed when indexes are built or once master data is in the MDM hub.  EIQ Products fuzzy matching and link mapping can help in this process by finding similar data across multiple heterogeneous systems.  Finally, a decision has to be made whether the data source systems are responsible for updating the master data or the master data updates data sources; this is called harmonization.
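The cross-referencing step can be sketched in Python as follows.  This is an illustration only: it uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever fuzzy matching EIQ Products apply, and the greedy clustering, the 0.85 threshold, the record layout and the function names are all assumptions.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Standardize a name before comparison: lower-case, strip punctuation."""
    return " ".join(name.lower().replace(".", "").replace(",", "").split())

def similarity(a, b):
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def assign_global_ids(records, threshold=0.85):
    """Map (source, local id) to a global ID.

    Greedy sketch: a record joins the first existing cluster whose
    representative name is similar enough, otherwise it starts a new one.
    """
    representatives = []  # list of (global_id, representative name)
    xref = {}
    for source, local_id, name in records:
        for global_id, rep_name in representatives:
            if similarity(name, rep_name) >= threshold:
                xref[(source, local_id)] = global_id
                break
        else:
            global_id = len(representatives) + 1
            representatives.append((global_id, name))
            xref[(source, local_id)] = global_id
    return xref

records = [
    ("crm", 17, "Acme Corp."),
    ("billing", "A-9", "ACME Corp"),
    ("erp", 5, "Globex Inc"),
]
xref = assign_global_ids(records)
```

The resulting cross-reference table is what lets a hub, or a federated query, treat records from different sources as the same customer, vendor or product.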

More information on WhamTech Virtual or Hybrid Master Data Management solutions will be available soon in a separate document.

Virtual Access to Mainframe Data Files

As a variation of the virtual ODS, WhamTech developed a solution to virtually access archived mainframe data files.  An example of the solution approach is a recently completed project for a large company to enable standard driver and SQL access to their archived IBM mainframe, COBOL-generated VMS data files:

WhamTech used the original archived very large data files as the data source and developed a VMS reader, a parser to build both hierarchical and relational indexes, and a light version of an ODBC driver.  SQL is submitted to an EIQ SuperAdapter that resolves queries and in turn retrieves results data from the VMS data files.  The EIQ SuperAdapter uses internal file pointer data to access specific sections of the very large data files, which avoids having to read the entire file to retrieve results data.  In this case, the customer company is in the process of moving all files from, and shutting down, the mainframe, with the goal of saving significant maintenance fees.
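The file pointer technique can be illustrated with a short Python sketch.  It does not read real mainframe formats: the newline-delimited records, the four-character key and the function names are hypothetical, and a small temporary file stands in for a very large archive.  The file is scanned once to record where each record starts, and later lookups seek directly to that position.

```python
import os
import tempfile

def build_offset_index(path, key_length):
    """Scan the file once, recording (byte offset, length) for each record key."""
    index = {}
    with open(path, "rb") as f:
        offset = 0
        for line in f:
            key = line[:key_length].decode("ascii").strip()
            index[key] = (offset, len(line))
            offset += len(line)
    return index

def read_record(path, index, key):
    """Fetch one record by seeking to its stored offset, not by rescanning."""
    offset, length = index[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length).decode("ascii").rstrip("\n")

# Hypothetical sample data written to a temporary file for the demonstration.
with tempfile.NamedTemporaryFile("wb", delete=False, suffix=".dat") as tmp:
    tmp.write(b"0001ALICE     DALLAS\n0002BOB       AUSTIN\n0003CAROL     HOUSTON\n")
    data_path = tmp.name

offsets = build_offset_index(data_path, key_length=4)
record = read_record(data_path, offsets, "0002")
os.remove(data_path)
```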

The solution could be extended to access live data files on mainframes similar to EIQ Products accessing other data sources.


Virtual Link Mapping and Virtual Link Analysis

Most link analysis applications have the following limitations:

  - Scalability: Data is moved in entirety into a single database
  - Federated access to data sources using conventional adapters with associated poor results due to data quality
  - Difficult to combine structured and unstructured data
  - No near-real-time updates
  - No fuzzy matching, probabilities or favorability/threat scores

WhamTech developed an option that works in conjunction with its federated data access solution based on EIQ SuperAdapters to overcome the above-mentioned limitations.  This option takes advantage of content indexes and allows the capture of links in near-real-time among the same or similar entities in specialized indexes, called Link Indexes™.  Similar to content indexes, Link Indexes are created and maintained at the data source level, but not on data source systems, and can scale through distributed parallel processing across multiple disparate data sources.  The combination of content indexes and Link Indexes enables virtual views of entity data and links between entity data.  User-driven, interactive visualization displays only entity data and links of interest, as needed.  Link Indexes capture the following five types of links for structured data sources:

  - Internal data source, multiple table joins using primary key (PK) - foreign key (FK) relationships
  - Internal data source, single table self-joins using PK-FK relationships
  - Internal data source, single table self-joins using same or similar data
  - Internal data source, multiple table joins using same or similar data
  - External data source, multiple table joins using same or similar data

...and two types for unstructured data in either structured or unstructured sources:

  - Structured data captured through entity extraction using same or similar data
  - Unstructured text search using same or similar data

For all the above types of links, EIQ Products can be selective about the entities used to establish links, as it may not prove of value to link all entities (types and data).  Also, EIQ Products can apply fuzzy matching using a product from a third-party vendor for structured data and text matching algorithms for unstructured data.  Probabilities of match can be calculated and stored with links, or calculated on-the-fly as links are analyzed.  Plus, given threat/favorability scores for specific entities combined with probabilities, threat/favorability link analysis networks can be displayed. 

WhamTech recognized through its work on Web search engines that all networks and relationships can be represented through a combination of links (one-to-one).  WhamTech uses its binary tree indexes to capture and maintain the links as link maps (one-to-many) in Link Indexes, and Boolean operations on bitmap representations to combine link maps for link analysis (many-to-many).
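The bitmap combination step can be sketched in Python, using integers as bitmaps.  The entity names and link maps are hypothetical, and a dictionary stands in for the binary tree indexes mentioned above; only the Boolean combination of one-to-many link maps is illustrated.

```python
# Hypothetical entity universe; each entity gets one bit position.
ENTITIES = ["alice", "bob", "carol", "dave", "erin"]
POSITION = {name: i for i, name in enumerate(ENTITIES)}

def to_bitmap(names):
    """Encode a set of linked entities as an integer bitmap."""
    bitmap = 0
    for name in names:
        bitmap |= 1 << POSITION[name]
    return bitmap

def from_bitmap(bitmap):
    """Decode a bitmap back into entity names, in universe order."""
    return [name for name in ENTITIES if (bitmap >> POSITION[name]) & 1]

# One-to-many link maps: entity -> bitmap of the entities it links to.
LINK_MAPS = {
    "alice": to_bitmap(["bob", "carol"]),
    "dave": to_bitmap(["carol", "erin"]),
}

# Boolean operations combine link maps for many-to-many analysis.
shared = from_bitmap(LINK_MAPS["alice"] & LINK_MAPS["dave"])
either = from_bitmap(LINK_MAPS["alice"] | LINK_MAPS["dave"])
only_alice = from_bitmap(LINK_MAPS["alice"] & ~LINK_MAPS["dave"])
```

Because each combination is a single bitwise operation over the whole map, the cost does not depend on how the underlying joins would have been executed at the data sources.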

In their basic form, Link Indexes are join indexes, which are pre-formed joins - both internal and external to data sources.  Link Indexes can significantly cut down on the computing time and resources needed to execute queries involving joins.

More complex queries, like nested selects, can take advantage of Link Indexes to execute n degrees of separation queries, again, without much computing time and resources needed.
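An n degrees of separation query over pre-formed links might look like the following Python sketch.  The `link_index` dictionary and the function name are hypothetical; the traversal is an ordinary breadth-first search that stops expanding once it reaches depth n.

```python
from collections import deque

def within_degrees(link_index, start, n):
    """Return {entity: degrees} for every entity within n links of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        entity = queue.popleft()
        if distances[entity] == n:
            continue  # do not expand beyond n degrees
        for neighbor in link_index.get(entity, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[entity] + 1
                queue.append(neighbor)
    del distances[start]
    return distances

# Hypothetical pre-formed links: entity -> directly linked entities.
link_index = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
}

two_degrees = within_degrees(link_index, "alice", 2)
```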

The more obvious use of Link Indexes is for link analysis applications, where one-to-many, many-to-one and many-to-many relationships between predefined entities can be visually represented and subjected to social network analysis.

EIQ Products also provide federated data access in conjunction with Link Indexes, so an analyst can interact with data sources through link analysis visualization without being aware of it.  EIQ Products can work with both commercial and open source link visualization tools.  EIQ Products keep a log of all interactions within link analysis for subsequent use in legal proceedings or for probable cause.  Analysts can retain link analysis networks and resume/retrieve them in subsequent sessions.

More information on WhamTech Virtual Link Mapping and Virtual Link Analysis solutions will be available soon in a separate document.

Living Networks

Once an initial link analysis is performed, analysts typically save the file for subsequent retrieval; when retrieved, however, it contains dated information.  The options usually involve completely refreshing the link analysis or, with more advanced systems, incrementally updating it with data from queries to a database or federated queries to multiple data sources.

Living Networks are an extension of WhamTech's virtual link mapping and link analysis that allows an analyst to subscribe to any updates occurring in near-real-time to a link analysis network.   These updates are for entities or links that are either in the network or are within n degrees of separation from the network.  When an update is identified for a particular retained network as a whole, in part, or specific entities and/or links, updates are automatically made available to the analyst for updating the network and/or the analyst is notified of the updates.
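The subscription behavior described above could be sketched in Python along these lines.  The class `LivingNetwork`, the update dictionaries and the link data are hypothetical; the sketch only shows how an incoming update is tested against a retained network plus its n-degree neighborhood and queued for the analyst when it is in scope.

```python
def entities_in_scope(link_index, network, n):
    """Return the retained network plus everything within n links of it."""
    scope = set(network)
    frontier = set(network)
    for _ in range(n):
        frontier = {nb for e in frontier for nb in link_index.get(e, ())} - scope
        scope |= frontier
    return scope

class LivingNetwork:
    """A retained link analysis network subscribed to nearby updates."""

    def __init__(self, network, n):
        self.network = set(network)
        self.n = n
        self.pending = []  # updates waiting for the analyst

    def on_update(self, link_index, update):
        """Queue the update if it touches the network or its n-degree surroundings."""
        scope = entities_in_scope(link_index, self.network, self.n)
        if update["entity"] in scope:
            self.pending.append(update)
            return True
        return False

link_index = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
}
retained = LivingNetwork({"alice", "bob"}, n=1)
carol_notified = retained.on_update(link_index, {"entity": "carol", "change": "new address"})
dave_notified = retained.on_update(link_index, {"entity": "dave", "change": "new phone"})
```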

Living Networks are a culmination of most of the capabilities that WhamTech's EIQ Products provide and could significantly improve an analyst's ability to find, represent, monitor and present complex information.  This is particularly true when probabilities of both entities and links are combined with threat/favorability scores.

More information on WhamTech Living Network solutions will be available soon in a separate document.

eDiscovery

WhamTech OEMs an information geometry categorization tool from a large system integrator that uses it primarily with intelligence agencies.  The tool provides a powerful way to search for documents and emails that are relevant to a legal case.  This capability complements WhamTech's advanced text search, and link mapping and link analysis.  Together, these tools make up WhamTech's eDiscovery tool, called Teracase.

Most eDiscovery tools rely heavily on text search and some include concept searches with varying degrees of success.  WhamTech recognized the leading edge capabilities of the information geometry tool, which allows the end-user to quickly converge on a category model with relatively little input and time compared to other categorization tools.  The category model is used as a filter, either on its own or in combination with other filters, against the entire set of emails and documents.

WhamTech's primary market is Early Case Assessment, both after a case has been identified and for corporate preemption.  WhamTech's secondary market is full eDiscovery.
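Combining a category model with other filters can be pictured with the Python sketch below.  The category scores are made-up inputs, since the information geometry tool itself is not described here, and the filter functions, threshold and sample documents are hypothetical.

```python
def keyword_filter(keyword):
    """Keep documents whose text contains the keyword (case-insensitive)."""
    return lambda doc: keyword.lower() in doc["text"].lower()

def category_filter(scores, threshold):
    """Keep documents whose category model score meets the threshold."""
    return lambda doc: scores.get(doc["id"], 0.0) >= threshold

def apply_filters(docs, filters):
    """A document is selected only if every filter accepts it."""
    return [doc for doc in docs if all(f(doc) for f in filters)]

DOCS = [
    {"id": "e1", "type": "email", "text": "Quarterly pricing agreement attached"},
    {"id": "d1", "type": "document", "text": "Lunch menu for Friday"},
    {"id": "e2", "type": "email", "text": "Pricing discussion with the vendor"},
]
# Hypothetical scores, as if produced by scoring the corpus against one category model.
CATEGORY_SCORES = {"e1": 0.91, "d1": 0.05, "e2": 0.40}

selected = apply_filters(
    DOCS,
    [keyword_filter("pricing"), category_filter(CATEGORY_SCORES, 0.5)],
)
selected_ids = [doc["id"] for doc in selected]
```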

Other WhamTech solutions will benefit from either Teracase itself for information discovery or the information geometry categorization tool. 

Teracase version 1.3 Features

  - Simple user access
  - Supported data sources:
      - Emails and attachments
      - Documents (text, compressed, PDF, Microsoft Office documents, Microsoft Works, HTML and variants, WordPerfect and others)
  - De-duplication using "same as" algorithm, "similar as" coming soon
  - Case Management
  - Category Management
  - Search/Analysis with various filters (include/exclude; keywords with various options such as stemming, synonym and phrase; data source type; email filters: sent/received date-time, sender/recipient domain, sender/recipient email address, with/without attachments; document filters: type, last modified date-time, author, attachment/free-standing)
  - Save filters, apply saved filters
  - Results with options (list of matching text body id, path-link to open, data source type, ordered by keyword relevancy and category model score, n results per page, next n/previous n, view email/document text, tag selected, mark for a category model)
  - Create, save, refine category models
  - Score entire corpus for selected category models
  - Predefined tags (Hot: highly relevant emails/documents that are found early and will most likely be shown at trial; Responsive: maybe; Non-responsive: irrelevant; Privileged: files governed by attorney/client privilege that will be presented to a judge if the matter goes to trial, but can initially be set aside for review by the lawyer)
  - User-assigned tag to any email/document
  - Teracase Administration
  - Audit log: log of all user actions at the web server level

Near-future features:

  - User management and role assignment
  - Billing Information - ingest volume size (before and after unroll and unzipped) and de-duplication by volume
  - Statistics and reports
  - Link mapping and link analysis for emails
  - Interactive visualization for both emails and documents

More information on WhamTech eDiscovery solutions will be available soon in a separate document.

Copyright © 1998 - 2010 WhamTech, Inc • www.whamtech.com • 972-991-5700 • info@whamtech.com

U.S. Patents Pending