The process of building and maintaining Link Indexes™ can run in parallel with building and maintaining content indexes, or after the content indexes are built.

Most link analysis applications have the following limitations:

  • Limited scalability: all data is moved in its entirety into a single database
  • Federated access to data sources through conventional adapters, with poor results caused by data quality issues
  • Difficulty combining structured and unstructured data
  • No near-real-time updates
  • No fuzzy matching, match probabilities or favorability/threat scores

WhamTech developed an option that works in conjunction with its federated data access solution, based on EIQ SuperAdapters™, to overcome these limitations. This option takes advantage of content indexes and captures links in near-real-time among the same or similar entities in specialized indexes called Link Indexes™. Like content indexes, Link Indexes™ are created and maintained at the data source level, but not on data source systems, and can scale through distributed parallel processing across multiple disparate data sources. The combination of content indexes and Link Indexes™ enables virtual views of entity data and of the links between entity data. User-driven, interactive visualization displays only the entity data and links of interest, as needed.

Link Indexes™ capture the following five types of links for structured data sources:

  • Internal data source, multiple table joins using primary key (PK) - foreign key (FK) relationships
  • Internal data source, single table self-joins using PK-FK relationships
  • Internal data source, single table self-joins using same or similar data
  • Internal data source, multiple table joins using same or similar data
  • External data source, multiple table joins using same or similar data

...and two types for unstructured data in either structured or unstructured sources:

  • Structured data captured through entity extraction using same or similar data
  • Unstructured text search using same or similar data
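As a rough illustration of links based on "same or similar data", the sketch below pairs records from two sources when their values are close enough. It is not WhamTech's method: the record layout, the `find_links` helper and the 0.85 threshold are hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever matching product is actually used. The score it returns is a similarity ratio, used here only as a placeholder for a match probability.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records: (data source, row id, value of the entity being linked).
records = [
    ("crm", 1, "Jonathan Smith"),
    ("crm", 2, "Maria Garcia"),
    ("billing", 7, "Jonathan Smith"),
    ("billing", 9, "Mariah Garcia"),
]

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio for two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_links(rows, threshold=0.85):
    """Return (row_a, row_b, score) for every pair at or above the threshold."""
    links = []
    for (src_a, id_a, val_a), (src_b, id_b, val_b) in combinations(rows, 2):
        score = similarity(val_a, val_b)
        if score >= threshold:
            # Store the score with the link so it can be reused during analysis.
            links.append(((src_a, id_a), (src_b, id_b), round(score, 3)))
    return links

links = find_links(records)
for left, right, score in links:
    print(left, "<->", right, score)
```

Comparing every pair is quadratic in the number of records, so this only demonstrates the idea of storing a score alongside each link; it does not reflect how an index-based approach would scale.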

For all of the above link types, EIQ Products™ can be selective about the entities used to establish links, as it may not be worthwhile to link all entities (types and data). EIQ Products™ can also apply fuzzy matching, using a product from a third-party vendor for structured data and text matching algorithms for unstructured data. Match probabilities can be calculated and stored with the links, or calculated on-the-fly as links are analyzed. In addition, when threat/favorability scores for specific entities are combined with match probabilities, threat/favorability link analysis networks can be displayed.

WhamTech recognized through its work on Web search engines that all networks and relationships can be represented through a combination of links (one-to-one). WhamTech uses its binary tree indexes to capture and maintain the links as link maps (one-to-many) in Link Indexes™, and uses Boolean operations on bitmap representations to combine link maps for link analysis (many-to-many). In their basic form, Link Indexes™ are join indexes, which are pre-formed joins, both internal and external to data sources.
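The idea of combining one-to-many link maps with Boolean operations on bitmaps can be sketched as follows. This is a minimal illustration under stated assumptions, not the Link Indexes™ implementation: the entity names and the `to_bitmap` and `to_names` helpers are hypothetical, a plain Python dictionary stands in for the binary tree indexes, and Python integers serve as bitmaps.

```python
# Assign each (hypothetical) entity one bit position.
entities = ["alice", "bob", "carol", "dave", "erin"]
bit = {name: 1 << i for i, name in enumerate(entities)}

def to_bitmap(names):
    """Encode a collection of entity names as an integer bitmap."""
    bitmap = 0
    for name in names:
        bitmap |= bit[name]
    return bitmap

def to_names(bitmap):
    """Decode a bitmap back into entity names, in bit-position order."""
    return [name for name in entities if bitmap & bit[name]]

# One-to-many link maps: each entity maps to a bitmap of the entities it links to.
link_maps = {
    "alice": to_bitmap(["bob", "carol"]),
    "bob": to_bitmap(["alice", "carol", "dave"]),
    "carol": to_bitmap(["alice", "bob"]),
    "dave": to_bitmap(["bob", "erin"]),
    "erin": to_bitmap(["dave"]),
}

# Many-to-many questions become Boolean operations on the bitmaps.
common = link_maps["alice"] & link_maps["bob"]     # linked to both alice and bob
either = link_maps["alice"] | link_maps["dave"]    # linked to alice or dave
only_bob = link_maps["bob"] & ~link_maps["alice"]  # linked to bob but not alice

print(to_names(common))
print(to_names(either))
print(to_names(only_bob))
```

Each combination is a single bitwise operation regardless of how many entities the maps cover, which is the property that makes bitmap representations attractive for this kind of analysis.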

Link Indexes™ can significantly cut down on the computing time and resources needed to execute queries involving joins. More complex queries, such as nested selects, can take advantage of Link Indexes™ to execute n degrees of separation queries, again without requiring much computing time or resources.

The more obvious use of Link Indexes™ is for link analysis applications, where one-to-many, many-to-one and many-to-many relationships between predefined entities can be visually represented and subjected to social network analysis. Because EIQ Products™ also provide federated data access in conjunction with Link Indexes™, an analyst can interact with data sources through link analysis visualization without being aware of it. EIQ Products™ can work with both commercial and open source link visualization tools. EIQ Products™ keep a log of all interactions within link analysis for subsequent use in legal proceedings or for probable cause. Analysts can retain link analysis networks and resume or retrieve them in subsequent sessions. More information on WhamTech Virtual Link Mapping and Virtual Link Analysis solutions will be available soon in a separate document.
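To make the n degrees of separation query mentioned above more concrete, the sketch below walks a pre-built link map breadth-first, so that no join has to be computed at query time. The `link_index` contents and the `within_degrees` function are hypothetical and only illustrate the general technique; they are not taken from EIQ Products™.

```python
from collections import deque

# Hypothetical pre-formed links: each entity maps to the entities it links to.
link_index = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}

def within_degrees(index, start, n):
    """Return {entity: degree} for entities reachable from start in at most n hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == n:
            continue  # do not expand beyond the requested degree
        for neighbor in index.get(current, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[current] + 1
                queue.append(neighbor)
    del seen[start]  # the starting entity is not part of its own result
    return seen

result = within_degrees(link_index, "A", 2)
print(sorted(result.items()))
```

Because every hop is a lookup in an existing link map, the cost grows with the number of links visited and not with the size of the underlying tables.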
