December 12th, 2018

There will always be a need to define data objects (entities and their attributes) and the relationships between them. These definitions could be kept in workflows or applications, for example in object-oriented code, with all the business logic residing there instead of in a data model. However, knowledge locked in programming code cannot be made available to other workflows or applications the way it could through a common data model.

Different data models serve different purposes, such as:

  • TRANSACTION DATABASES to efficiently manage and store operational data; usually less normalized/idealized.
  • DATA WAREHOUSES to integrate multiple data sources, including operational data, using extract, transform and load (ETL) for enterprise knowledge; usually more normalized/idealized.
  • DATA MARTS to improve reporting, BI and analytics performance using data loaded by ETL from a data warehouse; usually limited in scope, with an entity-specific focus (e.g., customer, patient or employee), and based on a star schema or a denormalized/flat structure.

High-performance data warehouses can eliminate the need to ETL data into data marts for reporting, BI and analytics, but they tend to be expensive. Cloud-based data warehouses are another, potentially less expensive, option. Either way, expensive ETL processes are still needed to copy data from multiple disparate operational data models into a single, more general and usually more normalized/idealized data warehouse data model.

The ideal data models for knowledge capture and storage are fifth normal form, semantic or ontology-based, where data objects are very granular/specific. Relationships between data objects are captured explicitly because, at a higher level, they represent the function or business a company is in and, at a lower level, they represent knowledge that can be used by any application, including operations, reporting, BI and analytics. Even so, ideal data models can be too computationally expensive, or perform too poorly, for most applications.
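As a rough illustration of granular data objects with explicitly named relationships, the sketch below stores facts as (subject, relationship, object) triples. The entity names, relationship names and helper functions are hypothetical and chosen only for this example; it is not a full fifth normal form or ontology model.

```python
# Hypothetical facts: each triple names the relationship explicitly,
# so two entities can be linked in more than one way.
triples = [
    ("person:alice", "is_customer_of", "company:acme"),
    ("person:alice", "is_employee_of", "company:acme"),
    ("person:bob", "is_customer_of", "company:acme"),
]


def relationships_between(subject, obj, facts):
    """Return every named relationship that links subject to obj."""
    return sorted(p for s, p, o in facts if s == subject and o == obj)


def subjects_with(relationship, obj, facts):
    """Return every subject that has the given relationship to obj."""
    return sorted(s for s, p, o in facts if p == relationship and o == obj)


alice_links = relationships_between("person:alice", "company:acme", triples)
acme_customers = subjects_with("is_customer_of", "company:acme", triples)
```

Because each relationship is a separate, typed fact, the same store can answer an operational question (who are the customers?) and an analytical one (who is both customer and employee?) without restructuring the data.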

The alternative is essentially to eliminate data models and flatten all data into a Big Table, inferring relationships from data that sits in the same row. That inference can be misleading: a shared row shows that data objects are related, but there may be more than one relationship between entities, and the relationship types are unknown. If there is more than one relationship, the Big Table can be expanded by adding rows to accommodate every possible relationship variation, which leads to large growth in pre-compute operations and replicated data. These Big Table views are highly denormalized and highly inefficient for storage and updates, but simple and highly efficient for query processing.
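The row growth described above can be seen in a small sketch. It assumes one hypothetical person with two phone numbers and three addresses, and flattens them into one row per combination; all names and values are invented for illustration.

```python
from itertools import product

# Hypothetical source data: one entity with two multi-valued attributes.
person = {"person_id": 1, "name": "Alice"}
phones = ["555-0100", "555-0101"]
addresses = ["1 Main St", "2 Oak Ave", "3 Elm Rd"]


def flatten(entity, phone_list, address_list):
    """Build one flat row per (phone, address) combination.

    The entity's own attributes are copied into every row.
    """
    return [
        {**entity, "phone": phone, "address": address}
        for phone, address in product(phone_list, address_list)
    ]


big_table = flatten(person, phones, addresses)
row_count = len(big_table)
name_copies = sum(1 for row in big_table if row["name"] == "Alice")
```

Six source values (one name, two phones, three addresses) become six rows, each repeating the name. Every row also pairs a phone with an address, which suggests a relationship between them that the source data never stated.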

In reality, different applications need different views of the same data. WhamTech SmartData Fabric® leaves data where it resides and can layer at least four virtual views on top: (i) Big Table-like Standard Data Views, (ii) REST API data objects, (iii) Business Objects and (iv) a graph database. Other views, e.g., hierarchical, can be enabled as needed. Physically, SmartData Fabric® does not store data, per se, but uses advanced indexes to process queries and virtually access data in sources, whether that is the original source or a copy in a data lake or similar repository, regardless of the source data models, types or formats. WhamTech combines content, link/relationship and master data indexes to provide these complete and multiple views of data.
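The general idea of several virtual views over data that stays in place can be sketched generically. The code below is not SmartData Fabric®'s implementation or API; the source lists, the index and the three view functions are all hypothetical, and a simple in-memory position index stands in for whatever indexing a real product uses.

```python
# Hypothetical source systems; records are never copied into a new store.
crm = [{"cust_id": "c1", "name": "Alice", "city": "Dallas"}]
orders = [
    {"order_id": "o1", "cust_id": "c1", "total": 40.0},
    {"order_id": "o2", "cust_id": "c1", "total": 15.5},
]


def build_index(rows, key):
    """Map each key value to the positions of matching rows in the source."""
    index = {}
    for pos, row in enumerate(rows):
        index.setdefault(row[key], []).append(pos)
    return index


orders_by_customer = build_index(orders, "cust_id")


def flat_view():
    """Big Table-like view: one row per customer/order pair, built on demand."""
    for cust in crm:
        for pos in orders_by_customer.get(cust["cust_id"], []):
            yield {**cust, **orders[pos]}


def object_view(cust_id):
    """Nested object view of one customer, shaped like a REST-style payload."""
    cust = next(c for c in crm if c["cust_id"] == cust_id)
    positions = orders_by_customer.get(cust_id, [])
    return {**cust, "orders": [orders[p] for p in positions]}


def graph_view():
    """Graph-style view: (customer, relationship, order) edges."""
    return [
        (cust_id, "placed", orders[p]["order_id"])
        for cust_id, positions in orders_by_customer.items()
        for p in positions
    ]
```

All three views read the same source records through the same index, so a change in a source record is visible in every view the next time it is computed.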

About WhamTech
WhamTech, Inc. is a software company whose mission is to develop security-centric distributed virtual data, master data and graph data management, and analytics technology software products. WhamTech develops these products to anticipate, meet and exceed the demands of customers seeking an alternative to the conventional approaches of data warehouses, federated data access with conventional adapters and enterprise search. WhamTech’s goal is to provide an improved and more seamless way to work with data, by leaving it where it resides, and change the way fundamental and advanced data management is addressed. For more information, visit WhamTech at www.whamtech.com or follow @WhamTech_Inc on Twitter.


By: Gavin Robertson, WhamTech Senior Vice President and CTO

Contact
Gavin Robertson, 972-991-5700 ext. 706
gavin.robertson@whamtech.com