Advantages and disadvantages of data warehouses

Data warehouses are the traditional solution for data integration, and for good reason. However, they are becoming increasingly difficult to scale, because data must be copied from multiple data sources in multiple organizations and multiple locations.

Data is extracted from multiple data sources, transformed and loaded (ETL) into a separate database, called a data warehouse, which operates as illustrated in the following diagram, DW1:


DW1: Data warehouse solution
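
The extract, transform and load steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not a description of any particular product: the two sources (CRM_ROWS, BILLING_ROWS), their field names and the sales table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source data; the names, fields and units are assumptions.
CRM_ROWS = [{"cust_name": " Alice ", "total": "120.50"},
            {"cust_name": "Bob", "total": "80"}]
BILLING_ROWS = [("carol", 4550)]  # (name, amount in cents)


def extract():
    """Yield (source, row) pairs from every data source."""
    yield from (("crm", row) for row in CRM_ROWS)
    yield from (("billing", row) for row in BILLING_ROWS)


def transform(source, row):
    """Convert a source-specific row into the single warehouse schema."""
    if source == "crm":
        return (row["cust_name"].strip().title(), float(row["total"]))
    name, cents = row
    return (name.strip().title(), cents / 100)


def load(conn, rows):
    """Write the transformed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")  # stand-in for the separate warehouse database
    load(conn, [transform(source, row) for source, row in extract()])
    return conn


warehouse = run_etl()
result = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(result)
```

Note that each source needs its own branch in transform(); that per-source work is what the disadvantages listed later refer to.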

Advantages

Data warehouses tend to have a high query success rate, as they have complete control over the four main areas of a data management system:

  • Clean data

  • Indexes: multiple types

  • Query processing: multiple options

  • Security: data and access
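
As a rough illustration of the index and query processing points, the sketch below uses an in-memory SQLite database as a stand-in for a warehouse. Because the warehouse owns its copy of the data, it can add whatever index a query needs. The table, column and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Alice", "EU", 120.5), ("Bob", "US", 80.0), ("Carol", "EU", 45.5)],
)

# The warehouse controls its own storage, so it is free to index as needed.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")

# Ask SQLite how it would run the query; the last column holds the plan text.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM sales WHERE region = ?",
    ("EU",),
).fetchall()
plan_text = " ".join(row[3] for row in plan)

eu_total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = ?", ("EU",)
).fetchone()[0]
print(plan_text)
print(eu_total)
```

The plan text should mention idx_sales_region, showing that the query is answered through the index rather than a full table scan.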

Disadvantages

However, moving data from multiple, often highly disparate, data sources into one data warehouse has considerable disadvantages, which translate into long implementation time, high cost, lack of flexibility, dated information and limited capabilities:

  • Major data schema transforms from each of the data sources to one schema in the data warehouse, which can represent more than 50% of the total data warehouse effort

  • Data owners lose control over their data, raising ownership (responsibility and accountability), security and privacy issues

  • Long initial implementation time and associated high cost

  • Adding new data sources takes time and incurs high cost

  • Limited flexibility of use and types of users: multiple separate data marts are required for multiple uses and types of users

  • Typically, data is static and dated

  • Typically, no data drill-down capabilities

  • Difficult to accommodate changes in data types and ranges, data source schema, indexes and queries

  • Typically, cannot actively monitor changes in data
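
The schema transform and new data source points above can be made concrete with a small sketch. It assumes a hypothetical target schema (customer, amount_usd) and two hypothetical sources, each of which needs its own hand-written mapping; a source without a mapping cannot be loaded at all.

```python
# Hypothetical single schema used by the warehouse.
TARGET_FIELDS = ("customer", "amount_usd")

# One hand-written mapping per source: target field -> function of the source row.
SOURCE_MAPPINGS = {
    "crm": {
        "customer": lambda row: row["cust_name"],
        "amount_usd": lambda row: float(row["total"]),
    },
    "billing": {
        "customer": lambda row: row["payer"],
        "amount_usd": lambda row: row["cents"] / 100,
    },
}


def to_warehouse_row(source, row):
    """Map a source row onto the warehouse schema, or fail if no mapping exists."""
    try:
        mapping = SOURCE_MAPPINGS[source]
    except KeyError:
        raise ValueError(f"no schema mapping defined for source {source!r}") from None
    return {field: mapping[field](row) for field in TARGET_FIELDS}


mapped = to_warehouse_row("billing", {"payer": "Carol", "cents": 4550})
print(mapped)
```

Every new source, and every change to an existing source's schema, means writing or revising an entry in SOURCE_MAPPINGS before any of its data can reach the warehouse.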

 