As defined by Bill Inmon, “A Data Warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.”
It is a centralized repository where data from several sources is integrated. Different areas of the business combine this data in various ways to improve planning and support critical business decisions.
Subject-oriented: Data is organized around a specific business subject so that it can be analyzed directly. For example, if the business wants to understand machine downtime and how to reduce it, the data warehouse can supply the times and situations in which machines stopped working and the reasons behind each stoppage.
Integrated: Data from different sources is combined into one consistent collection. For instance, if a company wants to budget for the next quarter, the data warehouse holds the required information, from incurred costs to depreciation costs, in a single source.
Time-variant: The system stores historical data, with each record tied to a point or period in time, so the company can extract relevant reports at any time and assess the overall health of the organization. Details that are subject to frequent change, such as employee addresses and phone numbers, are generally not included.
Non-volatile: Once data is entered, it remains the same. The firm must ensure that the data is well protected and that there is no chance of alteration, because any modification would affect the reports and analysis built on it.
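The four characteristics above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a real warehouse; the table name downtime_fact, the two source feeds, and all sample rows are hypothetical and chosen only to mirror the machine downtime example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Subject-oriented: one table dedicated to the machine downtime subject.
CREATE TABLE downtime_fact (
    machine_id   TEXT NOT NULL,
    source       TEXT NOT NULL,     -- which source system supplied the row
    event_date   TEXT NOT NULL,     -- ISO date, makes the data time-variant
    minutes_down INTEGER NOT NULL,
    reason       TEXT NOT NULL
);
-- Non-volatile: reject any change to rows that are already loaded.
CREATE TRIGGER no_update BEFORE UPDATE ON downtime_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
CREATE TRIGGER no_delete BEFORE DELETE ON downtime_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
""")

# Integrated: two hypothetical source systems feed the same table.
maintenance_log = [
    ("M1", "2024-01-15", 45, "belt failure"),
    ("M2", "2024-02-03", 30, "overheating"),
]
sensor_feed = [("M1", "2024-02-20", 20, "power loss")]


def load(rows, source):
    """Append rows from one source system, tagging each with its origin."""
    conn.executemany(
        "INSERT INTO downtime_fact VALUES (?, ?, ?, ?, ?)",
        [(m, source, d, mins, why) for m, d, mins, why in rows],
    )


def downtime_by_month():
    """Time-variant query: total minutes of downtime per calendar month."""
    return conn.execute(
        "SELECT substr(event_date, 1, 7) AS month, SUM(minutes_down) "
        "FROM downtime_fact GROUP BY month ORDER BY month"
    ).fetchall()


def try_update():
    """Attempt to alter history; return True only if the change went through."""
    try:
        conn.execute("UPDATE downtime_fact SET minutes_down = 0")
        return True
    except sqlite3.DatabaseError:
        return False


load(maintenance_log, "maintenance_log")
load(sensor_feed, "sensor_feed")
print(downtime_by_month())
print("update allowed:", try_update())
```

The triggers are just one simple way to enforce read-only rows in this sketch. A production warehouse would more likely rely on access controls and load procedures.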
A data warehouse comprises several layers. A few of them are listed below:
- Data Source Layer
- Data Extraction Layer
- Staging Area
- ETL Layer
- Data Storage Layer
- Data Logic Layer
- Data Presentation Layer
- Metadata Layer
- System Operations Layer
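As a rough illustration of how data might flow through some of these layers, the sketch below chains a few plain Python functions. The mapping of functions to layers is one possible interpretation, not a standard. The source names, field names, and figures are hypothetical, and the metadata and system operations layers are not modeled.

```python
from collections import defaultdict

# Data source layer: two hypothetical systems with inconsistent formats.
SOURCES = {
    "erp": [{"dept": "Ops", "cost": "1200.50"}, {"dept": "HR", "cost": "300"}],
    "spreadsheet": [{"dept": " ops", "cost": "99.50"}],
}


def extract(sources):
    """Data extraction layer: pull raw records and tag each with its origin."""
    return [dict(rec, source=name) for name, recs in sources.items() for rec in recs]


def transform(staged):
    """ETL layer: standardize department names and convert costs to numbers."""
    return [
        {
            "dept": rec["dept"].strip().upper(),
            "cost": float(rec["cost"]),
            "source": rec["source"],
        }
        for rec in staged
    ]


def total_cost_by_dept(stored):
    """Data logic layer: a business rule that sums cost per department."""
    totals = defaultdict(float)
    for rec in stored:
        totals[rec["dept"]] += rec["cost"]
    return dict(totals)


def present(totals):
    """Data presentation layer: format the result for a report."""
    return [f"{dept}: {cost:.2f}" for dept, cost in sorted(totals.items())]


staging_area = extract(SOURCES)        # staging area: raw, untransformed copies
warehouse = transform(staging_area)    # data storage layer: cleaned records
report = present(total_cost_by_dept(warehouse))
print("\n".join(report))
```

Keeping each layer as a separate function makes it easy to see where raw data ends and cleaned, reportable data begins.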