It is a centralized location where data from several sources is integrated. Different parts of the business combine this data in various ways to improve planning and support critical business decisions.
Where is it used?
A data warehouse is a core component of business intelligence. It is also called an enterprise data warehouse (EDW). It is used for reporting and analysis, and it combines historical data with real-time data to generate business reports.
Below are some familiar sectors where data warehouses are used.
- Public sector: Here, the data warehouse is used for intelligence gathering in government offices. It is also used to monitor and analyze individuals' health records and tax records.
- Banking sector: It helps banks manage and analyze the resources available on their desks.
- Hospitality industry (hotels, restaurants): In this sector, the warehouse is used to design promotions and attract target customers.
- Health care: In this area, the warehouse helps generate patient treatment reports.
- Airlines: Here, the warehouse is used to analyze the work assigned to airline crews.
- Insurance: In this sector, the warehouse helps track market fluctuations.
How can a data warehouse benefit an organization?
Subject-oriented: Data in the warehouse is organized around specific business subjects so that each subject can be analyzed. For example, if the business wants to understand machine downtime and how to reduce it, it can pull data from the warehouse showing when the machines stopped working, the reasons for each stoppage, and where downtime could be cut.
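The downtime question above can be sketched in a few lines of Python. This is only an illustration: the event tuples, machine IDs, and reasons are made-up values standing in for rows you would pull from a real warehouse, and `downtime_by_reason` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical downtime events as they might be pulled from a warehouse:
# (machine_id, reason, minutes_down). All values are invented for illustration.
DOWNTIME_EVENTS = [
    ("M1", "power failure", 30),
    ("M2", "tool change", 15),
    ("M1", "tool change", 20),
    ("M3", "power failure", 45),
    ("M2", "material shortage", 10),
]


def downtime_by_reason(events):
    """Sum the minutes of downtime for each reason, largest total first."""
    totals = defaultdict(int)
    for _machine, reason, minutes in events:
        totals[reason] += minutes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for reason, minutes in downtime_by_reason(DOWNTIME_EVENTS):
        print(f"{reason}: {minutes} min")
```

Ranking the reasons by total minutes is one simple way to see where a reduction effort would pay off first.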
Integrated: Data from different sources is integrated into one consistent data set. For instance, if a company wants to prepare the budget for the next quarter, the data warehouse will have all the information required.
From incurred costs to depreciation costs, the entire set of data is available from a single source.
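As a rough sketch of what "integrated" means for the budgeting example, the snippet below merges two sources that use different field names into one record per department. The source systems, field names, and figures are all assumptions made for illustration.

```python
# Two hypothetical source systems with different field names.
LEDGER_ROWS = [  # e.g. exported from an accounting system
    {"dept": "Assembly", "incurred_cost": 1200.0},
    {"dept": "Packaging", "incurred_cost": 800.0},
]
ASSET_ROWS = [  # e.g. exported from an asset register
    {"department": "Assembly", "depreciation": 300.0},
    {"department": "Packaging", "depreciation": 150.0},
    {"department": "Assembly", "depreciation": 50.0},
]


def integrate_costs(ledger_rows, asset_rows):
    """Combine both sources into one cost record per department."""
    combined = {}

    def record_for(department):
        # Create an empty record the first time a department is seen.
        return combined.setdefault(
            department, {"incurred": 0.0, "depreciation": 0.0}
        )

    for row in ledger_rows:
        record_for(row["dept"])["incurred"] += row["incurred_cost"]
    for row in asset_rows:
        record_for(row["department"])["depreciation"] += row["depreciation"]
    for record in combined.values():
        record["total"] = record["incurred"] + record["depreciation"]
    return combined


if __name__ == "__main__":
    for department, record in integrate_costs(LEDGER_ROWS, ASSET_ROWS).items():
        print(department, record)
```

The point is that the person doing the budgeting reads one structure with one naming scheme, instead of reconciling two exports by hand.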
Time-variant: The historical data stored in the system can be used by the company at any time to extract relevant reports and understand the organization’s overall health.
However, data such as an employee database that includes addresses and phone numbers should not be included, because it is subject to change.
Non-volatile: Once data is entered, it remains the same. The firm must ensure that the data is well protected and that there is no chance of alteration, because any modification would affect the reports and analysis.
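One possible way to enforce the "no alteration" rule is at the table level. The sketch below uses Python's built-in `sqlite3` module and SQLite triggers to make a table append-only. The table name `sales_fact` and the helper names are hypothetical, and a real warehouse product would have its own mechanisms for this.

```python
import sqlite3


def create_append_only_fact_table(conn):
    """Create a fact table whose rows cannot be updated or deleted."""
    conn.executescript(
        """
        CREATE TABLE sales_fact (
            sale_id   INTEGER PRIMARY KEY,
            sale_date TEXT NOT NULL,
            amount    REAL NOT NULL
        );
        CREATE TRIGGER sales_fact_no_update
        BEFORE UPDATE ON sales_fact
        BEGIN
            SELECT RAISE(ABORT, 'sales_fact is append-only');
        END;
        CREATE TRIGGER sales_fact_no_delete
        BEFORE DELETE ON sales_fact
        BEGIN
            SELECT RAISE(ABORT, 'sales_fact is append-only');
        END;
        """
    )


def is_blocked(conn, sql):
    """Return True if running `sql` raises a database error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.Error:
        return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_append_only_fact_table(conn)
    conn.execute("INSERT INTO sales_fact VALUES (1, '2024-01-15', 250.0)")
    print(is_blocked(conn, "UPDATE sales_fact SET amount = 0 WHERE sale_id = 1"))
    print(is_blocked(conn, "DELETE FROM sales_fact WHERE sale_id = 1"))
```

Inserts still work, so new history can be appended, while attempts to change or remove existing rows are rejected.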
Improved data quality: A data warehouse helps improve data quality by providing consistent, accurate data and by correcting bad data.
What are the disadvantages of a data warehouse?
Cost vs. benefit: A data warehouse is an IT project that consumes many person-hours and a large share of the budget. Its implementation and maintenance are very expensive.
Hence the benefit-to-cost ratio can be low. For a small or medium-sized organization, this may affect revenue.
Data ownership: Data warehouses are often delivered as software-as-a-service applications, so the main concern is the security of the data. You have to be sure that the people who handle and analyze customer data are employees your company trusts.
A leak of customers' personal data within the organization may cause problems for executives and damage the relationship between the company and its customers.
Data rigidity: The data imported into a data warehouse is often a static data set with little flexibility, which limits its ability to answer a particular question.
Ad hoc queries against a warehouse can be difficult because of limited processing and query speed.
Underestimation of ETL processing time: The entire process of data warehouse development, that is, the extraction, cleaning, and loading of consolidated data into the warehouse, takes a long time.
Organizations usually underestimate the time required for the ETL process, which leads to a backlog of work.
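One way to avoid guessing is to time the ETL steps on a sample and extrapolate. The sketch below does that with a toy cleaning step. The function names, the sample data, and the 1.5 safety factor are all assumptions, and linear scaling is itself a simplification, since real ETL jobs rarely scale perfectly with row count.

```python
import time


def clean(row):
    """Toy cleaning step: trim whitespace and lower-case every value."""
    return {key: value.strip().lower() for key, value in row.items()}


def run_etl(rows, target):
    """Read each extracted row, clean it, and load it into `target` (a list)."""
    for row in rows:
        target.append(clean(row))


def estimate_full_run_seconds(sample_rows, sample_seconds, total_rows,
                              safety_factor=1.5):
    """Linearly extrapolate a sample timing to the full data set."""
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    seconds_per_row = sample_seconds / sample_rows
    return seconds_per_row * total_rows * safety_factor


if __name__ == "__main__":
    # Made-up sample standing in for rows extracted from a source system.
    sample = [{"name": "  Alice ", "city": " PARIS"}] * 10_000
    loaded = []
    start = time.perf_counter()
    run_etl(sample, loaded)
    elapsed = time.perf_counter() - start
    estimate = estimate_full_run_seconds(len(sample), elapsed, total_rows=50_000_000)
    print(f"Estimated full run: {estimate / 60:.1f} minutes")
```

Even a crude estimate like this gives the team a number to plan around before the full load is scheduled.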
Levels of data warehouse architecture
It comprises several levels. Some of them are listed below:
- Data Source Layer
- Data Extraction Layer
- Staging Area
- ETL Layer
- Data Storage Layer
- Data Logic Layer
- Data Presentation Layer
- Metadata Layer
- System Operations Layer
Types of data warehouse architecture
There are mainly three types:
Single-tier architecture: This architecture is rarely used. It reduces the amount of data stored by avoiding redundancy. Only the source layer is physically available; the data warehouse layer is virtual. The single-tier architecture consists of the source layer, a virtual data warehouse layer, and the analysis layer.
Two-tier architecture: It adds a data staging area, or ETL (extraction, transformation, and loading) layer, to the source layer. This layer helps merge heterogeneous data into one standard schema. This type of architecture consists of the source layer, data staging layer, data warehouse layer, and analysis layer.
Three-tier architecture: This architecture contains a reconciled layer in addition to the data staging and source layers. The source layer contains multiple sources, and the data warehouse layer contains both data warehouses and data marts. The role of the reconciled layer is to provide a standard data model for the entire enterprise. The reconciled layer can also be used for some operational tasks, such as reporting. This architecture consists of the source layer, data staging area, reconciled layer, data warehouse layer, and analysis layer.
Types of data warehouse
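The following are the three main types of data warehouse:
1. Enterprise Data Warehouse (EDW): It provides decision support across the enterprise and helps classify data by subject.
2. Operational Data Store (ODS): It holds current operational data, refreshed frequently, and is used for routine activities such as storing employee records.
3. Data Mart: It is a subset of the data warehouse designed for a particular line of business, such as sales or finance. An independent data mart can also collect data directly from sources.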
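A data mart is commonly a department-specific slice of the larger warehouse. As a minimal sketch of that idea, the snippet below models a sales data mart as a SQL view over a warehouse table, using Python's built-in `sqlite3` module. The table, view, and column names and all figures are hypothetical, and real data marts are often separate physical stores instead of views.

```python
import sqlite3


def build_warehouse_with_sales_mart():
    """Create a tiny in-memory warehouse plus a sales-only view over it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE warehouse_facts (
            department TEXT NOT NULL,
            metric     TEXT NOT NULL,
            value      REAL NOT NULL
        );
        INSERT INTO warehouse_facts VALUES
            ('sales',   'revenue',   500.0),
            ('sales',   'returns',    40.0),
            ('finance', 'payroll',   300.0),
            ('hr',      'headcount',  25.0);
        -- The "data mart": only the rows the sales team cares about.
        CREATE VIEW sales_mart AS
            SELECT metric, value
            FROM warehouse_facts
            WHERE department = 'sales';
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_warehouse_with_sales_mart()
    query = "SELECT metric, value FROM sales_mart ORDER BY metric"
    for metric, value in conn.execute(query):
        print(metric, value)
```

The sales team queries `sales_mart` and never sees the finance or HR rows, which is the practical appeal of a data mart.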
Data warehouse tools
The following are a few popular data warehouse tools:
- Amazon Redshift
- Microsoft Azure
- CData Sync
- SAP HANA
- Amazon RDS
- Amazon S3
- MariaDB
Difference between database (DB) and data warehouse (DW)
Many people confuse these two concepts, so here I am going to state the differences.
- A DW transfers and stores accumulated data for analytical purposes, whereas a DB collects data for multiple transactions.
- A DW is developed for the accumulation and retrieval of large data sets, whereas a DB is developed for read and write access.
- A DW is made for easier analysis of data collected and stored from multiple databases. A DB is made for quickly recording and retrieving data.
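The difference in access patterns can be illustrated with two queries. The sketch below runs both against the same in-memory SQLite table purely for convenience. In practice the transactional database and the warehouse are separate systems, and the `orders` table and its rows are invented for this example.

```python
import sqlite3

# Made-up orders: (order_id, region, order_year, amount).
ORDERS = [
    (1, "north", 2023, 100.0),
    (2, "north", 2023, 50.0),
    (3, "south", 2023, 70.0),
    (4, "north", 2024, 30.0),
    (5, "south", 2024, 20.0),
    (6, "south", 2024, 60.0),
]


def build_demo_db():
    """Create an in-memory table holding the sample orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, region TEXT, "
        "order_year INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", ORDERS)
    return conn


def get_order(conn, order_id):
    """Database-style access: quickly fetch one record by its key."""
    return conn.execute(
        "SELECT order_id, region, order_year, amount "
        "FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()


def revenue_by_region_and_year(conn):
    """Warehouse-style access: scan many records and aggregate them."""
    return conn.execute(
        "SELECT region, order_year, SUM(amount) FROM orders "
        "GROUP BY region, order_year ORDER BY region, order_year"
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(get_order(conn, 3))
    print(revenue_by_region_and_year(conn))
```

The first function touches a single row, which is what a transactional database is tuned for. The second reads the whole table to produce a summary, which is the kind of work a warehouse is built to handle at scale.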