Data Warehouse Design
Dimensional Design
Dimensional design, also known as dimensional modeling, is a design technique used in data warehousing. It is a method of structuring data so as to make it easily understandable and accessible. The main goal of dimensional design is to present data in a standard, intuitive framework that allows for high-performance access.
A dimensional model consists of "fact" and "dimension" tables.
Fact Tables: Fact tables store the quantitative or factual data that a business tracks. This could be sales revenue, units sold, clicks on a webpage, etc. Each row in the fact table represents a specific event or transaction. Fact tables often contain a large number of rows and are associated with measures or numerical values.
Dimension Tables: Dimension tables describe the dimensions or context of the facts. For example, if the fact is a sale, the dimensions could include details about the product sold, the customer who bought it, the time and location of the sale, and so on. Each dimension table contains a set of attributes (or fields) that provide descriptive detail about the dimension.
A common approach in dimensional design is the star schema, in which a fact table sits at the center of the design (like a star), surrounded by dimension tables. Each fact table row has a foreign key column for each related dimension table, and each foreign key refers to a primary key in the respective dimension table.
For instance, in a sales data warehouse, a fact table might include facts like units sold and total revenue, and it might be linked via foreign keys to dimensions such as date, product, store, and customer.
There is also a slightly more complex version of dimensional design called a snowflake schema. In a snowflake schema, dimension tables can be normalized into multiple related tables. The result is a structure that resembles a snowflake. Normalization reduces redundancy but adds a bit of complexity.
The goal of dimensional design is to create a database that is easy to understand and yields predictable performance. The model should be flexible to handle changes and should deliver fast query responses. Dimensional design is a popular choice for data warehouses because it can provide a high degree of data consistency, simplicity, and usability.
Dimension Hierarchies
A dimension hierarchy is a way of organizing a dimension from the highest level to the most granular detail. It allows for drilling up and down to view data at various levels of granularity.
Dimension hierarchies are used to define relationships between various attributes within a dimension, creating a parent-child relationship. This hierarchical structure allows for a more efficient analysis and reporting, enabling users to drill down from higher-level data to lower-level data for more detailed insights, or drill up for an aggregated view.
For instance, consider a "Time" dimension in a retail sales data warehouse. A possible hierarchy could be:
Year
Quarter
Month
Week
Day
In this hierarchy, "Year" is the highest level, and "Day" is the lowest, most detailed level. If a user wants to examine yearly sales data, they would start at the "Year" level of the hierarchy. If they want to analyze sales trends in more detail, they could drill down to the "Quarter" or "Month" level, and so on, all the way down to the "Day" level.
Another example could be a "Geography" dimension in the same retail sales data warehouse. A possible hierarchy could be:
Country
Region
State/Province
City
Store
In this hierarchy, a user could start by looking at sales data at the Country level, then drill down to see data by Region, State/Province, City, and individual Store.
Hierarchies are important because they allow users to navigate data in a meaningful way, support roll-up operations for data aggregation, and offer a logical querying path.
In a star schema, hierarchies for a dimension are stored in a single table, with one column for each level in the hierarchy. In a snowflake schema, hierarchies are represented through multiple related tables, one for each level in the hierarchy.