Junk dimensions

Junk dimensions are dimensions that contain miscellaneous data such as flags and indicators.  When designing a data warehouse, you might come across a source system that has a bunch of yes/no indicator fields.  If those fields need to be tracked in a fact table, the result could be many small dimension tables (each with just a few rows), along with an extra column in the fact table for each one, which can cause performance issues.

Instead, use a junk dimension that gathers all the unique combinations of those indicator fields into a single dimension table and assigns each combination a unique key.  This key is what is stored in the fact table, so you add only one dimension table and reduce the number of fields in the fact table.  A key consideration when forming junk dimensions is how many combinations exist.  If the number of combinations is too high, the junk dimension's size may become unmanageable, in which case you might want to split the indicators across more than one junk dimension.
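To make the idea concrete, here is a minimal sketch in Python of building a junk dimension and resolving its key while loading a fact row.  The indicator names (is_gift, is_rush_order, payment_type), their values, and the helper functions are all hypothetical, chosen only for illustration; a real warehouse would normally do this in SQL or an ETL tool.

```python
from itertools import product

# Hypothetical indicator fields and their allowed values (assumptions for illustration).
INDICATORS = {
    "is_gift": ("N", "Y"),
    "is_rush_order": ("N", "Y"),
    "payment_type": ("Cash", "Credit", "Debit"),
}


def build_junk_dimension(indicators):
    """Return one row per combination of indicator values, each with a surrogate key."""
    names = list(indicators)
    rows = []
    for key, combo in enumerate(product(*indicators.values()), start=1):
        rows.append({"junk_key": key, **dict(zip(names, combo))})
    return rows


def build_lookup(rows, names):
    """Map each combination of values to its surrogate key for use during fact loading."""
    return {tuple(row[n] for n in names): row["junk_key"] for row in rows}


junk_dim = build_junk_dimension(INDICATORS)
lookup = build_lookup(junk_dim, list(INDICATORS))

# A source record carries the raw indicators; the fact row stores only the key.
source_row = {
    "order_id": 1001,
    "amount": 59.90,
    "is_gift": "Y",
    "is_rush_order": "N",
    "payment_type": "Debit",
}
fact_row = {
    "order_id": source_row["order_id"],
    "amount": source_row["amount"],
    "junk_key": lookup[tuple(source_row[n] for n in INDICATORS)],
}

print(len(junk_dim))  # number of combinations: 2 * 2 * 3
print(fact_row)
```

This sketch generates every possible combination up front, so the row count is the product of the number of values per indicator (2 x 2 x 3 = 12 here).  That product is the number to watch: ten yes/no flags already yield 1,024 rows, and adding a few multi-valued indicators grows it quickly.  An alternative, under the same assumptions, is to insert only the combinations that actually appear in the source data.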

More info:

Kimball Design Tip #48: De-Clutter With Junk (Dimensions)

Design Tip #113 Creating, Using, and Maintaining Junk Dimensions

Data Warehousing: Junk Dimensions

Mystery or Junk data warehouse dimensions

Junk Dimension

Junk Dimensions with no Loading Needed

Dimensional Modeling: Junk vs Degenerate

About James Serra

James is a big data and data warehousing solution architect at Microsoft. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 25 years of IT experience.