The need for having both a DW and cubes

I have heard some people say if you have a data warehouse, there is no need for cubes (when I say “cubes” I am referring to tabular and multidimensional OLAP models).  And I have heard others say if you have OLAP cubes, you don’t need a data warehouse.  I strongly disagree with both these statements, as almost all the customers I see that are building a modern data warehouse use both in their solutions.  Here are some reasons for both:

Why have a data warehouse if you can just use a cube?

  • Breaking down complex steps so easier to build cube
  • Cube is departmental view (cube builder not thinking enterprise solution)
  • Easier to clean/join/master data in DW
  • Processing cube is slow against sources
  • One place to control data for consistency and have one version of the truth
  • Use by tools that need relational format
  • Cube does not have all data
  • Cube may be behind in data updates (needs processing)
  • DW is place to integrate data
  • Risk of having multiple cubes doing same thing
  • DW keeps historical records
  • Easier to create data marts from DW

Reasons to report off cubes instead of the data warehouse (a summary from my prior blog post of Why use a SSAS cube?):

  • Semantic layer
  • Handle many concurrent users
  • Aggregating data for performance
  • Multidimensional analysis
  • No joins or relationships
  • Hierarchies, KPI’s
  • Row-level Security
  • Advanced time-calculations
  • Slowly Changing Dimensions (SCD)
  • Required for some reporting tools

The typical architecture I see looks like this:

About James Serra

James is a big data and data warehousing solution architect at Microsoft. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 25 years of IT experience.
This entry was posted in Azure Analysis Services, Data warehouse, SQLServerPedia Syndication. Bookmark the permalink.

10 Responses to The need for having both a DW and cubes

  1. Pingback: #BI101: The need for having both a DW and cubes – Welcome to Ray Kim's 'blog

  2. Simon says:

    Hi James, what would be the limitations of using normal Azure SQL (obviously without polybase) instead of DW?

    DW isn’t cost effective for smaller orgs.

    • James Serra says:

      Hi Simon,

      If you database is “small” (under 4TB) and you don’t need the power of a MPP solution, then go with Azure SQL Database. And with Managed Instance in public preview and near 100% SQL compatibility, it’s a great option with it’s biggest limitation is that SQL DW can be 20x-50x times faster.

      • Simon Nuss says:

        Awesome, done! Preview request is pending.

        Azure SQL Database Managed Instance looks like it on parity with SQL Server 2017 but, once again, it excludes Polybase! Why does MS keep excluding it?

        They should just replace polybase with Power Query imho. It can federate equally to Hadoop, plus do a million more things.

  3. Doug Burke says:

    Excellent commentary;

    Let the DW ‘house’ the data and OLAP interpret the data.

    Doug

  4. James, I’d like to see more exploration into the use of ROLAP, now that the backend databases are getting more sophisticated/faster. If your onsite DW can be supported using SSD drives, ROLAP may offer a faster SSAS cube build out, with only a single data update to the DW. Or online via Azure SQL data warehouse, with MPP, that may offer a great alternative. That provides both a traditional DW with a rapid SSAS build out. And I am thinking multi-dimensional, not tabular. I may be that last holdout, but multi-dimensional still seems far superior when the business needs get complex.

  5. Raghu says:

    James,
    I see that now we can build AAS TAbular Cubes on top of ADL. In this case what are the scenario you see SQL DW or SQL Azure is needed to provide a Semantic layer with AAS ?

    https://azure.microsoft.com/en-us/blog/using-azure-analysis-services-with-azure-data-lake-storage/

    • James Serra says:

      Hi Raghu,

      Those reasons where in the blog under “Why have a data warehouse if you can just use a cube?”. And I’ll add that if you are going against ADL, you will need to make sure the data is cleaned and transformed in the data lake before pushing it to a cube.

  6. James O'Toole says:

    Thanks for a great commentary. I now have a second source for some of the comments I have been espousing at work. Great graphic!! It clarifies a lot for many.
    colleagues.

  7. Pingback: Power BI: Dataflows | James Serra's Blog