In a perfect world a data warehouse would answer every query very quickly, no matter its size or complexity. In reality, sometimes we won’t get the performance we need from the data warehouse, and the solution is to use a SQL Server Analysis Services (SSAS) cube, which also has other benefits (see Why use a SSAS cube?). Thanks to Azure, creating a cube can be done in a matter of hours, giving you quick results for just about any query no matter how big the data warehouse gets.
If you want to create a cube quickly, you can do so by creating a VM in Azure and installing SQL Server and SSAS (or choosing a VM from the marketplace with SQL Server and SSAS already installed), provided you can accept the following restrictions:
- No high availability needed (this can be added in the future by creating another VM in an availability set, syncing the cubes, and using an Azure load balancer)
- No disaster recovery needed (this can be added in the future by creating another VM in another region and syncing the cubes). The solution for now would be to back up the cube and restore it if needed, or simply rebuild and reprocess the cube
- Fewer than a hundred concurrent connections. More connections can be handled in the future by scaling up the VM and/or creating multiple VMs, syncing the cubes, and adding a load balancer (keeping in mind VMs can be shut down off-hours to save cost). The number of concurrent connections a server can support is highly variable, depending on the type of querying and on caching. See the SQL Server Best Practices Article
- Using the SSAS tabular model, with a cube size of less than 180GB. Cube compression typically ranges from 10x to 100x; the 180GB figure assumes a conservative average of 10x and a cube using no more than 60% of the server’s memory. A larger cube can be handled with an SSAS multidimensional model, which does not require the cube to fit in memory the way tabular does. A multidimensional cube can use HOLAP storage mode to support more source data: the cube does not keep a copy of the detail records from the source system, which greatly reduces cube size and processing time. Users can still drill down to the detail records, which the cube retrieves by querying the source system
- A maintenance window is available at night to process the tabular cube (in the future you can add a “cube processing VM” that processes the cube and then syncs it to the query server, so queries never hit a cube while it is being processed and get slowed down)
- The cube only needs to be updated once a day (in the future, a processing VM with cube syncing, along with partitioning, can support more frequent updates)
- Future expansion can be via Azure Analysis Services (in preview), as well as by creating multiple cubes split by subject area and using a true scale-out architecture
- It’s a balance between spending extra developer hours to save cost and choosing more hardware up front
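As a sketch of the nightly maintenance step above: a tabular cube is reprocessed by sending the server a TMSL `refresh` command. The database name below is a placeholder, and Python is used here only to illustrate building the payload; in practice you would run the JSON as a SQL Server Agent job step (type “SQL Server Analysis Services Command”) or via the `Invoke-ASCmd` PowerShell cmdlet.

```python
import json

def tmsl_full_refresh(database):
    """Build a TMSL command that fully reprocesses a tabular database.

    "full" reloads all data and recalculates; lighter refresh types
    (e.g. "dataOnly", "calculate") also exist in TMSL.
    """
    return json.dumps({
        "refresh": {
            "type": "full",
            "objects": [{"database": database}]
        }
    }, indent=2)

# "SalesTabular" is a placeholder database name for this sketch.
print(tmsl_full_refresh("SalesTabular"))
```

If you later add a dedicated processing VM, the same nightly job would run this refresh there and then issue an XMLA `Synchronize` command from the query server to pull the freshly processed database across, so user queries are never slowed by processing.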
To choose the appropriately sized VM, see Hardware Sizing a Tabular Solution (SQL Server Analysis Services), Sizes for virtual machines in Azure, Largest VM in the Cloud, and How Much RAM Do I Need For My SSAS Tabular Server?
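As a rough illustration of the memory math behind the 180GB figure above, here is a back-of-the-envelope calculation assuming the same conservative 10x compression and a cube capped at 60% of server RAM. The specific input numbers are examples, not recommendations:

```python
def cube_size_gb(source_gb, compression_ratio=10):
    """Estimated in-memory size of the compressed tabular model,
    assuming a conservative 10x compression ratio."""
    return source_gb / compression_ratio

def required_ram_gb(cube_gb, memory_fraction=0.6):
    """RAM needed so the cube occupies no more than `memory_fraction`
    of server memory, leaving headroom for processing and queries."""
    return cube_gb / memory_fraction

# Example: 1.8 TB of source data at 10x compression is roughly a
# 180 GB in-memory cube, which needs about 300 GB of RAM at 60%.
cube = cube_size_gb(1800)
ram = required_ram_gb(cube)
print(round(cube), round(ram))
```

The same arithmetic run in reverse (VM RAM × 0.6 × compression ratio) gives the largest source data volume a given VM size can comfortably host in a tabular model.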