Analytics Platform System (APS) is Microsoft’s massively parallel processing (MPP) data warehouse technology. This has only been available as an on-prem solution (see video Overview of Microsoft Analytics Platform System). Until now. At the recent Microsoft Build Developer Conference, Executive Vice President Scott Guthrie announced the Azure SQL Data Warehouse (SQL DW). This is a cloud data warehouse-as-a-service (DWaaS) that will compete with Amazon’s Redshift.
But it has some additional benefits over Redshift:
- With Redshift you must scale your data warehouse by increasing both the compute and storage units. With SQL DW, compute and storage is decoupled so you can scale them individually. This is a very different economic model that can save customers a lot of money as you don’t have to purchase additional storage when you just need more compute power, or vice-versa
- The ability to pause compute when not in use so you only pay for storage, as opposed to Redshift in which you are billed 24/7 for all the VM’s that make up the nodes in your cluster
- With Redshift you have to pick a pre-defined size and it can take hours to days to resize. With SQL DW you can start small and grow or shrink in seconds
- And keep in mind with Microsoft you can have a hybrid architecture that can use an on-prem APS combined with a SQL DW, allowing you to keep sensitive data on-prem and non-sensitive data in the cloud if you wish. With Redshift you only have the option of keeping all your data in the cloud
- There is also a lot more compatibility with SQL DW as it supports many features that Redshift does not, such as indexes, stored procs, SQL UDFs, partitioning, and constraints
SQL DW is built with the same technology as APS, except that instead of using SQL Server 2014 it uses version 12 of Azure SQL Database. It also includes PolyBase. PolyBase allows APS and SQL DW to query data in a Hadoop cluster, either directly or by pushing some of the work to Hadoop itself so the query is actually run using the Hadoop clusters CPU’s. The Hadoop data is made to look as if it were local to the data warehouse, so that end-users can use their existing skill sets to query it via SQL or any reporting tool that using SQL (like Excel, SSRS, Power BI, etc). PolyBase can integrate with Hadoop in this manner via a Microsoft HDInsight cluster that can either be inside APS or in the cloud, or via a Hortonworks or Cloudera cluster.
SQL DW will work with existing data tools including Power BI for data visualization, Azure Machine Learning for advanced analytics, Azure Data Factory for data orchestration and Azure HDInsight.
The preview for Azure SQL Data Warehouse will be available later this calendar year. You can sign up to be notified when the Azure SQL Data Warehouse preview becomes available.
I will have a lot more blogs about this new service in the coming months, so stay tuned!
Short introduction video SQL Data Warehouse – YouTube
Video on new SQL Server 2016 features and SQL DW (minute 59 with demo): The SQL Server Evolution