Azure SQL Data Warehouse

Analytics Platform System (APS) is Microsoft’s massively parallel processing (MPP) data warehouse technology.  This has only been available as an on-prem solution (see video Overview of Microsoft Analytics Platform System).  Until now.  At the recent Microsoft Build Developer Conference, Executive Vice President Scott Guthrie announced the Azure SQL Data Warehouse (SQL DW).  This is a cloud data warehouse-as-a-service (DWaaS) that will compete with Amazon’s Redshift.

But it has some additional benefits over Redshift:

  • With Redshift you must scale your data warehouse by increasing both the compute and storage units.  With SQL DW, compute and storage is decoupled so you can scale them individually.  This is a very different economic model that can save customers a lot of money as you don’t have to purchase additional storage when you just need more compute power, or vice-versa
  • The ability to pause compute when not in use so you only pay for storage, as opposed to Redshift in which you are billed 24/7 for all the VM’s that make up the nodes in your cluster
  • With Redshift you have to pick a pre-defined size and it can take hours to days to resize.  With SQL DW you can start small and grow or shrink in seconds
  • And keep in mind with Microsoft you can have a hybrid architecture that can use an on-prem APS combined with a SQL DW, allowing you to keep sensitive data on-prem and non-sensitive data in the cloud if you wish.  With Redshift you only have the option of keeping all your data in the cloud
  • There is also a lot more compatibility with SQL DW as it supports many features that Redshift does not, such as indexes, stored procs, SQL UDFs, partitioning, and constraints

SQL DW is built with the same technology as APS, except that instead of using SQL Server 2014 it uses version 12 of Azure SQL Database.  It also includes PolyBase.  PolyBase allows APS and SQL DW to query data in a Hadoop cluster, either directly or by pushing some of the work to Hadoop itself so the query is actually run using the Hadoop clusters CPU’s.  The Hadoop data is made to look as if it were local to the data warehouse, so that end-users can use their existing skill sets to query it via SQL or any reporting tool that using SQL (like Excel, SSRS, Power BI, etc).  PolyBase can integrate with Hadoop in this manner via a Microsoft HDInsight cluster that can either be inside APS or in the cloud, or via a Hortonworks or Cloudera cluster.

dpi-apr29-2

SQL DW will work with existing data tools including Power BI for data visualization, Azure Machine Learning for advanced analytics, Azure Data Factory for data orchestration and Azure HDInsight.

The preview for Azure SQL Data Warehouse will be available later this calendar year.  You can sign up to be notified when the Azure SQL Data Warehouse preview becomes available.

I will have a lot more blogs about this new service in the coming months, so stay tuned!

More info:

Microsoft BUILDs its cloud Big Data story

Microsoft Announces Azure SQL Database elastic database, Azure SQL Data Warehouse, Azure Data Lake

Introducing Azure SQL Data Warehouse

Short introduction video SQL Data Warehouse – YouTube

Top reasons why enterprises should choose Azure SQL Data Warehouse

Video on new SQL Server 2016 features and SQL DW (minute 59 with demo): The SQL Server Evolution

Video Microsoft Azure SQL Data Warehouse Overview

Video Azure SQL Data Warehouse: Deep Dive

About James Serra

James is a big data and data warehousing solution architect at Microsoft. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 25 years of IT experience.
This entry was posted in Azure SQL DW, Data warehouse, PDW/APS, PolyBase, SQLServerPedia Syndication. Bookmark the permalink.

14 Responses to Azure SQL Data Warehouse

  1. Peter says:

    Hi James

    We are looking into using the azure DW as dev/test environment for our on-prem APS.
    Can tsql, ddl and ssis/pdw destination adapter be used directly without any rewrite?
    Thanks
    Peter

    • James Serra says:

      Hi Peter,

      I can’t say for certain yet, but there should be no changes needed in tsql and ddl. Not sure of the SSIS adaptors yet, but shoot me an email and I’ll find out for you.

  2. Linus says:

    Hi James,

    Couple of questions for you, maybe you’ve come across some information that can shed light on this 🙂

    1. I’m guessing that since Azure SQL Data Warehouse includes PolyBase, it is implied that it can simultaneously query traditional star-based warehouse data AND data residing in an Azure Data Lake, and combine it’s results and visualize it in e.g. PowerBI? That would be so sweet.

    2. Have you seen anything with regards to migrating an existing on premises MS SQL 2008 warehouse directly into Azure SQL Data Warehouse, and treat it as a known baseline for subsequent analytics?

    3. Also, what about combining traditional warehouse data, raw data in an azure lake, and ad-hoc data from third party APIs? I’d like to be able to perform ad-hoc analytics based on not only my own data in a warehouse and a lake, but also the data coming from multiple third party aggregators, but _not_ have to write custom API clients to copy the third party data into my lake first.

    Cheers,
    Linus

    • James Serra says:

      Hi Linus,

      1. PolyBase can query Azure SQL Data Warehouse, Hortonworks HDP, Cloudera CHD, and Azure HDInsight and the results combined in Power BI. It seems very likely that PolyBase can query Azure Data Lake but I have not heard for sure yet
      2. The migration from a data warehouse in MS SQL 2008 would be the same to either APS on-prem or Azure SQL Data Warehouse
      3. This sounds like a good future use case for PolyBase – being able to query data from third party APIs. Unfortunately it’s not there yet

  3. Vijay says:

    Hi James,
    Recently I learned fundamentals of Machine Learning on Microsoft Azure when I had 1 month free access.
    The question I wan to continue to use the platform for Learning but it sounds to be expensive for the subscription. Could you suggest me the best way to go about it. My aim is not make it expensive but I am ready to pay reasonable amount to use the platform.
    Thanks in advance
    Vijay

  4. Pingback: SQL Server 2016 | Garrett Edmondson

  5. Pingback: Copying data from Azure Blob Storage | James Serra's Blog

  6. Pingback: Getting data into Azure Blob Storage | James Serra's Blog

  7. Mehmet says:

    Hi James,

    How does Azure SQL DW compare to Azure SQL?

  8. Data Science says:

    Magnificent goods from you, man. I’ve have in mind your stuff previous to and
    you’re simply too great. I really like what you’ve got right here, certainly like what
    you are stating and the best way by which you assert it.
    You make it enjoyable and you still take care of to stay it smart.
    I can not wait to learn far more from you. That is really a terrific site.

  9. Pingback: Data loading into Azure SQL Data Warehouse - SQL Server - SQL Server - Toad World

  10. Pingback: Data loading into Azure SQL Data Warehouse | James Serra's Blog

  11. Sandeep says:

    Hi James,

    Can we consider Azure SQL DW as the cloud version of APS ? Recently in a MS led training , the instructor mentioned that it is NOT so but unable see any major difference between the two of them except cost and that APS has compute and storage coupled while it is de-coupled in Azure SQL DW. This would really help me to understand and then position the 2 accordingly.

    Thanks