Azure DevTest Labs

I have been working at Microsoft now for 3 years and 4 months (side note: it’s by far the best company I have ever worked for).  You would think by now I would know about every Azure product, but we release new products and services at such a tremendously fast pace that almost weekly I discover something I did not know about.  Today was one of those days, as I discovered Azure DevTest Labs, which was made generally available in May 2016 (it entered public preview in November 2015).

Here is the overview:

Developers and testers are turning to the cloud to solve the delays in creating and managing their environments.  Azure removes those delays and allows self-service within a new, cost-efficient structure.  However, developers and testers still need to spend considerable time configuring their self-served environments.  Also, decision makers are uncertain about how to leverage the cloud to maximize their cost savings without adding too much process overhead.

Azure DevTest Labs is a service that helps developers and testers quickly create environments in Azure while minimizing waste and controlling cost.  You can test the latest version of your application by quickly provisioning Windows and Linux environments using reusable templates and artifacts.  Easily integrate your deployment pipeline with DevTest Labs to provision on-demand environments.  Scale up your load testing by provisioning multiple test agents, and create pre-provisioned environments for training and demos.

Azure DevTest Labs addresses the problems in today’s Dev/Test environments in four main ways:

  • Quickly be “ready to test” – DevTest Labs enables you to create pre-provisioned environments with everything your team needs to start developing and testing applications.  Simply claim the environments where the last good build of your application is installed and get working right away.  Or, use containers for even faster and leaner environment creation
  • Worry-free self-service – DevTest Labs makes it easier to control costs by allowing you to set policies on your lab – such as number of virtual machines (VM) per user and number of VMs per lab.  DevTest Labs also enables you to create policies to automatically shut down and start VMs
  • Create once, use everywhere – Capture and share environment templates and artifacts within your team or organization – all in source control – to create developer and test environments easily
  • Integrates with your existing toolchain – Leverage pre-made plug-ins or our API to provision Dev/Test environments directly from your preferred continuous integration (CI) tool, integrated development environment (IDE), or automated release pipeline. You can also use our comprehensive command-line tool
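To make the policy idea concrete, here is a minimal Python sketch that builds ARM-style JSON fragments for a per-user VM cap, a per-lab VM cap, and a nightly auto-shutdown schedule.  The resource types, fact names, and property names here follow my reading of the Microsoft.DevTestLab schema, so treat them as an approximation to verify against the official quickstart templates:

```python
import json

def lab_policy(name, fact_name, threshold):
    """Build a DevTest Labs policy resource fragment (schema is an approximation)."""
    return {
        "type": "Microsoft.DevTestLab/labs/policysets/policies",
        "name": name,
        "properties": {
            "factName": fact_name,
            "threshold": str(threshold),
            "evaluatorType": "MaxValuePolicy",
            "status": "Enabled",
        },
    }

def shutdown_schedule(time_hhmm, time_zone="Pacific Standard Time"):
    """Build an auto-shutdown schedule fragment that applies to every lab VM."""
    return {
        "type": "Microsoft.DevTestLab/labs/schedules",
        "name": "LabVmsShutdown",
        "properties": {
            "taskType": "LabVmsShutdownTask",
            "dailyRecurrence": {"time": time_hhmm},
            "timeZoneId": time_zone,
            "status": "Enabled",
        },
    }

policies = [
    lab_policy("MaxVmsAllowedPerUser", "UserOwnedLabVmCount", 2),
    lab_policy("MaxVmsAllowedPerLab", "LabVmCount", 10),
    shutdown_schedule("1900"),   # shut down all lab VMs at 7 PM
]
print(json.dumps(policies, indent=2))
```

The same settings can of course be made in the portal under the lab’s Configuration and policies blade; the point is that the whole lab definition can live in source control.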

Jeff Gilbert’s TechNet blog and Praveen Kumar Sreeram’s blog both have some great posts on Azure DevTest Labs, and there are some excellent short videos by Microsoft to help you get started.

You’ll need a subscription that provides you monthly Azure credits to use DevTest Labs.  Besides the pay-as-you-go option, there are free options and subscription options:

* MSDN Platforms is available exclusively through Microsoft Volume Licensing. For pricing and purchase details, contact your Microsoft account representative, Microsoft Partner, or an authorized volume licensing reseller.

More info:

Getting to know Azure DevTest Labs

How to Use Azure DevTest Labs for Test Environments and Dev Machines

Azure DevTest Labs

More about Azure DevTest Labs

Posted in Azure, SQLServerPedia Syndication | 2 Comments

Data Science Virtual Machine

The Data Science Virtual Machine (DSVM) is a customized VM image on Microsoft’s Azure cloud built specifically for doing data science.  It has many popular data science and other tools pre-installed and pre-configured to jump-start building intelligent applications for advanced analytics.  So instead of you having to create a VM and download and install all these tools which can take many hours, within a matter of minutes you can be up and running.

The DSVM is designed and configured for working with a broad range of usage scenarios.  You can scale your environment up or down as your project needs change.  You are able to use your preferred language to program data science tasks.  You can install other tools and customize the system for your exact needs.

It is available on Windows Server 2012 (create), Windows Server 2016 (create) and on Linux – either Ubuntu 16.04 LTS (create) or on OpenLogic 7.2 CentOS-based Linux distributions (create).
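If you’d rather script the provisioning than click through the portal, a small Python helper like the one below can compose the Azure CLI call.  The image URN is purely illustrative (marketplace offer/sku names change over time), so look up the current one with `az vm image list --publisher microsoft-dsvm --all` before using it:

```python
def dsvm_create_cmd(resource_group, vm_name, image_urn, vm_size="Standard_DS3_v2"):
    """Compose an `az vm create` command for a marketplace image.

    The URN passed in is publisher:offer:sku:version -- the value used below
    is a hypothetical example, not a verified DSVM identifier.
    """
    return (
        f"az vm create --resource-group {resource_group} --name {vm_name} "
        f"--image {image_urn} --size {vm_size}"
    )

# Hypothetical URN for the Ubuntu DSVM; verify with `az vm image list`.
cmd = dsvm_create_cmd(
    "my-rg", "my-dsvm",
    "microsoft-dsvm:linux-data-science-vm-ubuntu:linuxdsvmubuntu:latest",
)
print(cmd)
```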

The key scenarios for using the Data Science VM:

  • Preconfigured analytics desktop in the cloud
  • Data science training and education
  • On-demand elastic capacity for large-scale projects
  • Short-term experimentation and evaluation
  • Deep learning

The DSVM has many popular data science and deep learning tools already installed and configured.  It also includes tools that make it easy to work with various Azure data and analytics products.  You can explore and build predictive models on large-scale data sets using the Microsoft R Server or using SQL Server 2016 (note that R Server and SQL Server on the DSVM are not licensed for use on production data).  A host of other tools from the open source community and from Microsoft are also included, as well as sample code and notebooks.  See the full list here and see the latest new and upgraded tools here.

Finally, for Windows users check out Ten things you can do on the Data science Virtual Machine and for Linux users check out Data science on the Linux Data Science Virtual Machine.  For more information on how to run specific tools for Windows see Provision the Microsoft Data Science Virtual Machine and for Linux see Provision the Linux Data Science Virtual Machine.

More info:

Data Science Virtual Machine – A Walkthrough of end-to-end Analytics Scenarios (video)

Introduction to the cloud-based Data Science Virtual Machine for Linux and Windows

Introducing the new Data Science Virtual Machine on Windows Server 2016

Posted in SQLServerPedia Syndication | 3 Comments

Data Warehouse Fast Track Reference Guide for SQL Server 2016

I had previously blogged about the Data Warehouse Fast Track for SQL Server 2016, a joint effort between Microsoft and its hardware partners to deliver validated, pre-configured solutions that reduce the complexity of implementing a data warehouse on SQL Server Enterprise Edition.

Now available are two new excellent white papers to give you a better understanding of Fast Track for SQL Server 2016:

Introducing Microsoft Data Warehouse Fast Track for SQL Server 2016

Data Warehouse Fast Track Reference Guide for SQL Server 2016

Posted in Data warehouse, Fast Track, SQLServerPedia Syndication | 2 Comments

SSAS high availability

If you are looking at providing high availability (HA) for SSAS, here are 3 options:

  1. Install SSAS on a Windows Server Failover Cluster (WSFC)
    Here’s a good article.  The main issue with this option is that SSAS isn’t cluster-aware, so if Windows is fine but the SSAS service itself is hung, it won’t fail over.  Also check out How to Cluster SQL Server Analysis Services
  2. Network Load Balancing (NLB) across a SSAS Scale-Out Query Cluster
    Basically, this is load balancing queries across N+1 servers that each host a separate copy of the tabular or multidimensional model.  If a query server goes down, the others are still available to resolve the query.  This provides both scalability and availability.  Unfortunately, it is not completely transparent, as you have to manage:
    – Configuration of the load balancer
    – Deployment of updates.  For the Analysis Services databases you can detach, file-copy to the other servers, and reattach; do an Analysis Services database backup/restore; or process data on a “process” server and use database synchronization to update the read-only instances behind the load balancer.  If you need 24×7 availability, you must only take a node offline once another node is already synchronized; otherwise you accept having different versions of the same database available at the same moment
  3. Azure Analysis Services
    This new cloud service has options for high availability.  It makes it super easy to (programmatically) spin up another server and restore a backup.  Just keep in mind the new server does not have the same address, so you have to manage the client connection; this is not transparent.  Also note this service has a 99.9% SLA.
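Option 2 can be modeled in a few lines of Python: queries round-robin across identical read-only copies of the model, and a node that is down or being synchronized is simply skipped.  This is only a toy dispatcher (a real NLB does this at the network layer), but it shows the behavior you’re buying:

```python
import itertools

class QueryFarm:
    """Toy round-robin dispatcher over SSAS read-only query nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.offline = set()
        self._rr = itertools.cycle(self.nodes)

    def take_offline(self, node):
        """Mark a node unavailable, e.g. while synchronizing a new DB version."""
        self.offline.add(node)

    def bring_online(self, node):
        self.offline.discard(node)

    def route(self):
        """Return the next healthy node, or raise if none are available."""
        for _ in range(len(self.nodes)):
            node = next(self._rr)
            if node not in self.offline:
                return node
        raise RuntimeError("no SSAS query nodes available")

farm = QueryFarm(["ssas1", "ssas2", "ssas3"])
farm.take_offline("ssas2")          # node 2 is being synchronized
routed = [farm.route() for _ in range(4)]
print(routed)                        # ['ssas1', 'ssas3', 'ssas1', 'ssas3']
```

Note how queries keep flowing while ssas2 is out, which is exactly the update strategy described above: synchronize one node at a time while the rest serve.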

On a side note, SSAS can use a SQL Server database in an Always On availability group as a data source: Analysis Services with Always On Availability Groups.

Posted in SQLServerPedia Syndication, SSAS | 5 Comments

SQL Data Warehouse reference architectures

With so many product options to choose from for building a big data solution in the cloud, such as SQL Data Warehouse (SQL DW), Azure Analysis Services (AAS), SQL Database (SQL DB), and Azure Data Lake (ADL), there are various combinations of using the products, each with pros/cons along with differences in cost.  With many customers looking at using SQL DW, I wanted to mention various reference architectures that I have seen, ordered by most cost to lowest cost:

  1. Do staging, data refinement and reporting all from SQL DW.  You can scale compute power up when needed (e.g. during staging, data refinement, or when a large number of users are reporting) or down to save costs (e.g. nights and weekends when user reporting is low).  The pros of this option are that by reducing the number of technologies you are building a simpler solution and reducing the number of copies of the data.  The cons are that since everything is done on SQL DW you can have performance issues (e.g. doing data refinement while users are reporting), can hit the SQL DW concurrent query limit, and can have a higher cost since SQL DW is the highest-cost product, especially if you are not able to pause it.  Pausing reduces your compute cost to zero, leaving only storage charges (see Azure SQL Data Warehouse pricing), but no one can use SQL DW while it is paused
  2. Do staging and data refinement in a Hadoop data lake, and then copy all or part of the data to SQL DW to do reporting.  This saves costs in SQL DW by offloading data refinement, and gives you the benefit of using a data lake (see What is a data lake? and Why use a data lake?).  You save costs by not having to scale up SQL DW to do the data refinement (scaling up would minimize affecting reporting performance and refine data quicker) and by not having to store as much data in SQL DW.  You also save costs by archiving data in the data lake and using PolyBase to access it (be aware these queries could be slow as PolyBase does not support query pushdown in SQL DW).  A con of this architecture is having an extra copy of the data along with the extra ETL needed
  3. Do staging and data refinement in SQL DW, and copy some or all data to one or more data marts (in SQL DB or SQL Server in a VM) and/or one or more cubes (in AAS or SSAS in a VM) for reporting, which is considered a “Hub-and-Spoke” model.  Scale down SQL DW after data refinement and use it for a limited number of big queries.  This overcomes the SQL DW concurrent query limit by having users query the data mart/cube and saves costs by querying less expensive options.  You also get the benefits that come with a cube, such as a semantic layer and row-level security that are not available in SQL DW (see Why use a SSAS cube?).  This architecture can also be combined with the previous architecture to add in a data lake.  A con of this architecture is having extra copies of the data along with the extra ETL needed
  4. Do staging and data refinement in SQL DW, and copy all data to a data mart (SQL DB or SQL Server in a VM) and/or a cube (AAS or SSAS in a VM) for reporting.  Pause SQL DW after the staging and data refinement is done.  This is used when giving users access to SQL DW will impact ELT and/or user queries wouldn’t be as responsive as needed, or when cost is a top priority (you only pay for storage costs when SQL DW is paused).  A con of this architecture is having extra copies of the data along with the extra ETL needed, and not having SQL DW available for big queries
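A quick back-of-the-envelope sketch of why pausing matters so much in architectures 1 and 4: SQL DW bills compute per DWU-hour while running and nothing while paused.  The rate below is a placeholder I picked for illustration; check the Azure SQL Data Warehouse pricing page for real numbers:

```python
def monthly_compute_cost(dwu, hours_running, rate_per_100_dwu_hour=1.21):
    """Estimate SQL DW compute cost: billed per DWU-hour while running, zero when paused.

    rate_per_100_dwu_hour is an illustrative placeholder, not an official price.
    """
    return (dwu / 100) * rate_per_100_dwu_hour * hours_running

hours_in_month = 730
always_on = monthly_compute_cost(1000, hours_in_month)
# Architecture 4: run 6 hours/day for staging + refinement, paused otherwise.
paused_nights = monthly_compute_cost(1000, 6 * 30)
print(f"always on: ${always_on:,.2f}  paused schedule: ${paused_nights:,.2f}")
```

Storage is billed either way, so the delta above is purely the compute you avoid by pausing.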

More info:

Using SQL Server to Build a Hub-and-Spoke Enterprise Data Warehouse Architecture

Hub-And-Spoke: Building an EDW with SQL Server and Strategies of Implementation

Posted in Azure SQL DW, Data Lake, Data warehouse, SQLServerPedia Syndication | 6 Comments

Microsoft Build event announcements

Another Microsoft event and another bunch of exciting announcements.  At the Microsoft Build event this week, the major announcements in the data platform space were:

Azure Cosmos DB

Azure Cosmos DB is the next big leap in the evolution of DocumentDB.  Cosmos DB is Microsoft’s globally distributed, horizontally scalable, multi-model database service.  Its mission is to enable you to easily write highly scalable, globally distributed apps.  With its turnkey support for global distribution, Azure Cosmos DB seamlessly makes your data available close to where your users are, anywhere around the world; it offers guaranteed low latency, well-defined consistency and high availability around the globe.  It allows you to elastically scale throughput and storage anywhere in the world, based on your needs, and offers a multitude of well-defined consistency models, data models and APIs – so you can select the right ones for your app.

To clear things up, it’s not a “new” product, but rather a renaming of DocumentDB with some additional new features.  Microsoft has transitioned all existing DocumentDB customers and their data to Azure Cosmos DB for no additional charge.  It now natively supports four data models: key-value (new), documents, graphs (new), and columnar.  It also supports many APIs for accessing data, including MongoDB and DocumentDB SQL for document model support, Gremlin (preview) for graph model support, and Azure Tables (preview) for key-value model support.  Since it now supports more than just the document model, it would not have made sense to keep the name DocumentDB, hence the new name.

Microsoft also announced a new consistency level, Consistent Prefix, which guarantees that replicas only move forward in time, as opposed to converging forward in time.  This brings the total to five consistency levels developers can use, helping them navigate the tradeoffs of the CAP theorem with more than a binary choice.  Also introduced are some major improvements to the query engine, which manifest as a 50-400% reduction in Request Units (RU) per query.
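The Consistent Prefix guarantee is easy to model: a lagging replica may be behind, but what a reader sees is always a prefix of the global write order, never a reordering.  Here is a toy Python simulation of that property (my own illustration, not the Cosmos DB implementation):

```python
import random

writes = ["w1", "w2", "w3", "w4", "w5"]   # global commit order at the write region

def consistent_prefix_read(rng):
    """A replica may lag, but it always returns a prefix of the write order."""
    k = rng.randint(0, len(writes))
    return writes[:k]

def eventual_read(rng):
    """An eventually consistent replica may surface writes out of order."""
    k = rng.randint(0, len(writes))
    return rng.sample(writes, k)

rng = random.Random(42)
for _ in range(100):
    seen = consistent_prefix_read(rng)
    # The guarantee: you never observe w3 without already having w1 and w2.
    assert seen == writes[:len(seen)]
print("consistent prefix held across 100 simulated reads")
```

Under plain eventual consistency (`eventual_read`) no such prefix property holds, which is exactly the gap this new level closes.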

All Azure Table storage accounts will be automatically upgraded to Azure Cosmos DB accounts too, and gain these great new capabilities – including global distribution, automatic indexing, dedicated throughput, and low latency.

For more info, see Azure Cosmos DB: The industry’s first globally-distributed, multi-model database service, A technical overview of Azure Cosmos DB, and Welcome to Azure Cosmos DB.

Azure Database Migration Service (DMS)

Microsoft announced (See Azure Database Migration Service announcement at //build) a limited preview of the Azure Database Migration Service which will streamline the process for migrating on-premises databases to Azure.

Using this new database migration service simplifies the migration of existing on-premises SQL Server, Oracle, and MySQL databases to Azure, whether your target database is Azure SQL Database, Azure SQL Database Managed Instance or Microsoft SQL Server in an Azure virtual machine.

The automated workflow, with assessment reporting, guides you through the necessary changes prior to performing the migration.  When you are ready, the service will migrate the source database to Azure.  For an opportunity to participate in the limited preview of this service, sign up.

Think of this as similar to the SQL Server Migration Assistant (SSMA), except this is an Azure PaaS offering, so there are no VMs to create or software to install.

Azure Database for MySQL and PostgreSQL

Microsoft announced (See Microsoft extends Azure managed database services with introduction of MySQL and PostgreSQL) the preview of managed database services with Azure Database for MySQL and Azure Database for PostgreSQL.

These services are built on the intelligent, trusted and flexible Azure relational database platform. This platform extends similar managed services benefits, global Azure region reach, and innovations that currently power Azure SQL Database and Azure SQL Data Warehouse services to the MySQL and PostgreSQL database engines. Starting at preview, customers can use the service to build and deploy their applications using MySQL version 5.6/5.7 and PostgreSQL version 9.5/9.6 in 11 regions across US, Europe, Asia and Japan.

To get started, see Azure Database for MySQL and Azure Database for PostgreSQL.

More info:

Microsoft’s New Azure Database Offerings Challenge (and Maybe Surpass) AWS Cloud

Inside Microsoft’s Cosmos DB

Posted in SQLServerPedia Syndication | 2 Comments

Power BI Premium, Report Server, Apps and API

Announced today are some really cool new Power BI features:

Power BI Premium

Previously available were two tiers, Power BI Free and Power BI Pro ($10/user/month).  The problem with Power BI Pro is that for large organizations, this can add up.  In addition, their performance needs might not be met.  Power BI Premium, which is an add-on to Power BI Pro, addresses the concern about cost and scale.

For costs, it allows an unlimited number of users since it is priced by aggregate capacity (see Power BI Premium calculator).  Users who need to create content in Power BI will still require a $10/month Power BI Pro seat, but there is no per-seat charge for consumption.
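To see how the aggregate-capacity pricing plays out, here is a rough Python sketch.  The $4,995/month figure for a P1 node is my assumption based on early announcements; use the official Power BI Premium calculator for real quotes:

```python
PRO_PER_USER = 10.0      # $/user/month for a Power BI Pro seat
P1_CAPACITY = 4995.0     # assumed monthly price of one Premium P1 node (verify!)

def monthly_cost_pro_only(total_users):
    """Everyone on a Pro seat, no Premium capacity."""
    return total_users * PRO_PER_USER

def monthly_cost_premium(authors, capacity_nodes=1):
    """Authors still need Pro seats; consumers ride on the capacity at no per-seat cost."""
    return authors * PRO_PER_USER + capacity_nodes * P1_CAPACITY

# Past roughly P1_CAPACITY / PRO_PER_USER consume-only users, Premium wins.
breakeven_viewers = P1_CAPACITY / PRO_PER_USER
print(f"Premium pays for itself beyond ~{breakeven_viewers:.0f} consume-only users")
```

This is why Premium targets large organizations: the fixed node price only beats per-seat Pro once the consume-only audience is in the hundreds.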

For scale, it runs on dedicated hardware giving capacity exclusively allocated to an organization for increased performance (no noisy neighbors).  Organizations can choose to apply their dedicated capacity broadly, or allocate it to assigned workspaces based on the number of users, workload needs or other factors—and scale up or down as requirements change.

There will be changes to the Power BI’s free tier.  Users of the free tier will now be able to connect to all of the data sources that Pro users can connect to, including those available through the on-premises data gateway, and their storage quota will increase from 1GB to 10GB.  The data refresh maximum increases from once daily to 8 per day (hourly-based schedule), and streaming data rates increase from ten thousand rows per hour to one million rows per hour.

For Power BI Premium, you get 100TB of storage and a data refresh maximum of 48 per day (minute-based schedule that can be refreshed every 30 minutes).
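The refresh limits translate into daily schedules like this (a small Python sketch just to visualize the difference):

```python
from datetime import datetime, timedelta

def refresh_slots(interval_minutes):
    """List the refresh times one day allows at a fixed interval."""
    start = datetime(2017, 6, 1)
    slots, t = [], start
    while t < start + timedelta(days=1):
        slots.append(t.strftime("%H:%M"))
        t += timedelta(minutes=interval_minutes)
    return slots

pro_daily = refresh_slots(24 * 60 // 8)   # free/Pro tier: 8 refreshes/day (every 3 hours)
premium_daily = refresh_slots(30)         # Premium: every 30 minutes, 48/day
print(len(pro_daily), len(premium_daily))  # 8 48
```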

Upcoming features for Power BI Premium include the ability to incrementally refresh the data so that only the newest data from the last day (or hour) is loaded into Power BI, pinning datasets to memory, dedicated data refresh nodes, read-only replicas, and geographic distribution (see Microsoft Power BI Premium Whitepaper for more info).  Also, the dataset size cached limit will eventually be removed (it is 1GB in Power BI Pro), so you will be able to build models as large as the Power BI Premium dedicated capacity memory can hold (currently 50GB).

Users of the free tier will no longer be able to share their reports and dashboards with other users.  Peer-to-peer dashboard sharing, group workspaces (now called app workspaces), export to PowerPoint, export to CSV/Excel, and analyze in Excel with Power BI apps are capabilities limited to Power BI Pro.  The rationale for this is that if the scope of a user’s needs is limited to personal use, then no fees should apply, but if the user wishes to share or collaborate with others, those are capabilities that need to be paid for.  For existing users of the free service who have been active within the past year, Microsoft is offering a free, 12-month extended trial of Power BI Pro (more info).

Changes to the Power BI free tier become effective on June 1st.  Power BI Premium will be available sometime in the 2nd quarter of calendar 2017, meaning by the end of June.

Power BI Report Server

This is an on-premises deployment option that is part of Power BI Premium.  You use Power BI Desktop to author reports that you can then deploy to Power BI Report Server, keeping everything on-prem.  Power BI Report Server is actually a superset of SQL Server Reporting Services (SSRS) and includes all Reporting Services capabilities, including operational (RDL) reports.  It delivers the capabilities made available in January 2017 as the Technical Preview for Power BI Reports in SSRS.  If you are a per-core licensee of SQL Server Enterprise Edition and have Software Assurance, you will be able to host Power BI reports on-premises without a Power BI Premium subscription.  So organizations can choose Power BI in the cloud, or elect to keep reports on-premises with Power BI Report Server and move to the cloud at their own pace.

Power BI Apps

Microsoft is evolving content packs into Power BI apps to improve how users discover and explore insights at enterprise scale.  Available today in preview as part of Power BI Premium, Power BI apps offer a simplified way of deploying and managing a collection of purpose-built dashboards and reports to specific people, groups or an entire organization.  Business users can easily install these apps and navigate them with ease, centralizing content in one place and updating automatically.  This differs from content packs: once installed, content packs lose their grouped identity, and end users just see a list of dashboards and reports.  Apps, on the other hand, maintain their grouping and identity even after installation.  This makes it very easy for end users to navigate content over time.  For more info, check out Distribute to large audiences with Power BI apps.

Power BI API

Going away will be Power BI Embedded, which has its own API, somewhat distinct from the Power BI service.  Replacing it is a Power BI API that converges Power BI Embedded with the Power BI service to deliver one API surface.  Existing apps built on Power BI Embedded will continue to be supported.  See How to migrate Power BI Embedded workspace collection content to Power BI.

Conclusion

These changes converge on three scenarios: personal (with Power BI Desktop and Power BI Free), departmental (with Power BI Pro) and enterprise (with Power BI Premium).  And the beauty is that the same two products, Power BI Desktop and the Power BI Service, are used for all three scenarios.

More info:

Microsoft’s Power BI Premium delivers enterprise-grade features and bulk discounts

Introducing Power BI Report Server for on-premises Power BI report publishing

May 3 announcement FAQ

What Does Power BI Premium Mean for You?

On-Premise Power BI VOL. 2

To PowerBI Premium, Or Not To PowerBI Premium…

Power BI Licences Changes–The Good, The Bad and The Why

Power BI Premium. Is It For You or Not?

A closer look at Power BI Report Server

Posted in Power BI, SQLServerPedia Syndication | 15 Comments

Microsoft Data Amp event announcements

Yesterday was the Microsoft Data Amp event where a bunch of very exciting announcements were made:

  • SQL Server vNext CTP 2.0 is now available and the product will be officially called SQL Server 2017
  • SQL Server R Services in SQL Server 2017 is renamed to Machine Learning Services since both R and Python will be supported.  More info
  • Three new features for Cognitive Services are now Generally Available (GA): Face API, Content Moderator, Computer Vision API.  More info
  • Microsoft R Server 9.1 released: Real time scoring and performance enhancements, Microsoft ML libraries for Linux, Hadoop/Spark and Teradata.  More info
  • Azure Analysis Services is now Generally Available (GA).  More info
  • Microsoft has incorporated the technology that sits behind the Cognitive Services inside U-SQL directly as functions.  U-SQL is part of Azure Data Lake Analytics (ADLA)
  • More Cortana Intelligence solution templates: Demand forecasting, Personalized offers, Quality assurance.  More info
  • A new database migration service will help you migrate existing on-premises SQL Server, Oracle, and MySQL databases to Azure SQL Database or SQL Server on Azure virtual machines.  Sign up for limited preview
  • A new Azure SQL Database offering, currently being called Azure SQL Managed Instance (final name to be determined):
    • Migrate SQL Server to SQL as a Service with no changes
    • Support SQL Agent, 3-part names, DBMail, CDC, Service Broker
    • Cross-database + cross-instance querying
    • Extensibility: CLR + R Services
    • SQL profiler, additional DMVs support, Xevents
    • Native back-up restore, log shipping, transaction replication
    • More info
    • Sign up for limited preview

More info:

Delivering AI with data: the next generation of Microsoft’s data platform

Posted in Azure, Azure SQL Database, Cortana Intelligence Suite, SQLServerPedia Syndication | 1 Comment

Using Azure for free

There are a number of options for using Azure for free.  This is very useful for those of you who are not familiar with Azure and want to “play” with it:

Posted in Azure, SQLServerPedia Syndication | 1 Comment

Artificial Intelligence defined

The new buzzword in the industry is “Artificial Intelligence” (AI).  But exactly what is AI and how does it compare to “machine learning” and “deep learning”?  The best definitions I have seen come from the excellent article “Why Deep Learning is Suddenly Changing Your Life”:

AI Terms

So, AI is really an umbrella term, and underneath it are categories like machine learning and deep learning.  In the Microsoft world, products that fit into the machine learning category include Azure ML and Microsoft R Server.  In the deep learning category are products like speech recognition, which is part of Microsoft Cognitive Services along with other capabilities like face recognition and language translation.  Also available is the Microsoft Cognitive Toolkit (aka CNTK), a free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain.

Be aware this is just one of the industry taxonomies, as others would group machine learning under advanced analytics, and deep learning under AI, with the idea that technically, deep learning is simply an improved convolutional neural network method as opposed to something beyond machine learning (as the hype train is driving).

More info:

What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

Posted in SQLServerPedia Syndication | 2 Comments