Microsoft Build event announcements

Another Microsoft event and another bunch of exciting announcements.  At the Microsoft Build event this week, the major announcements in the data platform space were:

Multi-master at global scale with Azure Cosmos DB.  Perform writes on containers of data (for example, collections, graphs, tables) distributed anywhere in the world. You can update data in any region that is associated with your database account, and these updates propagate asynchronously. In addition to providing fast access and low write latency to your data, multi-master also provides a practical solution for failover and load balancing.  More info
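When the same item is updated concurrently in two regions, Cosmos DB has to resolve the conflict as the replicas converge; its default policy is last-writer-wins on a numeric property (the `_ts` timestamp unless you configure another).  Here is a minimal sketch of that idea in plain Python (not the Cosmos DB SDK; the documents and timestamps are made up):

```python
# Illustrative sketch of last-writer-wins (LWW) conflict resolution,
# Cosmos DB's default policy for multi-master writes.  Each region's
# concurrent update carries a numeric version (Cosmos uses the _ts
# timestamp by default); the highest value wins when replicas converge.

def resolve_lww(conflicting_writes, conflict_path="_ts"):
    """Pick the winning write among concurrent regional updates."""
    return max(conflicting_writes, key=lambda doc: doc[conflict_path])

# Two regions update the same item concurrently:
west_us = {"id": "item1", "value": "A", "_ts": 1527000100}
east_us = {"id": "item1", "value": "B", "_ts": 1527000105}

winner = resolve_lww([west_us, east_us])  # the later write (east_us) wins
```

Cosmos DB also lets you resolve conflicts with a custom stored procedure instead of LWW, for cases where "latest timestamp" isn't the right business rule.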

Azure Cosmos DB provisioned throughput at the database level in preview.  Azure Cosmos DB customers with multiple collections can now provision throughput at the database level and share that throughput across the database, making databases with many collections cheaper to start and operate.  More info
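The cost difference is easiest to see with some back-of-the-envelope math (a plain-Python sketch; the RU/s figures are hypothetical, not Azure pricing):

```python
# Illustrative arithmetic only (numbers are hypothetical): with
# per-collection throughput, each collection reserves its own RU/s even
# when idle; with database-level throughput, one pool of RU/s is shared
# by all collections in the database.

num_collections = 10
rus_per_collection = 400   # assumed per-collection reservation for the sketch

dedicated_total = num_collections * rus_per_collection  # RU/s reserved per-collection

shared_pool = 1000         # one shared pool sized for the actual aggregate load
savings_ratio = 1 - shared_pool / dedicated_total
```

With these made-up numbers, sharing a 1,000 RU/s pool instead of reserving 400 RU/s per collection cuts the provisioned throughput by 75 percent.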

Virtual network service endpoint for Azure Cosmos DB.  Generally available today, virtual network (VNET) service endpoints help to ensure access to Azure Cosmos DB only from your preferred virtual network subnet.  The feature removes the need to manually maintain IP firewall rules and provides an easier way to manage access to the Azure Cosmos DB endpoint.  More info

Azure Cognitive Search now in preview.  Cognitive Search, a new preview feature in the existing Azure Search service, includes an enrichment pipeline allowing customers to extract rich structured information from documents.  That information can then become part of the Azure Search index.  Cognitive Search also integrates natural language processing capabilities and includes built-in enrichers called cognitive skills.  Built-in skills help to perform a variety of enrichment tasks, such as the extraction of entities from text or image analysis and OCR capabilities.  Cognitive Search is also extensible and can connect to your own custom-built skills.  More info
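Conceptually, an enrichment pipeline is just a chain of skills, where each skill adds structure that the next skill (and ultimately the search index) can use.  A toy sketch in plain Python (not the Azure Search API; the skill logic here is a stand-in):

```python
# Conceptual sketch of an enrichment pipeline: each "skill" takes a
# document, adds enriched fields, and passes it on; the final result
# is what would land in the search index.

def ocr_skill(doc):
    # Stand-in for the built-in OCR skill: fall back to pretend image text
    # when the document has no text of its own.
    doc["text"] = doc.get("text", "") or "text extracted from image"
    return doc

def entity_skill(doc):
    # Stand-in for entity extraction: naively treat capitalized words
    # as entities.
    doc["entities"] = [w for w in doc["text"].split() if w.istitle()]
    return doc

def run_pipeline(doc, skills):
    for skill in skills:
        doc = skill(doc)
    return doc

enriched = run_pipeline({"text": "Contoso opened an office in Paris"},
                        [ocr_skill, entity_skill])
# enriched["entities"] -> ["Contoso", "Paris"]
```

In the real service the skills are declared in a skillset definition attached to an indexer, and custom skills plug into the same chain as the built-in ones.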

Azure SQL Database and Data Warehouse TDE with customer managed keys.  Now generally available, Azure SQL Database and Data Warehouse Transparent Data Encryption (TDE) offers Bring Your Own Key (BYOK) support with Azure Key Vault integration.  Azure Key Vault provides highly available and scalable secure storage for RSA cryptographic keys backed by FIPS 140-2 Level 2 validated Hardware Security Modules (HSMs).  Key Vault streamlines the key management process, enables customers to maintain full control of encryption keys, and allows them to manage and audit key access.  This is one of the most frequently requested features by enterprise customers looking to protect sensitive data and meet regulatory or security compliance obligations.  More info

Azure Database Migration Service is now generally available.  This is a service that was designed to be a seamless, end-to-end solution for moving on-premises SQL Server, Oracle, and other relational databases to the cloud. The service will support migrations of homogeneous/heterogeneous source-target pairs, and the guided migration process will be easy to understand and implement.  More info

Four new features now available in Azure Stream Analytics.  In public preview: session windows.  In private preview: C# custom code support for Stream Analytics jobs on IoT Edge, blob output partitioning by custom attribute, and updated built-in machine learning models for anomaly detection.  More info
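Unlike tumbling or hopping windows, a session window has no fixed length: it keeps growing as long as events arrive within a timeout of each other, up to a maximum duration.  A plain-Python sketch of the grouping behavior (illustrative only, not Stream Analytics query syntax):

```python
# Conceptual sketch of a session window: events extend the current
# window as long as they arrive within `timeout` of the previous event,
# and a window is capped at `max_duration` from its first event.

def session_windows(timestamps, timeout, max_duration):
    windows, current = [], []
    for ts in sorted(timestamps):
        if current and (ts - current[-1] > timeout
                        or ts - current[0] >= max_duration):
            windows.append(current)   # gap too large or cap hit: close session
            current = []
        current.append(ts)
    if current:
        windows.append(current)
    return windows

# Gaps over 5 units start a new session; a session never exceeds 20 units.
events = [1, 3, 4, 11, 30, 32]
sessions = session_windows(events, timeout=5, max_duration=20)
# sessions -> [[1, 3, 4], [11], [30, 32]]
```

This is a good fit for clickstream or IoT telemetry, where "one burst of activity" is the natural unit of aggregation rather than a fixed clock interval.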

Posted in SQLServerPedia Syndication | 1 Comment

Azure SQL Data Warehouse Gen2 announced

On Monday, Microsoft announced the general availability of the Compute Optimized Gen2 tier of Azure SQL Data Warehouse.  With this performance-optimized tier, Microsoft is dramatically accelerating query performance and concurrency.

The changes in Azure SQL DW Compute Optimized Gen2 tier are:

  • 5x query performance via an adaptive caching technology, which takes a blended approach of using remote storage in combination with a fast SSD cache layer (using NVMe) that places data next to compute based on user access patterns and frequency
  • Significant improvement in serving concurrent queries (32 to 128 queries/cluster)
  • Removes limits on columnar data volume, enabling unlimited columnar storage
  • 5 times higher computing power compared to the current generation by leveraging the latest hardware innovations that Azure offers via additional Service Level Objectives (DW7500c, DW10000c, DW15000c and DW30000c)
  • Added Transparent Data Encryption with customer-managed keys

Azure SQL DW Compute Optimized Gen2 tier will roll out to 20 regions initially (you can find the full list of available regions), with subsequent rollouts to all other Azure regions.  If you have a Gen1 data warehouse, take advantage of the latest generation of the service by upgrading.  If you are getting started, try Azure SQL DW Compute Optimized Gen2 tier today.

More info:

Turbocharge cloud analytics with Azure SQL Data Warehouse

Blazing fast data warehousing with Azure SQL Data Warehouse

Video: Microsoft Mechanics

Posted in Azure SQL DW, SQLServerPedia Syndication | 1 Comment

Podcast: Big Data Solutions in the Cloud

In this podcast I talk with Carlos Chacon of SQL Data Partners on big data solutions in the cloud.  Here is the description of the chat:

Big Data.  Do you have big data?  What does that even mean?  In this episode I explore some of the concepts of how organizations can manage their data and what questions you might need to ask before you implement the latest and greatest tool.  I am joined by James Serra, Microsoft Cloud Architect, to get his thoughts on implementing cloud solutions, where they can contribute, and why you might not be able to go all cloud.  I am interested to see if more traditional DBAs move toward architecture roles and help their organizations manage the various types of data.  What types of issues are giving you troubles as you adopt a more diverse data ecosystem?

I hope you give it a listen!

Posted in Podcast, SQLServerPedia Syndication | Comments Off on Podcast: Big Data Solutions in the Cloud

Cost savings of the cloud

I often hear people say moving to the cloud does not save money, but frequently they don’t take into account the savings for indirect costs that are hard to measure (or the benefits you get that are simply not cost-related).  For example, the cloud allows you to get started in building a solution in a matter of minutes while starting a solution on-prem can take weeks or even months.  How do you put a monetary figure on that?  Or these other benefits that are difficult to put a dollar figure on:

  • Unlimited storage
  • Grow hardware as demand increases (unlimited elastic scale) and even pause it (and not pay anything)
  • Upgrade hardware instantly compared to weeks/months to upgrade on-prem
  • Enhanced availability and reliability (i.e. data in Azure automatically has three copies). What does each hour of downtime cost your business?
  • Benefit of having separation of compute and storage so don’t need to upgrade one when you only need to upgrade the other
  • Pay for only what you need (reduce hardware as demand lessens)
  • Not having to guess how much hardware you need and getting too much or too little
  • Not having to buy hardware sized solely for the max peak
  • Ability to fail fast (cancel a project and not have hardware left over)
  • Really helpful for proof-of-concept (POC) or development projects with a known lifespan because you don’t have to re-purpose hardware afterwards
  • The value of being able to incorporate more data allowing more insights into your business
  • No commitment or long-term vendor lock-in
  • Benefit from technology advances, such as the latest storage solutions
  • More frequent updates to the OS, SQL Server, etc.
  • Automatic software updates
  • The cloud vendors have much higher security than anything on-prem.  You can imagine the loss of income if a vendor had a security breach, so the investment in keeping things secure is massive

As you can see, there is much more than just running numbers in an Excel spreadsheet to see how much money the cloud will save you.  But if you really needed that, Microsoft has a Total Cost of Ownership (TCO) Calculator that will estimate the cost savings you can realize by migrating your application workloads to Microsoft Azure.  You simply provide a brief description of your on-premises environment to get an instant report.

The benefits that are easier to put a dollar figure on:

  • Don’t need co-location space, so cost savings (space, power, networking, etc)
  • No need to manage the hardware infrastructure, reducing staff
  • No up-front hardware costs or costs for hardware refresh cycles every 3-5 years
  • High availability and disaster recovery done for you
  • Automatic geography redundancy
  • Having built-in tools (i.e. monitoring) so you don’t need to purchase 3rd-party software

Also, there are some constraints of on-premise data that go away when moving to the cloud:

  • Scale constrained to on-premise procurement
  • Up-front capital expenditures (CapEx) instead of a yearly operating expense (OpEx)
  • A staff of employees or consultants administering and supporting the hardware and software in place
  • Expertise needed for tuning and deployment

I often tell clients that if you have your own on-premise data center, you are in the air conditioning business.  Wouldn’t you rather focus all your efforts on analyzing data?  You could also try to “save money” by doing your own accounting, but wouldn’t it make more sense to off-load that to an accounting company?  Why not also off-load the  costly, up-front investment of hardware, software, and other infrastructure, and the costs of maintaining, updating, and securing an on-premises system?

And when dealing with my favorite topic, data warehousing, a conventional on-premise data warehouse can cost millions of dollars in the following: licensing fees, hardware, and services; the time and expertise required to set up, manage, deploy, and tune the warehouse; and the costs to secure and back up the data.  All items that a cloud solution eliminates or greatly minimizes.

When estimating hardware costs for a data warehouse, consider the costs of servers, additional storage devices, firewalls, networking switches, data center space to house the hardware, a high-speed network (with redundancy) to access the data, and the power and redundant power supplies needed to keep the system up and running.  If your warehouse is mission critical then you need to also add the costs to configure a disaster recovery site, effectively doubling the cost.

When estimating software costs for a data warehouse, organizations frequently pay hundreds of thousands of dollars in software licensing fees for data warehouse software and add-on packages.  Also, giving additional end users, such as customers and suppliers, access to the data warehouse can significantly increase those costs.  Finally, add the ongoing cost for annual support contracts, which often comprise 20 percent of the original license cost.
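To make that 20 percent concrete, here is the arithmetic with hypothetical figures (illustration only, not a real quote):

```python
# Quick illustration of the support-contract math above (figures are
# hypothetical): annual support is often ~20% of the original license
# cost, so over a typical refresh cycle it can approach the license
# price itself.

license_cost = 500_000     # assumed one-time license fee
support_rate = 0.20        # annual support as a fraction of license cost
years = 5                  # assumed refresh cycle

annual_support = license_cost * support_rate   # support paid per year
total_support = annual_support * years         # support paid over the cycle
```

With these made-up numbers, a $500,000 license accrues another $500,000 in support over five years before any hardware refresh is counted.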

Also note that an on-premises data warehouse needs specialized IT personnel to deploy and maintain the system.  This creates a potential bottleneck when issues arise and keeps responsibility for the system with the customer, not the vendor.

I’ll point out my two key favorite advantages of having a data warehousing solution in the cloud:

  • The complexities and cost of capacity planning and administration, such as sizing, balancing, and tuning the system, are built into the service, automated, and covered by the cost of your subscription
  • Being able to dynamically provision storage and compute resources on the fly to meet the demands of your changing workloads in peak and steady usage periods.  Capacity is whatever you need whenever you need it

Hopefully this blog post points out that while there can be considerable cost savings in moving to the cloud, there are so many other benefits that cost should not be the only reason to move.

More info:

How To Measure the ROI of Moving To the Cloud

Cloud migration – where are the savings?

Comparing cloud vs on-premise? Six hidden costs people always forget about

The high cost and risk of On-Premise vs. Cloud

TCO Analysis Demonstrates How Moving To The Cloud Can Save Your Company Money

5 Common Assumptions Comparing Cloud To On-Premises

5 Financial Benefits of Moving to the Cloud

IT Execs Say Cost Savings Make Cloud-Based Analytics ‘Inevitable’

Posted in Azure, SQLServerPedia Syndication | 3 Comments

Podcast: Myths of Modern Data Management

As part of the Secrets of Data Analytics Leaders by the Eckerson Group, I did a 30-minute podcast with Wayne Eckerson where I discussed myths of modern data management.  Some of the myths discussed include ‘all you need is a data lake’, ‘the data warehouse is dead’, ‘we don’t need OLAP cubes anymore’, ‘cloud is too expensive and latency is too slow’, ‘you should always use a NoSQL product over a RDBMS.’  I hope you check it out!

Posted in Data Lake, Data warehouse, Podcast, SQLServerPedia Syndication | 1 Comment

Webinar: Is the traditional data warehouse dead?

As a follow-up to my blog Is the traditional data warehouse dead?, I did a webinar on that very topic for the Agile Big Data Processing Summit.  The deck is here and the webinar is below:


Posted in Data warehouse, Podcast, Presentation, SQLServerPedia Syndication | 1 Comment

Is the traditional data warehouse dead? webinar

As a follow-up to my blog Is the traditional data warehouse dead?, I will be doing a webinar on that very topic tomorrow (March 27th) at 11am EST for the Agile Big Data Processing Summit that I hope you can join.  Details can be found here.  The abstract is:

Is the traditional data warehouse dead?

With new technologies such as Hive LLAP or Spark SQL, do you still need a data warehouse or can you just put everything in a data lake and report off of that? No! In the presentation, James will discuss why you still need a relational data warehouse and how to use a data lake and an RDBMS data warehouse to get the best of both worlds.

James will go into detail on the characteristics of a data lake and its benefits and why you still need data governance tasks in a data lake. He’ll also discuss using Hadoop as the data lake, data virtualization, and the need for OLAP in a big data solution, and he will put it all together by showing common big data architectures.

Posted in Data warehouse, Podcast, Presentation | 1 Comment

Public preview of Azure SQL Database Managed Instance

Microsoft has announced the public preview of Azure SQL Database Managed Instance.  I blogged about this before.  This will lead to a tidal wave of on-prem SQL Server database migrations to the cloud.  In summary:

Managed Instance is an expansion of the existing SQL Database service, providing a third deployment option alongside single databases and elastic pools. It is designed to enable database lift-and-shift to a fully-managed service, without re-designing the application.  SQL Database Managed Instance provides the broadest SQL Server engine compatibility and native virtual network (VNET) support so you can migrate your SQL Server databases to SQL Database without changing your apps.  It combines the rich SQL Server surface area with the operational and financial benefits of an intelligent, fully-managed service.

Two other related items that are available:

  • Azure Hybrid Benefit for SQL Server on Azure SQL Database Managed Instance. The Azure Hybrid Benefit for SQL Server is an Azure-based benefit that enables customers to use their SQL Server licenses with Software Assurance to save up to 30% on SQL Database Managed Instance. Exclusive to Azure, the hybrid benefit will provide an additional benefit for highly-virtualized Enterprise Edition workloads with active Software Assurance: for every 1 core a customer owns on-premises, they will receive 4 vCores of Managed Instance General Purpose. This makes moving virtualized applications to Managed Instance highly cost-effective.
  • Database Migration Services for Azure SQL Database Managed Instance. Using the fully-automated Database Migration Service (DMS) in Azure, customers can easily lift and shift their on-premises SQL Server databases to a SQL Database Managed Instance. DMS is a fully managed, first party Azure service that enables seamless and frictionless migrations from heterogeneous database sources to Azure Database platforms with minimal downtime. It will provide customers with assessment reports that guide them through the changes required prior to performing a migration. When the customer is ready, the DMS will perform all the steps associated with the migration process.
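The Azure Hybrid Benefit exchange described above is simple to work out (a quick illustration; the 8-core workload is hypothetical):

```python
# Illustrative arithmetic for the Azure Hybrid Benefit exchange: each
# on-premises Enterprise Edition core with active Software Assurance
# converts to 4 vCores of Managed Instance General Purpose.

CONVERSION_RATIO = 4  # EE core with SA -> General Purpose vCores

def managed_instance_vcores(on_prem_ee_cores):
    return on_prem_ee_cores * CONVERSION_RATIO

# A virtualized workload licensed for 8 EE cores:
vcores = managed_instance_vcores(8)  # 32 vCores of General Purpose
```

That 1:4 ratio, on top of the up-to-30% license savings, is what makes moving highly-virtualized Enterprise Edition workloads to Managed Instance cost-effective.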

More info:

Migrate your databases to a fully managed service with Azure SQL Database Managed Instance

What is Azure SQL Database Managed Instance?

Video: Introducing Azure SQL Database Managed Instance

Azure SQL Database Managed Instance – the Good, the Bad, the Ugly

Posted in Azure SQL Database, SQLServerPedia Syndication | 5 Comments

It’s all about the use cases

There is no better way to see the art of the possible with the cloud than in use cases/customer stories and sample solutions/architectures.  Many of these are domain-specific, which resonates best with business decision makers:

Use cases/customer stories

Microsoft IoT customer stories: Explore Internet of Things (IoT) examples and IoT use cases to learn how Microsoft IoT is already transforming your industry.  The industries are broken out by: Manufacturing, Smart Infrastructure, Transportation, Retail, and Healthcare.

Customer stories: Dozens of customer stories of solutions built in Azure that you can filter on by language, industry, product, organization size, and region.

Case studies: See the amazing things people are doing with Azure broken out by industry, product, solution, and customer location.

Sample solutions/architectures

Azure solution architectures: These architectures help you design and implement secure, highly available, performant, and resilient solutions on Azure.

Pre-configured AI solutions: These serve as a great starting point when building an AI solution.  Broken out by Retail, Manufacturing, Banking, and Healthcare.

Internet of Things (IoT) solutions: Great IoT sample solutions such as: connected factory, remote monitoring, predictive maintenance, connected field service, connected vehicle, and smart buildings.

Posted in Azure, SQLServerPedia Syndication | 1 Comment

My latest presentations

I frequently present at user groups, and always try to create a brand new presentation to keep things interesting.  We all know technology changes so quickly so there is no shortage of topics!  There is a list of all my presentations with slide decks.  Here are the new presentations I created the past year:

Differentiate Big Data vs Data Warehouse use cases for a cloud solution

It can be quite challenging keeping up with the frequent updates to the Microsoft products and understanding all their use cases and how all the products fit together.  In this session we will differentiate the use cases for each of the Microsoft services, explaining and demonstrating what is good and what isn’t, in order for you to position, design and deliver the proper adoption use cases for each with your customers.  We will cover a wide range of products such as Databricks, SQL Data Warehouse, HDInsight, Azure Data Lake Analytics, Azure Data Lake Store, Blob storage, and AAS  as well as high-level concepts such as when to use a data lake.  We will also review the most common reference architectures (“patterns”) witnessed in customer adoption. (slides)

Introduction to Azure Databricks

Databricks is a Software-as-a-Service-like experience (or Spark-as-a-service) that is a tool for curating and processing massive amounts of data and developing, training and deploying models on that data, and managing the whole workflow process throughout the project.  It is for those who are comfortable with Apache Spark as it is 100% based on Spark and is extensible with support for Scala, Java, R, and Python alongside Spark SQL, GraphX, Streaming and Machine Learning Library (MLlib).  It has built-in integration with many data sources, has a workflow scheduler, allows for real-time workspace collaboration, and has performance improvements over traditional Apache Spark. (slides)

Azure SQL Database Managed Instance

Azure SQL Database Managed Instance is a new flavor of Azure SQL Database that is a game changer.  It offers near-complete SQL Server compatibility and network isolation to easily lift and shift databases to Azure (you can literally back up an on-premise database and restore it into an Azure SQL Database Managed Instance).  Think of it as an enhancement to Azure SQL Database that is built on the same PaaS infrastructure and maintains all its features (i.e. active geo-replication, high availability, automatic backups, database advisor, threat detection, intelligent insights, vulnerability assessment, etc) but adds support for databases up to 35TB, VNET, SQL Agent, cross-database querying, replication, etc.  So, you can migrate your databases from on-prem to Azure with very little migration effort, which is a big improvement from the current Singleton or Elastic Pool flavors which can require substantial changes. (slides)

What’s new in SQL Server 2017

Covers all the new features in SQL Server 2017, as well as details on upgrading and migrating to SQL Server 2017 or to Azure SQL Database. (slides)

Microsoft Data Platform – What’s included

The pace of Microsoft product innovation is so fast that even though I spend half my days learning, I struggle to keep up. And as I work with customers I find they are often in the dark about many of the products that we have since they are focused on just keeping what they have running and putting out fires. So, let me cover what products you might have missed in the Microsoft data platform world. Be prepared to discover all the various Microsoft technologies and products for collecting data, transforming it, storing it, and visualizing it.  My goal is to help you not only understand each product but understand how they all fit together and their proper use cases, allowing you to build the appropriate solution that can incorporate any data in the future no matter the size, frequency, or type. Along the way we will touch on technologies covering NoSQL, Hadoop, and open source. (slides)

Learning to present and becoming good at it

Have you been thinking about presenting at a user group?  Are you being asked to present at your work?  Is learning to present one of the keys to advancing your career?  Or do you just think it would be fun to present but you are too nervous to try it?  Well, take the first step to becoming a presenter by attending this session and I will guide you through the process of learning to present and becoming good at it.  It’s easier than you think!  I am an introvert and was deathly afraid to speak in public.  Now I love to present and it’s actually my main function in my job at Microsoft.  I’ll share with you the journey that led me to speak at major conferences and the skills I learned along the way to become a good presenter and to get rid of the fear.  You can do it! (slides)

Microsoft cloud big data strategy

Think of big data as all data, no matter what the volume, velocity, or variety.  The simple truth is a traditional on-prem data warehouse will not handle big data.  So what is Microsoft’s strategy for building a big data solution?  And why is it best to have this solution in the cloud?  That is what this presentation will cover.  Be prepared to discover all the various Microsoft technologies and products from collecting data, transforming it, storing it, to visualizing it.  My goal is to help you not only understand each product but understand how they all fit together, so you can be the hero who builds your company’s big data solution. (slides)

Choosing technologies for a big data solution in the cloud

Has your company been building data warehouses for years using SQL Server?  And are you now tasked with creating or moving your data warehouse to the cloud and modernizing it to support “Big Data”?  What technologies and tools should you use?  That is what this presentation will help you answer.  First we will level-set on what big data is and other definitions, cover questions to ask to help decide which technologies to use, go over the new technologies to choose from, and then compare the pros and cons of the technologies.  Finally we will show you common big data architecture solutions and help you to answer questions such as: Where do I store the data?  Should I use a data lake?  Do I still need a cube?  What about Hadoop/NoSQL?  Do I need the power of MPP?  Should I build a “logical data warehouse”?  What is this lambda architecture?  And we’ll close with showing some architectures of real-world customer big data solutions.  Come to this session to get started down the path to making the proper technology choices in moving to the cloud. (slides)

Posted in Presentation, SQLServerPedia Syndication | 3 Comments