Category Archives: Data Lake

Podcast: Myths of Modern Data Management

As part of the Secrets of Data Analytics Leaders by the Eckerson Group, I did a 30-minute podcast with Wayne Eckerson where I discussed myths of modern data management.  Some of the myths discussed include ‘all you need is a data lake’, ‘the … Continue reading

Posted in Data Lake, Data warehouse, Podcast, SQLServerPedia Syndication | 1 Comment

Is the traditional data warehouse dead?

Hadoop has recently gained a number of enhancements for fast interactive querying, with products such as Hive LLAP and Spark SQL being used over slower interactive querying options such as Tez/YARN and batch … Continue reading

Posted in Data Lake, Data warehouse, SQLServerPedia Syndication | 10 Comments

Data Virtualization vs Data Warehouse

Data virtualization goes by a lot of different names: logical data warehouse, data federation, virtual database, and decentralized data warehouse.  Data virtualization allows you to integrate data from various sources, keeping the data in-place, so that you can generate reports … Continue reading

Posted in Data Lake, Data warehouse, SQLServerPedia Syndication | Comments Off on Data Virtualization vs Data Warehouse

Data lake details

I have blogged before about data lakes (see What is a data lake? and Why use a data lake?), and wanted to provide more details on this popular technology, some of which I cover in my presentation “Big data architectures and the data lake”. … Continue reading

Posted in Azure Data Lake, Data Lake, SQLServerPedia Syndication | 3 Comments

SQL Data Warehouse reference architectures

With so many product options to choose from for building a big data solution in the cloud, such as SQL Data Warehouse (SQL DW), Azure Analysis Services (AAS), SQL Database (SQL DB), and Azure Data Lake (ADL), there are various combinations … Continue reading

Posted in Azure SQL DW, Data Lake, Data warehouse, SQLServerPedia Syndication | 6 Comments

Microsoft Connect(); announcements

Microsoft Connect(); is a developer event from Nov 16-18, where plenty of announcements are made.  Here is a summary of the data platform related announcements: Azure Data Lake Analytics is generally available in two datacenters in the US (EMEA coming in Feb). … Continue reading

Posted in Data Lake, HDInsight, SQL Server, SQLServerPedia Syndication | Comments Off on Microsoft Connect(); announcements

Why use a data lake?

Previously I covered what a data lake is (including the Azure Data Lake and enhancements), and now I wanted to touch on the main reason why you might want to incorporate a data lake into your overall data warehouse solution. To … Continue reading

Posted in Azure SQL DW, Data Lake, SQLServerPedia Syndication | 13 Comments

Storage options on Azure

Microsoft Azure is a cloud computing platform and infrastructure, created by Microsoft, for building, deploying and managing applications and services through a global network of Microsoft-managed and Microsoft partner-hosted datacenters.  Included in this platform are multiple ways of storing data.  Below … Continue reading

Posted in Azure, Data Lake, SQLServerPedia Syndication | 1 Comment

Copying data from Azure Blob Storage

In a previous blog post I talked about copying on-prem data to Azure Blob Storage (Getting data into Azure Blob Storage).  Let’s say you have copied the data and it is sitting in Azure Blob Storage (or an Azure Data Lake) … Continue reading

Posted in Azure, Azure SQL Database, Azure SQL DW, Data Lake, PolyBase, SQLServerPedia Syndication | Comments Off on Copying data from Azure Blob Storage

Azure Data Lake enhancements

I first blogged about Microsoft’s new product, the Azure Data Lake, a few months back (here).  There are already enhancements, as announced at Strata + Hadoop World.  Here they are in brief: The Azure Data Lake has been renamed to the … Continue reading

Posted in Azure, Data Lake, HDInsight, SQLServerPedia Syndication | 2 Comments