Category Archives: Data Lake

Microsoft Connect(); announcements

Microsoft Connect(); is a developer event running Nov 16-18, where plenty of announcements are made.  Here is a summary of the data platform-related announcements: Azure Data Lake Analytics is generally available in two datacenters in the US (EMEA coming in Feb). … Continue reading

Posted in Data Lake, HDInsight, SQL Server, SQLServerPedia Syndication | Comments Off on Microsoft Connect(); announcements

Why use a data lake?

Previously I covered what a data lake is (including the Azure Data Lake and enhancements), and now I wanted to touch on the main reason why you might want to incorporate a data lake into your overall data warehouse solution. To … Continue reading

Posted in Azure SQL DW, Data Lake, SQLServerPedia Syndication | 13 Comments

Storage options on Azure

Microsoft Azure is a cloud computing platform and infrastructure, created by Microsoft, for building, deploying and managing applications and services through a global network of Microsoft-managed and Microsoft partner-hosted datacenters.  Included in this platform are multiple ways of storing data.  Below … Continue reading

Posted in Azure, Data Lake, SQLServerPedia Syndication | 1 Comment

Copying data from Azure Blob Storage

In a previous blog post I talked about copying on-prem data to Azure Blob Storage (Getting data into Azure Blob Storage).  Let’s say you have copied the data and it is sitting in Azure Blob Storage (or an Azure Data Lake) … Continue reading

Posted in Azure, Azure SQL Database, Azure SQL DW, Data Lake, PolyBase, SQLServerPedia Syndication | Comments Off on Copying data from Azure Blob Storage

Azure Data Lake enhancements

I first blogged about Microsoft’s new product, the Azure Data Lake, a few months back (here).  There are already enhancements, as announced at Strata + Hadoop World.  Here they are in brief: The Azure Data Lake has been renamed to the … Continue reading

Posted in Azure, Data Lake, HDInsight, SQLServerPedia Syndication | 2 Comments

Getting data into Azure Blob Storage

If you have on-prem data and want to copy it to Azure Blob Storage in the cloud, what are all the possible ways to do it?  There are many, and here is a quick review of them: AzCopy: A popular command-line … Continue reading

Posted in Azure, Azure SQL DW, Data Lake, PolyBase, SQLServerPedia Syndication | 2 Comments

Azure Data Lake

At the recent Microsoft Build Developer Conference, Executive Vice President Scott Guthrie announced the Azure Data Lake.  It is a new flavor of Azure Storage which can handle streaming data (low latency, high volume, short updates), is geo-distributed, data-locality aware and … Continue reading

Posted in Big Data, Data Lake, SQLServerPedia Syndication | 11 Comments

What is a data lake?

A “data lake” is a storage repository, usually in Hadoop, that holds a vast amount of raw data in its native format until it is needed.  It’s a great place for investigating, exploring, experimenting, and refining data, in addition to archiving … Continue reading

Posted in Data Lake, Data warehouse, Hadoop, SQLServerPedia Syndication | 11 Comments

The Modern Data Warehouse

The traditional data warehouse has served us well for many years, but new trends are causing it to break in four different ways: data growth, fast query expectations from users, non-relational/unstructured data, and cloud-born data.  How can you prevent this … Continue reading

Posted in Big Data, Data Lake, Data warehouse, Hadoop, PDW/APS, SQLServerPedia Syndication | 8 Comments

Hadoop and Data Warehouses

I see a lot of confusion when it comes to Hadoop and its role in a data warehouse solution.  Hadoop should not be a replacement for a data warehouse, but rather should augment/complement a data warehouse.  Hadoop and a data warehouse … Continue reading

Posted in Data Lake, Data warehouse, Hadoop, PDW/APS, PolyBase, SQLServerPedia Syndication | 5 Comments