Category Archives: Hadoop

Hadoop and Data Warehouses

I see a lot of confusion when it comes to Hadoop and its role in a data warehouse solution.  Hadoop should not replace a data warehouse; rather, it should augment and complement one.  Hadoop and a data warehouse … Continue reading

Posted in Data warehouse, Hadoop, PDW/APS, SQLServerPedia Syndication | 2 Comments

Introduction to Hadoop

Hadoop is an open-source software framework from the Apache Software Foundation, capable of processing large amounts of heterogeneous data sets in a distributed fashion (via MapReduce) across clusters of commodity hardware on a storage framework (HDFS).  Hadoop uses a simplified programming model.  The … Continue reading

Posted in Hadoop, PDW/APS, SQLServerPedia Syndication | 7 Comments

What is HDInsight?

There are two flavors of HDInsight: Windows Azure HDInsight Service and Microsoft HDInsight Server for Windows (recently quietly killed but lives on in a different form).  Both were developed in partnership with Hadoop software developer and distributor Hortonworks and were made … Continue reading

Posted in Hadoop, PDW/APS, SQLServerPedia Syndication | 5 Comments

PolyBase explained

PolyBase is a new technology that integrates Microsoft’s MPP product, SQL Server Parallel Data Warehouse (PDW), with Hadoop.  It is designed to enable queries across relational data stored in PDW and non-relational data stored in the Hadoop Distributed File … Continue reading

Posted in Hadoop, PDW/APS, SQLServerPedia Syndication | 9 Comments