What is HDInsight?

HDInsight comes in two flavors: Windows Azure HDInsight Service and Microsoft HDInsight Server for Windows (quietly discontinued, though it lives on in a different form).  Both were developed in partnership with Hadoop software developer and distributor Hortonworks and were made generally available in October 2013.

Windows Azure HDInsight Service deploys and provisions Apache Hadoop clusters in the Azure cloud, providing a software framework designed to manage, analyze and report on big data.  It makes the Hadoop Distributed File System (HDFS)/MapReduce software framework and related projects such as Pig, Sqoop and Hive available in a simpler, more scalable, and cost-efficient environment.  It uses Windows Azure Blob storage as the default file system, or you can store your data in the native HDFS that is local to the compute nodes.
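To make the two storage options concrete, here is a minimal sketch of how file locations differ between them. It assumes the `wasb://` URI scheme used by Hadoop's Azure storage connector, which this post does not mention, so treat it as an illustration rather than the service's definitive behavior. The helper functions, container name, account name, and namenode address are all hypothetical.

```python
from urllib.parse import urlparse


def blob_uri(container: str, account: str, path: str, secure: bool = False) -> str:
    """Build a URI for a file held in Azure Blob storage.

    Assumes the wasb:// (or wasbs:// over TLS) scheme of the Hadoop
    Azure connector; the names passed in are placeholders.
    """
    scheme = "wasbs" if secure else "wasb"
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def hdfs_uri(namenode: str, path: str) -> str:
    """Build a URI for a file held in HDFS local to the compute nodes."""
    return f"hdfs://{namenode}/{path.lstrip('/')}"


def storage_kind(uri: str) -> str:
    """Classify a URI as Blob-backed, HDFS-backed, or unknown by its scheme."""
    scheme = urlparse(uri).scheme
    if scheme in ("wasb", "wasbs"):
        return "blob"
    if scheme == "hdfs":
        return "hdfs"
    return "unknown"


# Hypothetical names, for illustration only.
print(blob_uri("data", "myaccount", "/logs/a.txt"))
print(hdfs_uri("namenode:8020", "tmp/x"))
```

The practical difference is that data behind a Blob storage URI can outlive the cluster, while data in HDFS local to the compute nodes goes away when the cluster is deleted.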

Microsoft HDInsight Server for Windows was killed shortly after it was released but lives on in two flavors: Hortonworks Data Platform (HDP) and Microsoft’s Parallel Data Warehouse (PDW).  Both are on-premises solutions.  HDP includes core Hadoop (meaning HDFS and MapReduce), plus Pig for MapReduce programming, the Hive data query infrastructure, Hortonworks’ recently introduced HCatalog table management service for access to Hadoop data, Sqoop for data movement, and the Ambari monitoring and management console.  All of these components have been reengineered to run on Windows; all are open source, compatible with Apache Hadoop, and being contributed back to the community.  With PDW, you can add an HDInsight region to the appliance; this region includes HDP and can be accessed via PolyBase.

The HDInsight Server is designed to work with (but does not include) Windows Server and Microsoft SQL Server.  In the case of Windows, HDInsight is integrated with Microsoft System Center for administrative control and Active Directory for access control and security.

More info:

Microsoft Releases Hadoop On Windows

Hadoop and HDInsight: Big Data in Windows Azure

Video HDInsight: Introduction to Hadoop on Windows

Video Introduction To Windows Azure HDInsight Service

Let there be Windows Azure HDInsight

Working With Data in Windows Azure HDInsight Service

HDInsight patterns & practices Windows Azure Guidance

Hortonworks Makes HDP 2.0 for Windows Server Generally Available

Windows Azure HDInsight supports preview clusters of Hadoop 2.2

Windows Azure HDInsight Supporting Hadoop 2.2 in Public Preview

About James Serra

James currently works for Microsoft specializing in big data and data warehousing using the Analytics Platform System (APS), a Massively Parallel Processing (MPP) architecture. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence/MDM architect and developer, specializing in the Microsoft BI stack. He is a SQL Server MVP with over 25 years of IT experience.

5 Responses to What is HDInsight?

  1. Steven Neumersky says:

    Thank you very much for the “SQOOP” and the Annoying Recruiters saga.
    Congratulations on your new position as well.

    Did I mention I like the Annoying Recruiters saga?
