What is Azure Data Factory?

Azure Data Factory (ADF) is a service that lets developers integrate disparate data sources.  It is a platform, somewhat like SSIS in the cloud, for managing the data you have both on-premises and in the cloud.

It provides access to on-premises data in SQL Server and cloud data in Azure Storage (Blob and Tables) and Azure SQL Database.  Access to on-premises data is provided through a data management gateway that connects to on-premises SQL Server databases.

It is not a drag-and-drop interface like SSIS.  Instead, data processing is enabled initially through Hive, Pig and custom C# activities.  Such activities can be used to clean data, mask data fields, and transform data in a wide variety of complex ways.

You author your activities, combine them into a pipeline, set an execution schedule, and you’re done.  Data Factory also provides an up-to-the-moment monitoring dashboard, so you can deploy your data pipelines and immediately see them there.
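To make the activity, pipeline, and schedule idea concrete, here is a minimal conceptual sketch in plain Python.  It is not the Data Factory API: the `Activity` and `Pipeline` classes, the `frequency` field, and the sample activities are all hypothetical names made up for illustration.  It only shows the shape of the idea, with small processing steps chained in order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class Activity:
    """One processing step: takes rows in, returns rows out (hypothetical model)."""
    name: str
    run: Callable[[List[Row]], List[Row]]


@dataclass
class Pipeline:
    """An ordered group of activities plus an illustrative schedule label."""
    name: str
    activities: List[Activity] = field(default_factory=list)
    frequency: str = "Hour"  # assumption: a label only, nothing is actually scheduled

    def execute(self, rows: List[Row]) -> List[Row]:
        # Feed the output of each activity into the next one.
        for activity in self.activities:
            rows = activity.run(rows)
        return rows


def drop_empty_emails(rows: List[Row]) -> List[Row]:
    # Cleaning step: discard rows with a missing or empty email.
    return [r for r in rows if r.get("email")]


def mask_emails(rows: List[Row]) -> List[Row]:
    # Masking step: keep the first character and the domain only.
    masked = []
    for r in rows:
        user, _, domain = r["email"].partition("@")
        masked.append({**r, "email": user[:1] + "***@" + domain})
    return masked


pipeline = Pipeline(
    name="CleanCustomerData",
    activities=[
        Activity("DropEmptyEmails", drop_empty_emails),
        Activity("MaskEmails", mask_emails),
    ],
    frequency="Day",
)

result = pipeline.execute([
    {"id": "1", "email": "alice@example.com"},
    {"id": "2", "email": ""},
])
print(result)
```

In the real service, the activities would be Hive, Pig, or custom C# steps rather than Python functions; the sketch only mirrors how they compose.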

Within the Azure Preview Portal, you get a visual layout of all of your pipelines and data inputs and outputs.  You can see all the relationships and dependencies of your data pipelines across all of your sources so you always know where data is coming from and where it is going.  You get a historical accounting of job execution, data production status, and system health in a single monitoring dashboard.
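The dependency view described above can be thought of as a graph walk.  Here is a rough sketch, assuming each activity is recorded as a tuple of name, input datasets, and output dataset; the dataset and activity names are invented for the example, and this is not how the portal is implemented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical activities: (activity name, input datasets, output dataset).
ACTIVITIES: List[Tuple[str, List[str], str]] = [
    ("CopyLogsToBlob", ["OnPremSqlLogs"], "RawLogsBlob"),
    ("CleanLogs", ["RawLogsBlob"], "CleanLogsBlob"),
    ("LoadToSqlDb", ["CleanLogsBlob", "CustomerTable"], "ReportingSqlTable"),
]


def build_upstream_map(
    activities: Iterable[Tuple[str, List[str], str]]
) -> Dict[str, Set[str]]:
    """Map each output dataset to the datasets it is produced from."""
    upstream: Dict[str, Set[str]] = defaultdict(set)
    for _name, inputs, output in activities:
        upstream[output].update(inputs)
    return upstream


def trace_sources(dataset: str, upstream: Dict[str, Set[str]]) -> Set[str]:
    """Return every dataset that `dataset` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        # .get() avoids creating empty entries in the defaultdict.
        for parent in upstream.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


upstream_map = build_upstream_map(ACTIVITIES)
print(sorted(trace_sources("ReportingSqlTable", upstream_map)))
```

Walking the map in the other direction (from a source to everything that consumes it) would answer the "where is it going" half of the question.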

Data Factory provides customers with a central place to manage their processing of web log analytics, click stream analysis, social sentiment, sensor data analysis, geo-location analysis, etc.

Microsoft views Data Factory as a key tool for customers who are looking to have a hybrid story with SQL Server or who currently use Azure HDInsight, Azure SQL Database, Azure Blobs, and Power BI for Office 365.

In short, developers can use Data Factory to transform semi-structured, unstructured and structured data from on-premises and cloud sources into trusted information.

There are various ways to create a data factory: the Azure Portal, PowerShell (using Azure Resource Manager templates), Visual Studio (Azure .NET SDK 2.7 or later), the REST API, and the Azure Data Factory SDK.

(Figure: end-to-end workflow)

More info:

Microsoft takes wraps off preview of its Azure Data Factory service

Data Factory Public Preview – build and manage information production pipelines

Video Azure Data Factory Overview

Introduction to Azure Data Factory Service

The Ins and Outs of Azure Data Factory – Orchestration and Management of Diverse Data

Azure Data Factory Update – New Data Stores

About James Serra

James is a big data and data warehousing solution architect at Microsoft. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 25 years of IT experience.
