Power BI: Dataflows

Dataflows, previously called Common Data Service for Analytics as well as Datapools, will be in preview soon, and I wanted to explain in this blog what they are and how they can help you get value out of your data quickly (it’s a follow-up to my blog Getting value out of data quickly).

In short, dataflows integrate data lake and ETL technology directly into Power BI, so anyone with Power Query skills can create, customize and manage data within their Power BI experience (think of it as self-service data prep).  Power Query is now part of the Power BI service, not just Power BI Desktop, and the service version is called Power Query Online.  Dataflows include a standard schema, called the Common Data Model (CDM), that contains the most common business entities across the major functions such as marketing, sales, service and finance, along with connectors that ingest data from the most common sources into these schemas.  This greatly simplifies modeling and integration challenges (it prevents multiple, conflicting metadata definitions for the same data).  You can also extend the CDM by creating custom entities.  Lastly, Microsoft and its partners will be shipping out-of-the-box applications that run on Power BI, populate data in the Common Data Model and deliver insights through Power BI.

A dataflow is not just the data itself, but also the logic for how the data is manipulated.  Dataflows belong to the Data Warehouse/Mart/Lake family.  Their main job is to aggregate, cleanse, transform, integrate and harmonize data from a large and growing set of supported on-premises and cloud-based data sources, including Dynamics 365, Salesforce, Azure SQL Database, Excel and SharePoint.  A dataflow holds a collection of entities (i.e. tables) stored in the data lake, in internal Power BI folders in Azure Data Lake Storage Gen2 that comply with the Common Data Model.

This adds two new layers to Power BI: Dataflows and Storage.

By default these folders live in storage that Power BI manages internally, but you can instead use your own Azure Data Lake Storage Gen2 account, allowing other Azure services to reuse the data (e.g. Azure Databricks can be used to manipulate the data).
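To make the reuse idea concrete, here is a minimal Python sketch of how another tool might discover what a dataflow wrote to the lake. It assumes, as I understand the CDM folder layout, that each folder carries a model.json metadata file listing entities, their attributes and the data files (partitions) behind them. The sample document, the storage URL and the helper name list_entities are all hypothetical and trimmed down for illustration; a real model.json has more fields.

```python
import json

# Hypothetical, trimmed-down model.json for illustration only.
# The field names follow my understanding of the CDM folder metadata format.
SAMPLE_MODEL_JSON = """
{
  "name": "SalesDataflow",
  "version": "1.0",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "Account",
      "attributes": [
        {"name": "accountId", "dataType": "string"},
        {"name": "revenue", "dataType": "double"}
      ],
      "partitions": [
        {
          "name": "Account-part-0",
          "location": "https://example.dfs.core.windows.net/powerbi/SalesDataflow/Account/part-0.csv"
        }
      ]
    }
  ]
}
"""


def list_entities(model_json_text):
    """Return {entity name: {"columns": [...], "files": [...]}} from model.json text."""
    model = json.loads(model_json_text)
    result = {}
    for entity in model.get("entities", []):
        result[entity["name"]] = {
            # Column names declared for the entity
            "columns": [attr["name"] for attr in entity.get("attributes", [])],
            # Locations of the data files that hold the entity's rows
            "files": [part["location"] for part in entity.get("partitions", [])],
        }
    return result


entities = list_entities(SAMPLE_MODEL_JSON)
for name, info in entities.items():
    print(name, info["columns"], len(info["files"]), "file(s)")
```

A service such as Azure Databricks would read the real file from the lake instead of an in-memory string, then load the listed data files using the column names from the metadata.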

You can also set up incremental refresh for any entity, link to entities from other dataflows, and pull data down from the dataflows into Power BI Desktop.
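The idea behind incremental refresh can be sketched in a few lines of Python. This is only a conceptual illustration, not how Power BI implements it: the function name incremental_refresh, the row layout and the refresh_days parameter are all made up for the example. Rows older than the refresh window are kept as already stored, and only rows inside the window are reloaded from the source.

```python
from datetime import date, timedelta


def incremental_refresh(stored_rows, source_rows, today, refresh_days):
    """Keep stored rows older than the window; reload rows inside the window."""
    cutoff = today - timedelta(days=refresh_days)
    # Older history is left untouched, so it is never re-read from the source
    kept = [row for row in stored_rows if row["date"] < cutoff]
    # Recent rows are replaced with whatever the source holds now
    reloaded = [row for row in source_rows if row["date"] >= cutoff]
    return kept + reloaded


# Hypothetical sample data
stored = [
    {"date": date(2018, 1, 1), "amount": 10},
    {"date": date(2018, 11, 1), "amount": 5},
]
source = [
    {"date": date(2018, 1, 1), "amount": 999},  # changed at source, outside the window
    {"date": date(2018, 11, 1), "amount": 7},   # corrected value, inside the window
    {"date": date(2018, 11, 5), "amount": 3},   # new row, inside the window
]

refreshed = incremental_refresh(stored, source, today=date(2018, 11, 6), refresh_days=30)
print([row["amount"] for row in refreshed])
```

The trade-off is visible in the sample: the January row keeps its stored value even though the source changed, which is the price of not reloading the full history on every refresh.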

To use dataflows, in the Power BI service, under a Workspace, choose Create -> Dataflow -> Add entities.  This starts Power Query Online, and you then choose a connector for one of the many data sources (just like you do with Power Query in Power BI Desktop).  Then choose a table to import.

To create a dashboard from these entities, in Power BI Desktop you simply choose Get Data -> Power BI dataflows.

The bottom line is that Power BI users can now easily create a dataflow to prepare data in centralized storage, using a standardized schema, ready for easy consumption, reuse, and generation of business insights.

Dataflows are a great way to have a power user get value out of data without involving IT.  But while this adds enterprise tools to Power BI, it does not mean you are creating an enterprise solution.  You may still need to create a data warehouse and cubes: see "The need for having both a DW and cubes" and "Is the traditional data warehouse dead?"

More info:

Self-service data prep with dataflows

Microsoft Common Data Services

Video Introduction to Common Data Service For Analytics

Video Common Data Service for Analytics (CDS-A) and Power BI – an Introduction

Power BI expands self-service prep for big data, unifies modern and enterprise BI

Video Introducing: Advanced data prep with dataflows—for unified data and powerful insights

Dataflows in Power BI: A Data Analytics Gamechanger?

Video Introduction to the Microsoft Common Data Model

Video Power BI data preparation with Dataflows

Power BI Dataflows, and Why You Should Care

Terminology Check – What are Data Flows?

Dataflows in Power BI

Dataflows: The Good, the Bad, the Ugly

Lego Bricks and the Spectrum of Data Enrichment and Reuse

About James Serra

James is a big data and data warehousing solution architect at Microsoft. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 25 years of IT experience.
This entry was posted in Power BI, SQLServerPedia Syndication.

7 Responses to Power BI: Dataflows

  1. Thanks James.

    So dataflows are a data prep tool for data analysts. They use dataflows to populate data into a Common Model in the data lake? Why not just use data already in the data lake? Do its connectors flow data into the Common Model automatically or do the analysts have to model that?

    Also, for enterprise modeling/integration, people will use Data Factory and Databricks, right?

    • James Serra says:

      Hi Wayne,

      Yes, dataflows populate data into a Common Model in the data lake via an internal Power BI data lake. You can share these data lakes or use one that already has data in it. The way to think about this new feature is that an analyst can move data into a data lake they create without having to get IT involved. The analyst will match up the columns from the data they are bringing in with the Common Model. When creating an enterprise solution, you will most likely still want to use Azure Data Factory for ETL and Databricks for data prep.

  2. Pingback: Dataflows In Power BI – Curated SQL

  3. Pingback: Dataflows in Power BI: A Data Analytics Gamechanger? - Microsoft Dynamics 365 Community

  4. Ignacio says:

    Is this feature in free preview so we can try it?
    Will Pro users be able to use it, or only Premium?

  5. Great information. Since last week, I am gathering details about the SQL experience.
    There are some amazing details on your blog which I didn’t know. Thanks.