Category Archives: Data Lake

What is a data lake?

A “data lake” is a storage repository, usually in Hadoop, that holds a vast amount of raw data in its native format until it is needed. It’s a great place for investigating, exploring, experimenting, and refining data, in addition to archiving …

Posted in Data Lake, Data warehouse, Hadoop, SQLServerPedia Syndication | 11 Comments

The Modern Data Warehouse

The traditional data warehouse has served us well for many years, but new trends are causing it to break in four different ways: data growth, fast query expectations from users, non-relational/unstructured data, and cloud-born data. How can you prevent this …

Posted in Big Data, Data Lake, Data warehouse, Hadoop, PDW/APS, SQLServerPedia Syndication | 8 Comments

Hadoop and Data Warehouses

I see a lot of confusion when it comes to Hadoop and its role in a data warehouse solution. Hadoop should not replace a data warehouse, but rather augment and complement it. Hadoop and a data warehouse …

Posted in Data Lake, Data warehouse, Hadoop, PDW/APS, PolyBase, SQLServerPedia Syndication | 5 Comments