
Comments

Is the traditional data warehouse dead? — 10 Comments

  1. Well, that sure matches my understanding of things as of about two years ago. Since then I’ve been mostly out of it, so I wondered if anything had changed. Sounds like not so much!

    Two questions, though. First, what if you really do have “big” data, say 100 TB? SQL Server does not scale operationally to that size, even though it is within the official capacity. Second, what about Azure Cosmos DB or similar offerings that are supposed to scale up to that size? Or other non-standard data warehouse engines like Oracle Exadata?

    Thanks.

  2. Pingback: Is the traditional data warehouse dead? | No. Betteridge’s Law

  3. Now the question is: right now, can’t a traditional DW feed data scientists for their analysis most of the time? I think big data only applies to businesses that deal with large amounts of data. But right now that’s a minority, and in my opinion most businesses that want to run advanced machine learning algorithms on their data do not really need a big data platform to do so.

  4. You have a variety of points that show how Hadoop can’t do the job of a data warehouse … but aren’t those only valid when considering a Hadoop installation that one has to actually maintain?

    What about the new cloud-based data lake services, which I think do away with many of the negative aspects of running a Hadoop cluster? For example, Azure Data Lake Analytics could probably address at least half of the ‘cons’ listed for running Hadoop.

    • Hi Christian,

      This blog was focused on my thoughts on using Hadoop as your data warehouse, so I did not mention Azure Data Lake Analytics (ADLA), as that is not a Hadoop solution. But yes, ADLA can do away with some of the cons listed for Hadoop, as I talk about in my blog at https://www.jamesserra.com/archive/2017/10/use-cases-of-various-products/. However, you would not use ADLA with Azure Data Lake Store (ADLS) as a data warehouse, for many reasons: for example, ADLA is a job/batch service, not an interactive one, and it is not a relational database solution. Instead, you would use ADLA to clean and refine data sitting in ADLS, just as you would with a Hadoop solution like HDInsight. The data in the data lake can then be used by power users and data scientists. Then the data in ADLS would be moved to a relational database to be used by most other users.
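
      As a rough illustration of that flow (this is not ADLA or ADLS code; the sample rows, the function names, and the in-memory SQLite database standing in for the relational database are all assumptions made for the example), a minimal Python sketch might look like this: raw data is refined in a batch step, then loaded into a relational store that most users query with SQL.

```python
import sqlite3

# Hypothetical raw records as they might land in a data lake:
# untyped text, inconsistent casing, and one malformed row.
RAW_LAKE_ROWS = [
    "2018-01-03, north ,120.50",
    "2018-01-03,SOUTH,80",
    "not-a-date,east,abc",
    "2018-01-04,North,99.5",
]


def refine(rows):
    """Batch step: parse, normalize, and drop rows that fail validation."""
    cleaned = []
    for line in rows:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue
        day, region, amount = parts
        try:
            value = float(amount)
        except ValueError:
            continue  # amount is not numeric
        if len(day) != 10 or day[4] != "-" or day[7] != "-":
            continue  # crude YYYY-MM-DD shape check
        cleaned.append((day, region.lower(), value))
    return cleaned


def load_to_warehouse(cleaned):
    """Load refined rows into a relational table (SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)
    conn.commit()
    return conn


def sales_by_region(conn):
    """The kind of interactive SQL query most users would run."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


conn = load_to_warehouse(refine(RAW_LAKE_ROWS))
print(sales_by_region(conn))
```

      The point of the split is that the messy parsing happens once, in batch, while the relational side only ever sees typed, validated rows.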

  5. Pingback: The Need For Multiple Warehouse Architectures – Curated SQL

  6. Hi Wayne,

    Some vendors do support ANSI SQL, but the devil is in the details. It may be that only some commands are ANSI SQL compliant, or that they are fully compliant but with an older version of the standard, as there are many of them: SQL-86, SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011, and SQL:2016.

  7. Pingback: Data News – 02 / 2018 | workingondata
