Comments

The Modern Data Warehouse — 8 Comments

  1. Great article, James. I'm looking heavily into this area right now with a view to migrating an existing ETL framework into a hybrid one encompassing both old and new architectures. In particular, I would like to replace our traditional staging process with one using Hadoop, for both structured and semi-structured data. Schema-on-read rather than schema-on-write is a big part of the appeal!

    What are your thoughts on that? Would you envisage a Hadoop-based staging process for ALL data, not just semi-structured and unstructured data, being a viable option, or would you still err on the side of staging relational data in a more traditional staging RDBMS?

    Thanks

    • Glad you liked the article, Mike! Hadoop can definitely be a good staging area for relational data. My thinking is that you can export files from an OLTP source, store them in Hadoop, then import them into the data warehouse. Hadoop manages these export files, letting you keep them on low-cost storage with redundancy. But to be clear, I would still import them into a staging area in the data warehouse and then transform the data there, using the power of the DW/MPP. So you would be using ELT instead of ETL.

  2. As usual, good stuff. Is there any particular reason the marketing schema never becomes a star schema and stays in a 3NF schema? I always thought the PowerPivot engine performs best using a star schema. Thanks.

    • The reason is just to show that PowerPivot does not require a star schema: it can go against a 3NF schema. But you are correct that it will be faster against a star schema.

  3. Pingback: What is a data lake? | James Serra's Blog

  4. Pingback: Data Warehouse Best Practice Architecture | Easy Architecture Fan

  5. Data warehouse modernization takes many forms. Many users are diversifying their software portfolios, while others are even decommissioning current DW platforms in order to replace them with modern ones optimized for today’s requirements in big data, analytics, real time, and cost control.
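The ELT flow described in the reply to the first comment (export files from the source, keep them in a landing area, load them untouched into a warehouse staging table, then transform inside the warehouse) can be sketched roughly as follows. This is only an illustration under loud assumptions: a temporary local directory stands in for Hadoop storage, an in-memory SQLite database stands in for the DW/MPP engine, and the table names `stg_orders` and `fact_customer_sales`, the columns, and the sample rows are all hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def land_export(landing_dir: Path, name: str, rows: list[dict]) -> Path:
    """Extract: write an OLTP export as a CSV file in the landing area.

    The landing directory is a stand-in for Hadoop storage (assumption).
    """
    path = landing_dir / name
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def load_staging(conn: sqlite3.Connection, export_path: Path) -> None:
    """Load: copy the file into a staging table as-is, with no cleanup yet."""
    conn.execute(
        "CREATE TABLE stg_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    with export_path.open(newline="") as f:
        rows = [
            (r["order_id"], r["customer"], r["amount"])
            for r in csv.DictReader(f)
        ]
    conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", rows)


def transform(conn: sqlite3.Connection) -> None:
    """Transform: typing, trimming, and aggregation run as SQL in the engine."""
    conn.execute(
        """
        CREATE TABLE fact_customer_sales AS
        SELECT TRIM(customer) AS customer,
               SUM(CAST(amount AS REAL)) AS total_amount
        FROM stg_orders
        GROUP BY TRIM(customer)
        """
    )


def run_elt() -> list[tuple]:
    # Hypothetical sample export; note the stray space that the
    # transform step (not the load step) is responsible for fixing.
    exported = [
        {"order_id": "1", "customer": "acme ", "amount": "10.50"},
        {"order_id": "2", "customer": "acme", "amount": "4.50"},
        {"order_id": "3", "customer": "globex", "amount": "7.00"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        export_path = land_export(Path(tmp), "orders.csv", exported)
        conn = sqlite3.connect(":memory:")
        try:
            load_staging(conn, export_path)
            transform(conn)
            return conn.execute(
                "SELECT customer, total_amount "
                "FROM fact_customer_sales ORDER BY customer"
            ).fetchall()
        finally:
            conn.close()


if __name__ == "__main__":
    print(run_elt())
```

The point of the ordering is that the load step does no cleansing at all; the raw export lands in staging exactly as exported, and every correction happens afterwards as set-based SQL inside the database, which is what distinguishes ELT from ETL.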
