What is Polyglot Persistence?

Polyglot Persistence is a fancy term for a simple idea: when storing data, it is best to use multiple data storage technologies, each chosen based on the way the data is used by an individual application or by a component of a single application.  Different kinds of data are best handled by different data stores.  In short, it means picking the right tool for each use case.  It’s the same idea behind Polyglot Programming, which holds that applications should be written in a mix of languages because different languages are suited to tackling different problems.

As an example of Polyglot Persistence, consider an e-commerce platform that deals with many types of data (e.g. shopping cart, inventory, completed orders).  Instead of trying to store all of this data in one database, which would require a lot of conversion to force everything into the same format, store each type of data in the database best suited to it.  So the e-commerce platform might look like this:

[Figure: example e-commerce platform storing each type of data in a different database]

So we are using a mixture of RDBMS solutions (e.g. SQL Server) and NoSQL solutions, of which there are four types: Key-Value, Document, Graph, and Column (see Types of NoSQL databases).  Here is a guideline on which database type to use, based on what the data is used for:

| Functionality | Considerations | Database Type |
|---|---|---|
| User Sessions | Rapid access for reads and writes.  No need to be durable. | Key-Value |
| Financial Data | Needs transactional updates.  Tabular structure fits data. | RDBMS |
| POS Data | Depends on size and rate of ingest.  Lots of writes, infrequent reads mostly for analytics. | RDBMS (if modest), Key-Value or Document (if ingest very high), or Column if analytics is key |
| Shopping Cart | High availability across multiple locations.  Can merge inconsistent writes. | Document (Key-Value maybe) |
| Recommendations | Rapidly traverse links between friends, product purchases, and ratings. | Graph (Column if simple) |
| Product Catalog | Lots of reads, infrequent writes.  Products make natural aggregates. | Document |
| Reporting | SQL interfaces well with reporting tools. | RDBMS, Column |
| Analytics | Large-scale analytics on a large cluster. | Column |
| User activity logs, CSR logs, social media analysis | High volume of writes on multiple nodes. | Key-Value or Document |

With an application that uses many types of data, a web service can be created to send the data request to the appropriate database:

[Figure: a web service routing each data request to the appropriate database]
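The routing idea can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not a production design: `InMemoryStore` is a hypothetical stand-in for real database clients, and the `PersistenceRouter` class and the data type names (`user_session`, `shopping_cart`, and so on) are made up for this example.  The mapping follows the guideline above (sessions to Key-Value, carts and catalog to Document, financial data to RDBMS).

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a real database client."""
    kind: str
    data: Dict[str, Any] = field(default_factory=dict)

    def put(self, key: str, value: Any) -> None:
        self.data[key] = value

    def get(self, key: str) -> Any:
        return self.data.get(key)


class PersistenceRouter:
    """Sends each request to the store configured for its data type."""

    def __init__(self, routes: Dict[str, InMemoryStore]):
        self._routes = routes

    def _store_for(self, data_type: str) -> InMemoryStore:
        try:
            return self._routes[data_type]
        except KeyError:
            raise ValueError(f"no store configured for {data_type!r}") from None

    def put(self, data_type: str, key: str, value: Any) -> None:
        self._store_for(data_type).put(key, value)

    def get(self, data_type: str, key: str) -> Any:
        return self._store_for(data_type).get(key)


# One store per database type (assumed names, in-memory only).
key_value = InMemoryStore("key-value")
document = InMemoryStore("document")
relational = InMemoryStore("rdbms")

router = PersistenceRouter({
    "user_session": key_value,
    "shopping_cart": document,
    "product_catalog": document,
    "financial_data": relational,
})

router.put("user_session", "sess-1", {"user": "alice"})
router.put("shopping_cart", "cart-1", {"items": ["sku-42"]})
```

Callers only name the type of data they are working with; which database sits behind each type stays a configuration detail in one place, so a type can be moved to a different store without touching the calling code.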

This comes at a cost in complexity, as each data storage solution means learning a new technology.  But the benefits are worth it: when relational databases are used inappropriately, they can significantly slow down both application development and performance.  Another benefit is that many NoSQL databases are designed to operate over clusters and can handle large volumes of data, so they give you horizontal scaling (scale-out), as opposed to the vertical scaling (scale-up) that most relational databases are limited to.
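To make scale-out a little more concrete, here is a minimal sketch of spreading keys across the nodes of a cluster by hashing.  It is an assumption-laden toy: the node names and the `node_for` helper are hypothetical, and real clustered databases generally use more refined placement schemes (consistent hashing, for example) so that adding a node moves fewer keys.

```python
import hashlib
from typing import Dict, List


def node_for(key: str, nodes: List[str]) -> str:
    """Pick a node for a key by hashing the key (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


# Hypothetical three-node cluster.
nodes = ["node-a", "node-b", "node-c"]

# Place 1000 shopping carts and record which node each one lands on.
placement: Dict[str, List[int]] = {}
for i in range(1000):
    placement.setdefault(node_for(f"cart-{i}", nodes), []).append(i)
```

Each node ends up holding only a share of the carts, so capacity grows by adding nodes rather than by buying a bigger single server.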


About James Serra

James is a big data and data warehousing solution architect at Microsoft. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 25 years of IT experience.

11 Responses to What is Polyglot Persistence?

  1. This is a very clear and helpful post. I’ve done this (I put a link in above with my blog on how we did it) and would add that there is a lot of integration complexity in addition to learning the extra technologies.

    So there’s a balance between getting the benefits like a natural data model and horizontal scale out for big data and the DevOps overhead of multiple data stores.

  4. Peter Heller says:

    How does this tie in to U-SQL in Azure?
    Also, which modeling tools will handle this Polyglot Persistence storage design?

  5. Charles ellis O'Riley Jr. says:

    Thanks James. As someone new to Cypher and graph databases, this is very informative.
