Companies that sell an on-prem software solution and want to move it to the cloud face the challenge of how to architect that solution in the cloud. For example, say you have a software solution that stores patient data for hospitals. You sign up hospitals, install the hardware, software and associated databases on-prem (at the hospital or a co-location facility), and load their patient data. Think of each hospital as a “tenant”. Now you want to move this solution to the cloud and get the many benefits that come with it, the biggest being the time to get a hospital up and running, which can go from months on-prem to hours in the cloud. You have some choices: keep each hospital separate with its own VMs and databases (“single tenant”), or combine the data for all hospitals into one database (“multi-tenant”). As another example, you may simply be creating a PaaS application similar to Salesforce. Here I’ll describe the various cloud strategies:
You create VMs for each tenant, essentially doing a “lift and shift” of the current on-premises solution. This provides the best isolation possible and it’s regularly done on-premises, but it’s also the option that doesn’t cut costs, since each tenant has its own server, SQL, license and so on. Sometimes this is the only allowable option, if your client contract states that their data will be hardware-isolated from other clients. Some cons: table updates must be replicated across all the servers (e.g. when updating reference tables), there is no resource sharing, and you need multiple backup strategies across all the servers.
A new database is created and assigned when a tenant is provisioned. You can land a number of the databases on each VM (e.g. each VM handles ten tenants), or create a database using Azure SQL Database. This is often used when you need to provide isolation for each customer, because you can associate different logins, permissions and so on with each database. If using Azure SQL Database, be aware the database size limit is 1TB. If you have a client database that will exceed that, you can use sharding (via Elastic Database Tools) or use cross-database queries (see Scaling Azure SQL Database) with row-level security (see Multi-tenant applications with elastic database tools and row-level security). The minimum size for a database in SQL Database is 1GB, so you might be paying for storage that you don’t really use. If using Azure SQL Data Warehouse, you have no limit on database size. Some other cons: a different connection pool is required per database, updates must be replicated across all the databases, there is no resource sharing (unless using Elastic Database Pools) and you need multiple backup strategies across all the databases.
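To make the database-per-tenant idea concrete, here is a minimal Python sketch. It uses in-memory SQLite databases as a stand-in for per-tenant Azure SQL databases; the catalog, the function names and the patients table are all hypothetical and are not part of any Azure API.

```python
import sqlite3

# Hypothetical tenant catalog: tenant name -> that tenant's own database.
# In-memory SQLite is only a stand-in for one Azure SQL database per tenant.
catalog: dict[str, sqlite3.Connection] = {}


def provision_tenant(tenant: str) -> sqlite3.Connection:
    """Create a brand-new database for the tenant and register it."""
    if tenant in catalog:
        raise ValueError(f"tenant {tenant!r} is already provisioned")
    conn = sqlite3.connect(":memory:")  # each call yields a separate database
    conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)")
    catalog[tenant] = conn
    return conn


def apply_to_all(sql: str) -> None:
    """Schema changes have to be replayed against every tenant database."""
    for conn in catalog.values():
        conn.execute(sql)


provision_tenant("hospital_a")
provision_tenant("hospital_b")

# Data written for one tenant never shows up in another tenant's database.
catalog["hospital_a"].execute("INSERT INTO patients (name) VALUES ('Alice')")
count_a = catalog["hospital_a"].execute("SELECT COUNT(*) FROM patients").fetchone()[0]
count_b = catalog["hospital_b"].execute("SELECT COUNT(*) FROM patients").fetchone()[0]

# One logical change, executed once per database.
apply_to_all("ALTER TABLE patients ADD COLUMN ward TEXT")
columns_b = [row[1] for row in catalog["hospital_b"].execute("PRAGMA table_info(patients)")]
```

The isolation comes for free, since each tenant has its own connection and its own tables, but the `apply_to_all` loop shows the cost: every update is repeated once per tenant.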
This is also a very good way to achieve multi-tenancy while sharing some resources: everything is inside the same database, but each tenant has its own schema. That even allows you to customize a specific tenant without affecting others. And you save costs by only paying for one database (which can fit on SQL Data Warehouse no matter what the size) or a handful of databases if using SQL Database (e.g. ten tenants per database). Some of the cons: you need to replicate all the database objects in every schema, so the number of objects can increase indefinitely, updates must be replicated across all the schemas, the connection pool for the database must maintain a different connection per tenant (or set of credentials), a different user is required per tenant (which is stored at server level) and you have to back up that user independently.
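The “replicate every object in every schema” cost can be sketched in a few lines of Python. SQLite has no real schemas, so this example assumes attached in-memory databases as a rough stand-in for per-tenant schemas inside one database; the schema names and the patients table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tenant schemas. Attached databases give SQLite a
# schema-like namespace (hospital_a.patients, hospital_b.patients).
TENANT_SCHEMAS = ["hospital_a", "hospital_b"]

for schema in TENANT_SCHEMAS:
    conn.execute(f"ATTACH DATABASE ':memory:' AS {schema}")
    conn.execute(
        f"CREATE TABLE {schema}.patients (id INTEGER PRIMARY KEY, name TEXT)"
    )


def apply_to_every_schema(ddl_template: str) -> None:
    """Run the same DDL once per tenant schema."""
    for schema in TENANT_SCHEMAS:
        conn.execute(ddl_template.format(schema=schema))


# A single logical change becomes one statement per tenant.
apply_to_every_schema("ALTER TABLE {schema}.patients ADD COLUMN ward TEXT")

columns = {
    schema: [row[1] for row in conn.execute(f"PRAGMA {schema}.table_info(patients)")]
    for schema in TENANT_SCHEMAS
}
```

With two tenants this is trivial, but the same loop over hundreds of schemas is where the object count and the rollout effort start to add up.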
A variation of the separate-schema approach using SQL Database is to split the tenants over multiple databases but, for performance reasons, not to use separate schemas. This is done by assigning a distinct set of tenants to each database using a partitioning strategy such as hash, range or list partitioning. This data distribution strategy is often referred to as sharding.
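The three partitioning strategies differ only in how a tenant is mapped to a database. The sketch below shows one plausible way to express each in Python; the shard names, range boundaries and tenant IDs are made up for illustration, and this is not the Elastic Database Tools API.

```python
import bisect
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical database names


def hash_shard(tenant_id: int) -> str:
    """Hash partitioning: spread tenants evenly, with no control over placement."""
    # Use a stable hash so the mapping is the same on every run and machine.
    digest = hashlib.sha256(str(tenant_id).encode()).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# Range partitioning: shard0 holds IDs below 1000, shard1 holds 1000-1999,
# shard2 holds everything from 2000 up.
RANGE_UPPER_BOUNDS = [1000, 2000]


def range_shard(tenant_id: int) -> str:
    return SHARDS[bisect.bisect_right(RANGE_UPPER_BOUNDS, tenant_id)]


# List partitioning: every tenant is placed explicitly.
LIST_MAP = {101: "shard0", 202: "shard1", 303: "shard2"}


def list_shard(tenant_id: int) -> str:
    return LIST_MAP[tenant_id]
```

Hash partitioning needs no lookup table but gives no say over where a tenant lands; range and list partitioning let you place a large tenant deliberately, at the cost of maintaining the mapping.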
Everything is shared in this option: server, database and even schema. All the data for the tenants is within the same tables in one database. The only way tenants are differentiated is by a TenantId or some other column that exists at the table level. Another big benefit is code changes: with this option you only have one spot to change code (e.g. table structure). With the other options you will have to roll out code changes to many spots. You will need to use row-level security or something similar when you need to limit the results to an individual tenant, or you can create views or use stored procedures to filter tenants. You also have the benefit of ease-of-use and performance when you need to aggregate results over multiple tenants. Azure SQL Data Warehouse is a great solution for this, as there is no limit to the database size. But be aware that there is a limit of 32 concurrent queries and 1,024 concurrent connections, so if you have thousands of users who will be hitting the database at the same time, you may want to create data marts or SSAS cubes.
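Here is a minimal Python sketch of the shared-table model, using in-memory SQLite purely for illustration. SQLite has no row-level security, so the example assumes the filtering happens in a data-access function that always applies the tenant predicate; the table, column and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients ("
    "id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, name TEXT)"
)
conn.executemany(
    "INSERT INTO patients (tenant_id, name) VALUES (?, ?)",
    [(1, "Alice"), (1, "Bob"), (2, "Carol")],
)


def patients_for(tenant_id: int) -> list[str]:
    """Return one tenant's rows; the tenant filter is never optional."""
    rows = conn.execute(
        "SELECT name FROM patients WHERE tenant_id = ? ORDER BY name",
        (tenant_id,),
    )
    return [row[0] for row in rows]


# Aggregating across tenants is a single query against a single table.
counts = dict(
    conn.execute("SELECT tenant_id, COUNT(*) FROM patients GROUP BY tenant_id")
)
```

The weak point is that isolation depends on every query carrying the tenant filter, which is why pushing it into the database (row-level security, views or stored procedures) is safer than trusting application code to remember it.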
A great article that discusses the various multi-tenant models in detail and how multi-tenancy is supported with Azure SQL Database is Design Patterns for Multi-tenant SaaS Applications with Azure SQL Database.
As you can see, there are lots of options to consider! It becomes a balance of cost, performance, ease-of-development, ease-of-use, and security.