Jambo, I'm Julia.

Database Scaling Strategies


7 min read

Continuing into my deep dive of databases, I've been learning more about database scaling strategies. It took a while for me to understand how terms like sharding and replication fit together into my overall mental model, and specifically how these related to relational vs. non-relational (NoSQL) databases. Side note here, that if you're fairly new to databases, you might first want to read my post on Relational vs Non-Relational Databases.

Database scalability is the ability of a database to expand or contract its computational resources based on the changing demands of an application. Another commonly used terms for this is database elasticity.

The scaling of database resources is typically classified into vertical and horizontal scaling. Whilst both of these scaling methods can be applied to SQL and NoSQL databases, specific strategies are more suited to one vs. the other. Before I delve into what these are, I first want to talk about other simpler scaling strategies, some of which should probably be implemented first as an application starts scaling.

Improving performance on database read queries

The 3 main strategies to discuss here are:

  • Caching;
  • Database indexing; and
  • Database replication.


In the early days of an application's growth, it is likely that the majority of your database interactions will be down to a few specific read queries. If you're able to identify these, caching the most common queries in memory on the web server could really help reduce the number of calls needed to the database.

The important thing here is to choose the appropriate cache invalidation period to ensure the data in your cache doesn't go stale.

Redis is an example of a popular in-memory cache.

Database indexing

Well-defined database indexes can significantly speed up query searches. By applying a database index on a column that our application is typically queried by, e.g. user_email, we can lower our querying time from linear time (O(n)) to logarithmic time (Olog(n)) as indexing typically utilises binary search.

Database indexing applies to relational databases and some NoSQL databases liked MongoDB.

Horizontal scaling - Replication

Replication is a horizontal scaling (more on that in the next section) strategy where identical copies of a database are created on multiple machines. The key reason for implementing replication is to store backups of up-to-date data, thus increasing availability in the event of say, a node failure resulting in one or multiple databases going down.

Another benefit of replication is that is allows us to practice Command Query Responsibility Segregation (CQRS). There are a lot of other resources you can find which will go into detail on this, but in essence, this is about splitting the write and read operations to different database replicas.

There are 2 common replication strategies, namely primary-to-secondary, and peer-to-peer replication.

Primary-to-secondary replication

A common CQRS implementation strategy is to have a primary "write" database, which is replicated into various secondary "read" databases (read replicas). Databases will not be strictly consistent, but instead, "eventually consistent" since there will be some delay before the latest writes to the primary database is replicated (more on the CAP theorem in my post on Relational vs Non-Relational Databases). This is known as replication lag (the time difference between when a write operation on the primary database is replicated in the read replica). This can result in stale / inconsistent data returned to clients when the read replica is queried.

This strategy used to be referred to as the Master-Slave Replication model but this terminology is being phased out (read MySQL's blog post about it here). Allowing read queries to be spread out amongst a number of secondary databases reduces the load on the primary database, whilst resulting in faster read results. We can also have different read queries routed to different replicas, spreading the load more efficiently.

In the event of any one node or database failing, other databases will be able to pick up the additional read queries, or be promoted to the role of primary "write" database. Note however, that if the primary database fails whilst performing a write, this information will likely be lost if it hasn't yet been replicated on the secondary databases.

Peer-to peer-replication

This is another replication strategy, although one that no longer falls within CQRS. A downside to primary-to-secondary replication is that writes don't scale up since they're all done on the primary database. Peer-to-peer replication fixes that by allowing any node / database to both read and write data. The nodes will then need to coordinate in order to ensure all data is synchronised.

As a final note, both replication strategies can be applied to SQL and NoSQL databases.

Vertical and horizontal scaling

In the section above, we talked about strategies for improving performance for read operations on databases, and touched briefly on how replication is used to store backups of up-to-date data. Let's now talk about how we can increase our available resources to better handle data storage and further improve our query handling, through horizontal and vertical scaling.

Vertical scaling

Vertical scaling (or scaling up) is done by adding resources to an individual database server (CPU and memory). Whilst both relational and non-relational databases can be scaled vertically, it is more ideal for relational databases.


  • No code changes are needed since databases read and write operations will be applied in the same way.
  • It is a simple strategy in the sense that all data lives in a single server, with no need to manage multiple instances.


  • Hardware capability will impose a maximum limit on vertical scalability at some point. Hardware will also get increasingly expensive, another limiting factor.
  • Increasing vertical scalability does not help with improving the performance of multiple queries simultaneously.
  • Migrations become increasingly difficult and may require taking the database offline.
  • Not very resilient since we're dealing with a single server.

Horizontal scaling - Sharding

Horizontal scaling (or scaling out) is achieved by adding more machines to our cluster. Two commonly used horizontal scaling techniques are (i) replication (which we discussed above); and (ii) horizontal partitioning (or sharding).

Sharding differs from replication in that each machine (or server) is only responsible for a subset of the data (data shard) it stores. Any data request will first need to go through a hashing process, so that it can be redirected to the server which contains the corresponding data shard.

Sharding is best suited to NoSQL databases. Whilst it can be applied to relational databases, it is much more difficult to implement as they need to adhere to strict Consistency and Atomicity rules. As it is harder to perform cross-shard queries and joins, sharding might limit the types of queries that can be performed.


  • Increased performance as each database can both write and read queries in parallel.
  • No need for downtime whenever a new machine is added.
  • Cheaper when compared to vertical scaling at higher levels.
  • Increased resilience and availability since only a portion of an overall dataset will be unavailable if a shard goes offline.


  • Additional complexity due to need for a load balancer to distribute requests and redirect them to the right shard.
  • It can be difficult to chose the right sharding key, leading to inconsistent load distribution across servers.
  • Difficult to join data across servers. It is thus better suited to non-relational databases that store data in self-contained objects like key-value pairs and JSON documents.
  • Only eventual consistency is possible. It is thus not suited to transactions requiring atomicity.

NoSQL Databases

Another potential workaround to sharding might be to migrate a subset of the data that does not rely on the relational model, to NoSQL. NoSQL databases are easier to scale horizontally, and offer high write rates. The downside is more flexible schemas.

Separating the databases in such a way would reduce the read load on the relational database, allowing more resources to be allocated to complex queries.

© 2016-2023 Julia Tan · Powered by Next JS.