developer tip

데이터베이스의 수평 및 수직 확장의 차이점

copycodes 2020. 10. 2. 22:50
반응형

데이터베이스의 수평 및 수직 확장의 차이점


많은 NoSQL 데이터베이스와 SQL 데이터베이스를 접했습니다. 이러한 데이터베이스의 강점과 약점을 측정하기위한 다양한 매개 변수가 있으며 확장 성은 그중 하나입니다. 이러한 데이터베이스의 수평 확장과 수직 확장의 차이점은 무엇입니까?


수평 적 확장 은 리소스 풀에 더 많은 머신추가하여 확장하는 것을 의미하는 반면, 수직적 확장은 기존 머신에 더 많은 전력 (CPU, RAM)을 추가하여 확장하는 것을 의미합니다 .

이것을 기억하는 쉬운 방법은 서버 랙에있는 기계를 생각하는 것입니다. 수평 방향으로 기계를 더 추가하고 수직 방향으로 기계에 더 많은 자원을 추가 합니다.

                  수평 확장 / 수직 확장 시각화

데이터베이스 세계에서 수평 확장은 종종 데이터 분할을 기반으로합니다. 즉, 각 노드는 데이터의 일부만 포함하고, 수직 확장에서는 데이터가 단일 노드에 있으며, 확장은 다중 코어를 통해 수행됩니다. 해당 시스템의 CPU 및 RAM 리소스.

수평 확장을 사용하면 기존 풀에 더 많은 시스템을 추가하여 동적으로 확장하는 것이 더 쉬운 경우가 많습니다. 수직 확장은 종종 단일 시스템의 용량으로 제한되며, 해당 용량을 초과하는 확장에는 종종 다운 타임이 수반되며 상한이 따릅니다.

Good examples of horizontal scaling are Cassandra, MongoDB, Google Cloud Spanner .. and a good example of vertical scaling is MySQL - Amazon RDS (The cloud version of MySQL). It provides an easy way to scale vertically by switching from small to bigger machines. This process often involves downtime.

In-Memory Data Grids such as GigaSpaces XAP, Coherence etc.. are often optimized for both horizontal and vertical scaling simply because they're not bound to disk. Horizontal-scaling through partitioning and vertical-scaling through multi-core support.

You can read more on this subject in my earlier posts: Scale-out vs Scale-up and The Common Principles Behind the NOSQL Alternatives


In simple words:

Scaling horizontally ===> Thousands of minions will do the work together for you.

Scaling vertically ===> One big hulk will do all the work for you.


Let's start with the need for scaling that is increasing resources so that your system can now handle more requests than it earlier could.

When you realise your system is getting slow and is unable to handle the current number of requests, you need to scale the system.

This provides you with two options. Either you increase the resources in the server which you are using currently, i.e, increase the amount of RAM, CPU, GPU and other resources. This is known as vertical scaling.

Vertical scaling is typically costly. It does not make the system fault tolerant, i.e if you are scaling application running with single server, if that server goes down, your system will go down. Also the amount of threads remains the same in vertical scaling. Vertical scaling may require your system to go down for a moment when process takes place. Increasing resources on a server requires a restart and put your system down.

Another solution to this problem is increasing the amount of servers present in the system. This solution is highly used in the tech industry. This will eventually decrease the request per second rate in each server. If you need to scale the system, just add another server, and you are done. You would not be required to restart the system. Number of threads in each system decreases leading to high throughput. To segregate the requests, equally to each of the application server, you need to add load balancer which would act as reverse proxy to the web servers. This whole system can be called as a single cluster. Your system may contain a large number of requests which would require more amount of clusters like this.

Hope you get the whole concept of introducing scaling to the system.


There is an additional architecture that wasn't mentioned - SQL-based database services that enable horizontal scaling without the complexity of manual sharding. These services do the sharding in the background, so they enable you to run a traditional SQL database and scale out like you would with NoSQL engines like MongoDB or CouchDB. Two services I am familiar with are EnterpriseDB for PostgreSQL and Xeround for MySQL. I saw an in-depth post by Xeround which explains why scale-out on SQL databases is difficult and how they do it differently - treat this with a grain of salt as it is a vendor post. Also check out Wikipedia's Cloud Database entry, there is a nice explanation of SQL vs. NoSQL and service vs. self-hosted, a list of vendors and scaling options for each combination. ;)


Yes scaling horizontally means adding more machines, but it also implies that the machines are equal in the cluster. MySQL can scale horizontally in terms of Reading data, through the use of replicas, but once it reaches capacity of the server mem/disk, you have to begin sharding data across servers. This becomes increasingly more complex. Often keeping data consistent across replicas is a problem as replication rates are often too slow to keep up with data change rates.

Couchbase is also a fantastic NoSQL Horizontal Scaling database, used in many commercial high availability applications and games and arguably the highest performer in the category. It partitions data automatically across cluster, adding nodes is simple, and you can use commodity hardware, cheaper vm instances (using Large instead of High Mem, High Disk machines at AWS for instance). It is built off the Membase (Memcached) but adds persistence. Also, in the case of Couchbase, every node can do reads and writes, and are equals in the cluster, with only failover replication (not full dataset replication across all servers like in mySQL).

Performance-wise, you can see an excellent Cisco benchmark: http://blog.couchbase.com/understanding-performance-benchmark-published-cisco-and-solarflare-using-couchbase-server

Here is a great blog post about Couchbase Architecture: http://horicky.blogspot.com/2012/07/couchbase-architecture.html


Traditional relational databases were designed as client/server database systems. They can be scaled horizontally but the process to do so tends to be complex and error prone. NewSQL databases like NuoDB are memory-centric distributed database systems designed to scale out horizontally while maintaining the SQL/ACID properties of traditional RDBMS.

For more information on NuoDB, read their technical white paper.


SQL databases like Oracle, db2 also support Horizontal scaling through Shared disk cluster. For example Oracle RAC, IBM DB2 purescale or Sybase ASE Cluster edition. New node can be added to Oracle RAC system or DB2 purescale system to achieve horizontal scaling.

But the approach is different from noSQL databases (like mongodb, CouchDB or IBM Cloudant) is that the data sharding is not part of Horizontal scaling. In noSQL databases data is shraded during horizontal scaling.


All the other answers seem already quite complete but I did not see Google Cloud Spanner as an example of a relational database with horizontal scaling, that's why I am adding my little contribution.


Adding lots of load balancers creates extra overhead and latency and that is the drawback for scaling out horizontally in nosql databases. It is like the question why people say RPC is not recommended since it is not robust.

I think in a real system we should use both sql and nosql databases to utilize both multicore and cloud computing capabilities of today's systems.

On the other hand, complex transactional queries has high performance if sql databases such as oracle being used. NoSql could be used for bigdata and horizontal scalability by sharding.


The accepted answer is spot on the basic definition of horizontal vs vertical scaling. But unlike the common belief that horizontal scaling of databases is only possible with Cassandra, MongoDB, etc I would like to add that horizontal scaling is also very much possible with any traditional RDMS; that too without using any third party solutions.

I know of many companies, specially SaaS based companies that do this. This is done using simple application logic. You basically take a set of users and divide them over multiple DB servers. So for example, you would typically have a "meta" database/table that would store clients, DB server/connection strings, etc and a table that stores client/server mapping.

Then simply direct requests from each client to the DB server they are mapped to.

Now some may say this is akin to horizontal partitioning and not "true" horizontal scaling and they will be right in some ways. But the end result is that you have scaled your DB over multiple Db servers.

수평 확장에 대한 두 가지 접근 방식의 유일한 차이점은 확장이 DB 소프트웨어 자체에 의해 수행되는 한 가지 접근 방식 (MongoDB 등)입니다. 그런 의미에서 스케일링을 "구매"하는 것입니다. 다른 접근 방식 (RDBMS 수평 확장의 경우)에서는 확장이 애플리케이션 코드 / 로직에 의해 구축됩니다.

구매 vs 빌드

참고 URL : https://stackoverflow.com/questions/11707879/difference-between-scaling-horizontally-and-vertically-for-databases

반응형