How to use quiesce to expand a cluster vertically

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Context

A key feature of Aerospike is the ability to add capacity to a cluster by scaling horizontally without a performance penalty. At a certain point, however, it may be preferable to replace a cluster of numerous smaller nodes with fewer, larger-capacity nodes, whether to reduce data centre footprint or simply to refresh onto newer hardware. Swapping out nodes one by one is problematic: because data is distributed evenly across the cluster, the last of the smaller nodes may not have sufficient capacity to hold their share of records once the cluster contains fewer, larger nodes. For that reason it is best to swap over to the larger nodes in one operation. This article details how that can be done using quiesce, maintaining replication factor throughout and without interrupting the client workload.
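The capacity problem with a one-by-one swap can be illustrated with some simple arithmetic. The figures below are illustrative only (6 small nodes at roughly 40% disk usage, as in the test cluster used later in this article); Aerospike spreads data approximately evenly per node, so each node holds about total_data / cluster_size.

```python
# Illustrative arithmetic only: why replacing nodes one at a time can
# overload the remaining small nodes. Data is measured in units of one
# small node's disk.

def small_node_usage(n_small_start, start_usage, cluster_size):
    """Approximate disk usage on each remaining small node once the
    cluster has shrunk (or mixed) down to cluster_size nodes."""
    total_data = n_small_start * start_usage  # total data in small-disk units
    return total_data / cluster_size

# 6 small nodes at ~40% usage, consolidating towards 3 large nodes.
# The worst intermediate state of a one-by-one swap is 3 nodes
# (2 small + 1 large), where each small node must hold 1/3 of the data.
for size in (6, 5, 4, 3):
    print(f"cluster of {size}: small nodes at ~{small_node_usage(6, 0.40, size):.0%}")
```

With the default disk high-water mark of 50% and stop-writes at 90%, even the intermediate states of a gradual swap push the small nodes past the high-water mark, which is why the swap is done in a single step instead.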

Method

Initial State

The initial state is a 6 node cluster with some 15 million records in the bar namespace. Replication factor is 2 and the nodes are at around 40% disk usage. The cluster is stable and not migrating.

Admin> summary -l
Cluster
=======

   1.   Server Version     :  E-4.5.1.5
   2.   OS Version         :  Ubuntu 18.04.1 LTS (4.9.125-linuxkit)
   3.   Cluster Size       :  6
   4.   Devices            :  Total 6, per-node 1
   5.   Memory             :  Total 48.000 GB, 5.67% used (2.720 GB), 94.33% available (45.280 GB)
   6.   Disk               :  Total 6.000 GB, 37.45% used (2.247 GB), 61.00% available contiguous space (3.660 GB)
   7.   Usage (Unique Data):  0.000 B  in-memory, 547.218 MB on-disk
   8.   Active Namespaces  :  1 of 2
   9.   Features           :  KVS, Scan


Namespaces
==========

   test
   ====
   1.   Devices            :  Total 0, per-node 0
   2.   Memory             :  Total 24.000 GB, 0.00% used (0.000 B), 100.00% available (24.000 GB)
   3.   Replication Factor :  2
   4.   Rack-aware         :  False
   5.   Master Objects     :  0.000  
   6.   Usage (Unique Data):  0.000 B  in-memory, 0.000 B  on-disk

   bar
   ===
   1.   Devices            :  Total 6, per-node 1
   2.   Memory             :  Total 24.000 GB, 11.33% used (2.720 GB), 88.67% available (21.280 GB)
   3.   Disk               :  Total 6.000 GB, 37.45% used (2.247 GB), 61.00% available contiguous space (3.660 GB)
   4.   Replication Factor :  2
   5.   Rack-aware         :  False
   6.   Master Objects     :  15.100 M
   7.   Usage (Unique Data):  0.000 B  in-memory, 547.218 MB on-disk



Admin> info
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2019-03-05 17:07:11 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             Node               Node                Ip       Build   Cluster   Migrations        Cluster     Cluster         Principal   Client     Uptime   
                .                 Id                 .           .      Size            .            Key   Integrity                 .    Conns          .   
0534b209248c:3000   BB9030011AC4202    172.17.0.3:3000   E-4.5.1.5         6      0.000     D8B2B3140412   True        BB9080011AC4202        2   02:47:17   
172.17.0.4:3000     BB9040011AC4202    172.17.0.4:3000   E-4.5.1.5         6      0.000     D8B2B3140412   True        BB9080011AC4202        3   02:47:18   
172.17.0.5:3000     BB9050011AC4202    172.17.0.5:3000   E-4.5.1.5         6      0.000     D8B2B3140412   True        BB9080011AC4202        3   02:47:18   
172.17.0.6:3000     BB9060011AC4202    172.17.0.6:3000   E-4.5.1.5         6      0.000     D8B2B3140412   True        BB9080011AC4202        2   02:47:18   
172.17.0.7:3000     BB9070011AC4202    172.17.0.7:3000   E-4.5.1.5         6      0.000     D8B2B3140412   True        BB9080011AC4202        5   02:47:18   
172.17.0.8:3000     *BB9080011AC4202   172.17.0.8:3000   E-4.5.1.5         6      0.000     D8B2B3140412   True        BB9080011AC4202        2   02:47:18   
Number of rows: 6

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Usage Information (2019-03-05 17:07:11 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total   Expirations,Evictions     Stop         Disk    Disk     HWM   Avail%          Mem     Mem    HWM      Stop   
        .                   .    Records                       .   Writes         Used   Used%   Disk%        .         Used   Used%   Mem%   Writes%   
bar         0534b209248c:3000    4.963 M   (0.000,  0.000)         false    378.614 MB   37      50      62       446.105 MB   11      60     90        
bar         172.17.0.4:3000      5.154 M   (0.000,  0.000)         false    393.222 MB   39      50      60       463.314 MB   12      60     90        
bar         172.17.0.5:3000      4.855 M   (0.000,  0.000)         false    370.378 MB   37      50      62       436.401 MB   11      60     90        
bar         172.17.0.6:3000      4.927 M   (0.000,  0.000)         false    375.895 MB   37      50      62       442.902 MB   11      60     90        
bar         172.17.0.7:3000      4.865 M   (0.000,  0.000)         false    371.138 MB   37      50      62       437.294 MB   11      60     90        
bar         172.17.0.8:3000      5.237 M   (0.000,  0.000)         false    399.540 MB   40      50      60       470.762 MB   12      60     90        
bar                             30.000 M   (0.000,  0.000)                    2.235 GB                              2.634 GB                            
test        0534b209248c:3000    0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.4:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.5:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.6:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.7:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.8:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test                             0.000     (0.000,  0.000)                    0.000 B                               0.000 B                             
Number of rows: 14

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-03-05 17:07:11 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total     Repl                         Objects                   Tombstones             Pending   Rack   
        .                   .    Records   Factor      (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)            Migrates     ID   
        .                   .          .        .                               .                            .             (tx,rx)      .   
bar         0534b209248c:3000    4.963 M   2        (2.444 M, 2.519 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.4:3000      5.154 M   2        (2.620 M, 2.534 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.5:3000      4.855 M   2        (2.464 M, 2.391 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.6:3000      4.927 M   2        (2.421 M, 2.506 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.7:3000      4.865 M   2        (2.370 M, 2.494 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.8:3000      5.237 M   2        (2.681 M, 2.556 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar                             30.000 M            (15.000 M, 15.000 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)            
test        0534b209248c:3000    0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.4:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.5:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.6:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.7:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.8:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test                             0.000              (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)            
Number of rows: 14

Admin> 

The plan is to replace these 6 nodes with 3 nodes that have larger disks. This should take place without reducing the replication factor and without affecting the client workload. Here, the Aerospike Java benchmark tool is used to provide a workload of 50% read / 50% update across 100,000 keys.

root@dafbcaffcac9:~/java/aerospike-client-java-4.3.1/benchmarks# ./run_benchmarks -h 172.17.0.3 -p 3000 -n bar -k 100000 -w RU,50 -S 1 -z 10
Benchmark: 172.17.0.3 3000, namespace: bar, set: testset, threads: 10, workload: READ_UPDATE
read: 50% (all bins: 100%, single bin: 0%), write: 50% (all bins: 100%, single bin: 0%)
keys: 100000, start key: 1, transactions: 0, bins: 1, random values: false, throughput: unlimited

Add new nodes

Here the 3 new nodes have been added to the cluster and migrations are ongoing. In this test the new nodes are of the same specification as the originals, apart from having larger disks. This is reflected in the asadm excerpts below:

Admin> summary -l
Cluster  (Migrations in Progress)
=================================

   1.   Server Version     :  E-4.5.1.5
   2.   OS Version         :  Ubuntu 18.04.1 LTS (4.9.125-linuxkit)
   3.   Cluster Size       :  9
   4.   Devices            :  Total 9, per-node 1
   5.   Memory             :  Total 72.000 GB, 3.94% used (2.840 GB), 96.06% available (69.160 GB)
   6.   Disk               :  Total 15.000 GB, 15.02% used (2.252 GB), 83.80% available contiguous space (12.570 GB)
   7.   Usage (Unique Data):  0.000 B  in-memory, 982.011 MB on-disk
   8.   Active Namespaces  :  1 of 2
   9.   Features           :  KVS, Scan


Namespaces
==========

   bar
   ===
   1.   Devices            :  Total 9, per-node 1
   2.   Memory             :  Total 36.000 GB, 7.89% used (2.840 GB), 92.11% available (33.160 GB)
   3.   Disk               :  Total 15.000 GB, 15.02% used (2.252 GB), 83.80% available contiguous space (12.570 GB)
   4.   Replication Factor :  2
   5.   Rack-aware         :  False
   6.   Master Objects     :  15.100 M
   7.   Usage (Unique Data):  0.000 B  in-memory, 982.011 MB on-disk


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Usage Information (2019-03-06 15:22:21 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total   Expirations,Evictions     Stop         Disk    Disk     HWM   Avail%          Mem     Mem    HWM      Stop   
        .                   .    Records                       .   Writes         Used   Used%   Disk%        .         Used   Used%   Mem%   Writes%   
bar         0534b209248c:3000    4.996 M   (0.000,  0.000)         false    380.648 MB   38      50      61       448.583 MB   11      60     90        
bar         172.17.0.10:3000    39.786 K   (0.000,  0.000)         false      2.966 MB   1       50      99         3.506 MB   1       60     90        
bar         172.17.0.11:3000    40.584 K   (0.000,  0.000)         false      3.031 MB   1       50      99         3.583 MB   1       60     90        
bar         172.17.0.12:3000    25.902 K   (0.000,  0.000)         false      1.908 MB   1       50      99         2.259 MB   1       60     90        
bar         172.17.0.4:3000      5.189 M   (0.000,  0.000)         false    395.327 MB   39      50      60       465.880 MB   12      60     90        
bar         172.17.0.5:3000      4.887 M   (0.000,  0.000)         false    372.362 MB   37      50      62       438.819 MB   11      60     90        
bar         172.17.0.6:3000      4.960 M   (0.000,  0.000)         false    377.889 MB   37      50      62       445.332 MB   11      60     90        
bar         172.17.0.7:3000      4.897 M   (0.000,  0.000)         false    373.103 MB   37      50      62       439.689 MB   11      60     90        
bar         172.17.0.8:3000      5.272 M   (0.000,  0.000)         false    401.666 MB   40      50      59       473.353 MB   12      60     90        
bar                             30.306 M   (0.000,  0.000)                    2.255 GB                              2.657 GB                            
test        0534b209248c:3000    0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.10:3000     0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.11:3000     0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.12:3000     0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.4:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.5:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.6:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.7:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.8:3000      0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test                             0.000     (0.000,  0.000)                    0.000 B                               0.000 B                             
Number of rows: 20

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-03-06 15:22:21 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total     Repl                         Objects                   Tombstones                 Pending   Rack   
        .                   .    Records   Factor      (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)                Migrates     ID   
        .                   .          .        .                               .                            .                 (tx,rx)      .   
bar         0534b209248c:3000    4.996 M   2        (2.450 M, 1.076 M, 1.470 M)     (0.000,  0.000,  0.000)      (615.000,  493.000)     0      
bar         172.17.0.10:3000    39.766 K   2        (35.476 K, 4.290 K, 0.000)      (0.000,  0.000,  0.000)      (949.000,  964.000)     0      
bar         172.17.0.11:3000    40.314 K   2        (35.605 K, 4.709 K, 0.000)      (0.000,  0.000,  0.000)      (1.143 K, 953.000)      0      
bar         172.17.0.12:3000    26.089 K   2        (21.250 K, 4.839 K, 0.000)      (0.000,  0.000,  0.000)      (771.000,  939.000)     0      
bar         172.17.0.4:3000      5.189 M   2        (2.620 M, 1.153 M, 1.416 M)     (0.000,  0.000,  0.000)      (632.000,  741.000)     0      
bar         172.17.0.5:3000      4.887 M   2        (2.462 M, 1.047 M, 1.378 M)     (0.000,  0.000,  0.000)      (583.000,  665.000)     0      
bar         172.17.0.6:3000      4.960 M   2        (2.419 M, 1.050 M, 1.490 M)     (0.000,  0.000,  0.000)      (593.000,  648.000)     0      
bar         172.17.0.7:3000      4.897 M   2        (2.368 M, 1.089 M, 1.440 M)     (0.000,  0.000,  0.000)      (606.000,  553.000)     0      
bar         172.17.0.8:3000      5.272 M   2        (2.688 M, 1.038 M, 1.546 M)     (0.000,  0.000,  0.000)      (649.000,  597.000)     0      
bar                             30.306 M            (15.100 M, 6.466 M, 8.740 M)    (0.000,  0.000,  0.000)      (6.541 K, 6.553 K)             
test        0534b209248c:3000    0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (392.000,  0.000)       0      
test        172.17.0.10:3000     0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (115.000,  910.000)     0      
test        172.17.0.11:3000     0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (119.000,  897.000)     0      
test        172.17.0.12:3000     0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (113.000,  905.000)     0      
test        172.17.0.4:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (409.000,  0.000)       0      
test        172.17.0.5:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (376.000,  0.000)       0      
test        172.17.0.6:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (397.000,  0.000)       0      
test        172.17.0.7:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (376.000,  0.000)       0      
test        172.17.0.8:3000      0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (415.000,  0.000)       0      
test                             0.000              (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (2.712 K, 2.712 K)             
Number of rows: 20

Admin> 

As expected, there is no interruption to the client workload; no errors are reported, and the client log shows the 3 new nodes being discovered:

2019-03-06 15:22:04.678 write(tps=3985 timeouts=0 errors=0) read(tps=4003 timeouts=0 errors=0) total(tps=7988 timeouts=0 errors=0)
2019-03-06 15:22:05.679 write(tps=1800 timeouts=0 errors=0) read(tps=1757 timeouts=0 errors=0) total(tps=3557 timeouts=0 errors=0)
2019-03-06 15:22:06.680 write(tps=1125 timeouts=0 errors=0) read(tps=1145 timeouts=0 errors=0) total(tps=2270 timeouts=0 errors=0)
2019-03-06 15:22:07.686 write(tps=1305 timeouts=0 errors=0) read(tps=1297 timeouts=0 errors=0) total(tps=2602 timeouts=0 errors=0)
2019-03-06 15:22:07.727 INFO Thread tend Add node BB90C0011AC4202 172.17.0.12 3000
2019-03-06 15:22:07.730 INFO Thread tend Add node BB90B0011AC4202 172.17.0.11 3000
2019-03-06 15:22:07.733 INFO Thread tend Add node BB90A0011AC4202 172.17.0.10 3000
2019-03-06 15:22:08.686 write(tps=1070 timeouts=2 errors=0) read(tps=1084 timeouts=0 errors=0) total(tps=2154 timeouts=2 errors=0)
2019-03-06 15:22:09.687 write(tps=1332 timeouts=0 errors=0) read(tps=1371 timeouts=0 errors=0) total(tps=2703 timeouts=0 errors=0)
2019-03-06 15:22:10.689 write(tps=1204 timeouts=0 errors=0) read(tps=1242 timeouts=0 errors=0) total(tps=2446 timeouts=0 errors=0)
2019-03-06 15:22:11.690 write(tps=1672 timeouts=0 errors=0) read(tps=1674 timeouts=0 errors=0) total(tps=3346 timeouts=0 errors=0)
2019-03-06 15:22:12.690 write(tps=1380 timeouts=0 errors=0) read(tps=1429 timeouts=0 errors=0) total(tps=2809 timeouts=0 errors=0)
2019-03-06 15:22:13.691 write(tps=1318 timeouts=0 errors=0) read(tps=1274 timeouts=0 errors=0) total(tps=2592 timeouts=0 errors=0)
2019-03-06 15:22:14.691 write(tps=1169 timeouts=0 errors=0) read(tps=1228 timeouts=0 errors=0) total(tps=2397 timeouts=0 errors=0)

Quiesce original 6 nodes

Once the new nodes are in the cluster, the original 6 nodes can be quiesced. It is not necessary to wait for migrations to complete before quiescing, as a quiesced node does not give up master ownership of a partition until that partition has been migrated out. The quiesce command is executed against each node, and the status is then checked via the pending_quiesce statistic.
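Rather than issuing six calls by hand, the per-node commands can be generated with a short script. This is a sketch: the IP list matches this example cluster, and the echo makes it a dry run that prints each asinfo command rather than executing it.

```shell
#!/bin/sh
# Dry run: print the asinfo command used to quiesce each node being
# retired. The IP list matches the example cluster in this article;
# substitute your own addresses, then paste the printed commands into
# the asadm prompt (or run each via asadm -e) to execute them.
for ip in 172.17.0.3 172.17.0.4 172.17.0.5 172.17.0.6 172.17.0.7 172.17.0.8; do
  echo "asinfo -v 'quiesce:' with $ip"
done
```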

Admin> asinfo -v 'quiesce:' with 172.17.0.3
0534b209248c:3000 (172.17.0.3) returned:
ok

Admin> asinfo -v 'quiesce:' with 172.17.0.4
172.17.0.4:3000 (172.17.0.4) returned:
ok

Admin> asinfo -v 'quiesce:' with 172.17.0.5
172.17.0.5:3000 (172.17.0.5) returned:
ok

Admin> asinfo -v 'quiesce:' with 172.17.0.6
172.17.0.6:3000 (172.17.0.6) returned:
ok

Admin> asinfo -v 'quiesce:' with 172.17.0.7
172.17.0.7:3000 (172.17.0.7) returned:
ok

Admin> asinfo -v 'quiesce:' with 172.17.0.8
172.17.0.8:3000 (172.17.0.8) returned:
ok

Admin> show statistics like pending_quiesce
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~bar Namespace Statistics (2019-03-06 16:28:22 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE           :   0534b209248c:3000   172.17.0.10:3000   172.17.0.11:3000   172.17.0.12:3000   172.17.0.4:3000   172.17.0.5:3000   172.17.0.6:3000   172.17.0.7:3000   172.17.0.8:3000   
pending_quiesce:   true                false              false              false              true              true              true              true              true              

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Statistics (2019-03-06 16:28:22 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE           :   0534b209248c:3000   172.17.0.10:3000   172.17.0.11:3000   172.17.0.12:3000   172.17.0.4:3000   172.17.0.5:3000   172.17.0.6:3000   172.17.0.7:3000   172.17.0.8:3000   
pending_quiesce:   true                false              false              false              true              true              true              true              true              

At this point nothing has happened: the quiesce command does not take effect until a recluster command is issued, as follows:

Admin> asinfo -v 'recluster:'
172.17.0.8:3000 (172.17.0.8) returned:
ignored-by-non-principal

172.17.0.11:3000 (172.17.0.11) returned:
ignored-by-non-principal

0534b209248c:3000 (172.17.0.3) returned:
ignored-by-non-principal

172.17.0.4:3000 (172.17.0.4) returned:
ignored-by-non-principal

172.17.0.7:3000 (172.17.0.7) returned:
ignored-by-non-principal

172.17.0.6:3000 (172.17.0.6) returned:
ignored-by-non-principal

172.17.0.12:3000 (172.17.0.12) returned:
ok

172.17.0.5:3000 (172.17.0.5) returned:
ignored-by-non-principal

172.17.0.10:3000 (172.17.0.10) returned:
ignored-by-non-principal

Monitor migrations

The quiesced nodes continue to take traffic for any partition for which they are still master. A node gives up the master role for a partition once that partition has fully migrated out to its new master node. In the interval between the master role changing and the clients receiving a new partition map, the quiesced nodes proxy transactions to the new masters. The client workload continues uninterrupted.

Admin> info

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-03-06 16:30:46 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total     Repl                          Objects                   Tombstones               Pending   Rack   
        .                   .    Records   Factor       (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)              Migrates     ID   
        .                   .          .        .                                .                            .               (tx,rx)      .   
bar         0534b209248c:3000    3.301 M   2        (1.013 M, 0.000,  2.288 M)       (0.000,  0.000,  0.000)      (275.000,  0.000)     0      
bar         172.17.0.10:3000     3.384 M   2        (2.959 M, 424.843 K, 0.000)      (0.000,  0.000,  0.000)      (1.239 K, 1.789 K)    0      
bar         172.17.0.11:3000     3.324 M   2        (2.902 M, 421.437 K, 0.000)      (0.000,  0.000,  0.000)      (1.237 K, 1.797 K)    0      
bar         172.17.0.12:3000     3.346 M   2        (2.893 M, 453.114 K, 0.000)      (0.000,  0.000,  0.000)      (1.273 K, 1.884 K)    0      
bar         172.17.0.4:3000      3.558 M   2        (1.122 M, 0.000,  2.436 M)       (0.000,  0.000,  0.000)      (304.000,  0.000)     0      
bar         172.17.0.5:3000      3.321 M   2        (1.085 M, 0.000,  2.236 M)       (0.000,  0.000,  0.000)      (294.000,  0.000)     0      
bar         172.17.0.6:3000      3.271 M   2        (973.272 K, 0.000,  2.297 M)     (0.000,  0.000,  0.000)      (264.000,  0.000)     0      
bar         172.17.0.7:3000      3.228 M   2        (991.724 K, 0.000,  2.236 M)     (0.000,  0.000,  0.000)      (269.000,  0.000)     0      
bar         172.17.0.8:3000      3.524 M   2        (1.164 M, 0.000,  2.360 M)       (0.000,  0.000,  0.000)      (316.000,  0.000)     0      
bar                             30.256 M            (15.104 M, 1.299 M, 13.853 M)    (0.000,  0.000,  0.000)      (5.471 K, 5.470 K)           
test        0534b209248c:3000    0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (21.000,  0.000)      0      
test        172.17.0.10:3000     0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (1.155 K, 1.177 K)    0      
test        172.17.0.11:3000     0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (1.157 K, 1.073 K)    0      
test        172.17.0.12:3000     0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (1.079 K, 1.309 K)    0      
test        172.17.0.4:3000      0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (37.000,  0.000)      0      
test        172.17.0.5:3000      0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (28.000,  0.000)      0      
test        172.17.0.6:3000      0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (6.000,  0.000)       0      
test        172.17.0.7:3000      0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (34.000,  0.000)      0      
test        172.17.0.8:3000      0.000     2        (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (24.000,  0.000)      0      
test                             0.000              (0.000,  0.000,  0.000)          (0.000,  0.000,  0.000)      (3.541 K, 3.559 K)           
Number of rows: 20

Shutdown quiesced nodes

Once migrations are finished, the old nodes can be shut down.
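"Finished" can be checked mechanically: a quiesced node is safe to stop once it reports no master objects and no pending migrations, as the asadm output below shows. A minimal sketch of that check (the field names here are illustrative, mirroring asadm's Master objects and Pending Migrates (tx,rx) columns, not any client API):

```python
# Hypothetical helper: decide whether a set of quiesced nodes is safe to
# shut down, given per-node figures as reported by asadm. The dict keys
# are illustrative names for the Master objects and Pending Migrates
# (tx,rx) columns.

def safe_to_shutdown(quiesced_nodes):
    """A quiesced node is safe to stop once it holds no master objects
    and has no pending outbound or inbound migrations."""
    return all(
        n["master_objects"] == 0 and n["migrate_tx"] == 0 and n["migrate_rx"] == 0
        for n in quiesced_nodes
    )

# Mid-migration (figures echo the earlier asadm output): not yet safe.
during = [{"master_objects": 1_013_000, "migrate_tx": 275, "migrate_rx": 0}]
print(safe_to_shutdown(during))  # False

# After migrations complete: safe.
done = [{"master_objects": 0, "migrate_tx": 0, "migrate_rx": 0}]
print(safe_to_shutdown(done))    # True
```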

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-03-06 17:43:55 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total     Repl                           Objects                   Tombstones             Pending   Rack   
        .                   .    Records   Factor        (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)            Migrates     ID   
        .                   .          .        .                                 .                            .             (tx,rx)      .   
bar         0534b209248c:3000    3.301 M   2        (0.000,  0.000,  3.301 M)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.10:3000     9.974 M   2        (4.991 M, 4.982 M, 0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.11:3000     9.940 M   2        (4.998 M, 4.941 M, 0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.12:3000    10.287 M   2        (5.111 M, 5.176 M, 0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.4:3000      3.558 M   2        (0.000,  0.000,  3.558 M)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.5:3000      3.321 M   2        (0.000,  0.000,  3.321 M)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.6:3000      3.271 M   2        (0.000,  0.000,  3.271 M)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.7:3000      3.228 M   2        (0.000,  0.000,  3.228 M)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.8:3000      3.524 M   2        (0.000,  0.000,  3.524 M)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar                             50.403 M            (15.100 M, 15.100 M, 20.203 M)    (0.000,  0.000,  0.000)      (0.000,  0.000)            
test        0534b209248c:3000    0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.10:3000     0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.11:3000     0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.12:3000     0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.4:3000      0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.5:3000      0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.6:3000      0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.7:3000      0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.8:3000      0.000     2        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test                             0.000              (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)            
Number of rows: 20

Admin>

The client output looks as follows:

2019-03-06 17:44:57.780 write(tps=6414 timeouts=0 errors=0) read(tps=6355 timeouts=0 errors=0) total(tps=12769 timeouts=0 errors=0)
2019-03-06 17:44:58.782 write(tps=6411 timeouts=0 errors=0) read(tps=6100 timeouts=0 errors=0) total(tps=12511 timeouts=0 errors=0)
2019-03-06 17:44:59.786 write(tps=6196 timeouts=0 errors=0) read(tps=6259 timeouts=0 errors=0) total(tps=12455 timeouts=0 errors=0)
2019-03-06 17:45:00.787 write(tps=5977 timeouts=0 errors=0) read(tps=6094 timeouts=0 errors=0) total(tps=12071 timeouts=0 errors=0)
2019-03-06 17:45:01.789 write(tps=6427 timeouts=0 errors=0) read(tps=6378 timeouts=0 errors=0) total(tps=12805 timeouts=0 errors=0)
2019-03-06 17:45:02.792 write(tps=5991 timeouts=0 errors=0) read(tps=6112 timeouts=0 errors=0) total(tps=12103 timeouts=0 errors=0)
2019-03-06 17:45:03.792 write(tps=6553 timeouts=0 errors=0) read(tps=6643 timeouts=0 errors=0) total(tps=13196 timeouts=0 errors=0)
2019-03-06 17:45:04.795 write(tps=6741 timeouts=0 errors=0) read(tps=6521 timeouts=0 errors=0) total(tps=13262 timeouts=0 errors=0)
2019-03-06 17:45:05.798 write(tps=6382 timeouts=0 errors=0) read(tps=6420 timeouts=0 errors=0) total(tps=12802 timeouts=0 errors=0)
2019-03-06 17:45:06.799 write(tps=4873 timeouts=0 errors=0) read(tps=4824 timeouts=0 errors=0) total(tps=9697 timeouts=0 errors=0)
2019-03-06 17:45:07.801 write(tps=3442 timeouts=0 errors=0) read(tps=3621 timeouts=0 errors=0) total(tps=7063 timeouts=0 errors=0)
2019-03-06 17:45:08.806 write(tps=2160 timeouts=0 errors=0) read(tps=2185 timeouts=0 errors=0) total(tps=4345 timeouts=0 errors=0)
2019-03-06 17:45:09.775 write(tps=1960 timeouts=0 errors=0) read(tps=1942 timeouts=0 errors=0) total(tps=3902 timeouts=0 errors=0)
2019-03-06 17:45:10.714 WARN Thread tend Node BB9030011AC4202 172.17.0.3 3000 refresh failed: Error -1: java.net.SocketTimeoutException: Read timed out
2019-03-06 17:45:10.776 write(tps=1644 timeouts=0 errors=0) read(tps=1692 timeouts=0 errors=0) total(tps=3336 timeouts=0 errors=0)
2019-03-06 17:45:11.777 write(tps=435 timeouts=0 errors=0) read(tps=417 timeouts=0 errors=0) total(tps=852 timeouts=0 errors=0)
2019-03-06 17:45:11.839 WARN Thread tend Node BB9040011AC4202 172.17.0.4 3000 refresh failed: Error -1: java.net.SocketTimeoutException: Read timed out
2019-03-06 17:45:12.779 write(tps=209 timeouts=0 errors=0) read(tps=211 timeouts=0 errors=0) total(tps=420 timeouts=0 errors=0)
2019-03-06 17:45:12.794 WARN Thread tend Node BB9050011AC4202 172.17.0.5 3000 refresh failed: com.aerospike.client.AerospikeException: Error -1: java.net.SocketException: Connection reset
	at com.aerospike.client.Info.sendCommand(Info.java:580)
	at com.aerospike.client.Info.<init>(Info.java:123)
	at com.aerospike.client.Info.request(Info.java:520)
	at com.aerospike.client.cluster.Node.refresh(Node.java:184)
	at com.aerospike.client.cluster.Cluster.tend(Cluster.java:444)
	at com.aerospike.client.cluster.Cluster.run(Cluster.java:406)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at com.aerospike.client.cluster.Connection.readFully(Connection.java:248)
	at com.aerospike.client.Info.sendCommand(Info.java:571)
	... 6 more

2019-03-06 17:45:13.048 WARN Thread tend Node BB9060011AC4202 172.17.0.6 3000 refresh failed: com.aerospike.client.AerospikeException: Error -1: java.net.SocketException: Connection reset
	at com.aerospike.client.Info.sendCommand(Info.java:580)
	at com.aerospike.client.Info.<init>(Info.java:85)
	at com.aerospike.client.cluster.PeerParser.<init>(PeerParser.java:43)
	at com.aerospike.client.cluster.Node.refreshPeers(Node.java:390)
	at com.aerospike.client.cluster.Cluster.tend(Cluster.java:453)
	at com.aerospike.client.cluster.Cluster.run(Cluster.java:406)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at com.aerospike.client.cluster.Connection.readFully(Connection.java:248)
	at com.aerospike.client.Info.sendCommand(Info.java:571)
	... 6 more

2019-03-06 17:45:13.659 WARN Thread tend Node BB9070011AC4202 172.17.0.7 3000 refresh failed: com.aerospike.client.AerospikeException: Error -1: java.net.SocketException: Connection reset
	at com.aerospike.client.Info.sendCommand(Info.java:580)
	at com.aerospike.client.Info.<init>(Info.java:85)
	at com.aerospike.client.cluster.PeerParser.<init>(PeerParser.java:43)
	at com.aerospike.client.cluster.Node.refreshPeers(Node.java:390)
	at com.aerospike.client.cluster.Cluster.tend(Cluster.java:453)
	at com.aerospike.client.cluster.Cluster.run(Cluster.java:406)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at com.aerospike.client.cluster.Connection.readFully(Connection.java:248)
	at com.aerospike.client.Info.sendCommand(Info.java:571)
	... 6 more

2019-03-06 17:45:13.783 write(tps=2303 timeouts=0 errors=0) read(tps=2344 timeouts=0 errors=0) total(tps=4647 timeouts=0 errors=0)
2019-03-06 17:45:14.021 WARN Thread tend Node BB9080011AC4202 172.17.0.8 3000 refresh failed: com.aerospike.client.AerospikeException: Error -1: java.net.SocketException: Connection reset
	at com.aerospike.client.Info.sendCommand(Info.java:580)
	at com.aerospike.client.Info.<init>(Info.java:85)
	at com.aerospike.client.cluster.PeerParser.<init>(PeerParser.java:43)
	at com.aerospike.client.cluster.Node.refreshPeers(Node.java:390)
	at com.aerospike.client.cluster.Cluster.tend(Cluster.java:453)
	at com.aerospike.client.cluster.Cluster.run(Cluster.java:406)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at com.aerospike.client.cluster.Connection.readFully(Connection.java:248)
	at com.aerospike.client.Info.sendCommand(Info.java:571)
	... 6 more

2019-03-06 17:45:14.783 write(tps=5749 timeouts=0 errors=0) read(tps=5695 timeouts=0 errors=0) total(tps=11444 timeouts=0 errors=0)
2019-03-06 17:45:15.029 WARN Thread tend Node BB9030011AC4202 172.17.0.3 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.029 WARN Thread tend Node BB9060011AC4202 172.17.0.6 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.030 WARN Thread tend Node BB9070011AC4202 172.17.0.7 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.030 WARN Thread tend Node BB9040011AC4202 172.17.0.4 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.031 WARN Thread tend Node BB9080011AC4202 172.17.0.8 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.031 WARN Thread tend Node BB9050011AC4202 172.17.0.5 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.784 write(tps=6578 timeouts=0 errors=0) read(tps=6546 timeouts=0 errors=0) total(tps=13124 timeouts=0 errors=0)
2019-03-06 17:45:16.036 WARN Thread tend Node BB9030011AC4202 172.17.0.3 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:16.036 WARN Thread tend Node BB9060011AC4202 172.17.0.6 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:16.038 WARN Thread tend Node BB9070011AC4202 172.17.0.7 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:16.038 WARN Thread tend Node BB9040011AC4202 172.17.0.4 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:16.039 WARN Thread tend Node BB9080011AC4202 172.17.0.8 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:16.039 WARN Thread tend Node BB9050011AC4202 172.17.0.5 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9030011AC4202 172.17.0.3 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9060011AC4202 172.17.0.6 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9070011AC4202 172.17.0.7 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9040011AC4202 172.17.0.4 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9080011AC4202 172.17.0.8 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9050011AC4202 172.17.0.5 3000
2019-03-06 17:45:16.785 write(tps=6694 timeouts=0 errors=0) read(tps=6496 timeouts=0 errors=0) total(tps=13190 timeouts=0 errors=0)
2019-03-06 17:45:17.786 write(tps=6801 timeouts=0 errors=0) read(tps=6707 timeouts=0 errors=0) total(tps=13508 timeouts=0 errors=0)
2019-03-06 17:45:18.787 write(tps=6734 timeouts=0 errors=0) read(tps=6647 timeouts=0 errors=0) total(tps=13381 timeouts=0 errors=0)
2019-03-06 17:45:19.787 write(tps=6184 timeouts=0 errors=0) read(tps=6252 timeouts=0 errors=0) total(tps=12436 timeouts=0 errors=0)
2019-03-06 17:45:20.788 write(tps=5742 timeouts=0 errors=0) read(tps=5712 timeouts=0 errors=0) total(tps=11454 timeouts=0 errors=0)

At first glance the output above may cause concern, but it is all expected. When a client builds a partition map, it does so by tending each node in the cluster; the nodes report back which partitions they own. As shown above, a quiesced node retains the master role until it has migrated out all of its partitions. For this reason, even after a node has been quiesced the clients still tend it until it shuts down, which also leaves open the possibility of un-quiescing the node. The behaviour can be seen here:

2019-03-06 17:45:10.714 WARN Thread tend Node BB9030011AC4202 172.17.0.3 3000 refresh failed: Error -1: java.net.SocketTimeoutException: Read timed out
2019-03-06 17:45:10.776 write(tps=1644 timeouts=0 errors=0) read(tps=1692 timeouts=0 errors=0) total(tps=3336 timeouts=0 errors=0)
2019-03-06 17:45:11.777 write(tps=435 timeouts=0 errors=0) read(tps=417 timeouts=0 errors=0) total(tps=852 timeouts=0 errors=0)
2019-03-06 17:45:11.839 WARN Thread tend Node BB9040011AC4202 172.17.0.4 3000 refresh failed: Error -1: java.net.SocketTimeoutException: Read timed out

Even though the tend requests to the shut-down nodes are timing out, the throughput messages show that the workload continues without issue.

Next, the connections used for tending are reset:

2019-03-06 17:45:13.783 write(tps=2303 timeouts=0 errors=0) read(tps=2344 timeouts=0 errors=0) total(tps=4647 timeouts=0 errors=0)
2019-03-06 17:45:14.021 WARN Thread tend Node BB9080011AC4202 172.17.0.8 3000 refresh failed: com.aerospike.client.AerospikeException: Error -1: java.net.SocketException: Connection reset
	at com.aerospike.client.Info.sendCommand(Info.java:580)
	at com.aerospike.client.Info.<init>(Info.java:85)
	at com.aerospike.client.cluster.PeerParser.<init>(PeerParser.java:43)
	at com.aerospike.client.cluster.Node.refreshPeers(Node.java:390)
	at com.aerospike.client.cluster.Cluster.tend(Cluster.java:453)
	at com.aerospike.client.cluster.Cluster.run(Cluster.java:406)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at com.aerospike.client.cluster.Connection.readFully(Connection.java:248)
	at com.aerospike.client.Info.sendCommand(Info.java:571)
	... 6 more

Further connections are refused, with messages showing the workload continuing:

2019-03-06 17:45:14.783 write(tps=5749 timeouts=0 errors=0) read(tps=5695 timeouts=0 errors=0) total(tps=11444 timeouts=0 errors=0)
2019-03-06 17:45:15.029 WARN Thread tend Node BB9030011AC4202 172.17.0.3 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.029 WARN Thread tend Node BB9060011AC4202 172.17.0.6 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.030 WARN Thread tend Node BB9070011AC4202 172.17.0.7 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.030 WARN Thread tend Node BB9040011AC4202 172.17.0.4 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.031 WARN Thread tend Node BB9080011AC4202 172.17.0.8 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.031 WARN Thread tend Node BB9050011AC4202 172.17.0.5 3000 refresh failed: Error -8: java.net.ConnectException: Connection refused (Connection refused)
2019-03-06 17:45:15.784 write(tps=6578 timeouts=0 errors=0) read(tps=6546 timeouts=0 errors=0) total(tps=13124 timeouts=0 errors=0)

The client removes the nodes from the partition map:

2019-03-06 17:45:14.783 write(tps=5749 timeouts=0 errors=0) read(tps=5695 timeouts=0 errors=0) total(tps=11444 timeouts=0 errors=0)
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9030011AC4202 172.17.0.3 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9060011AC4202 172.17.0.6 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9070011AC4202 172.17.0.7 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9040011AC4202 172.17.0.4 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9080011AC4202 172.17.0.8 3000
2019-03-06 17:45:16.046 INFO Thread tend Remove node BB9050011AC4202 172.17.0.5 3000
2019-03-06 17:45:16.785 write(tps=6694 timeouts=0 errors=0) read(tps=6496 timeouts=0 errors=0) total(tps=13190 timeouts=0 errors=0)
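The sequence above (timeouts, resets, refused connections, then removal) can be summarised in a simplified sketch. This is a hypothetical illustration of the tend behaviour, not the actual Java client implementation: a node is only dropped from the partition map after repeated refresh failures, so transient errors never disturb the workload. The node names and failure threshold below are illustrative.

```python
# Hypothetical sketch of client tending; the real client uses its own
# refresh logic and thresholds.
FAILURES_BEFORE_REMOVAL = 2  # illustrative threshold

class Node:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.failures = 0

def tend(cluster):
    """One tend pass: refresh every node, remove those that keep failing."""
    for node in list(cluster):
        if node.reachable:
            node.failures = 0          # successful refresh
            continue
        node.failures += 1             # timeout / reset / refused
        if node.failures >= FAILURES_BEFORE_REMOVAL:
            cluster.remove(node)       # the "Remove node ..." lines in the log

# One node has shut down after quiescing; the other two keep serving.
cluster = [Node("BB903", reachable=False), Node("BB90A"), Node("BB90B")]
tend(cluster)
tend(cluster)
print([n.name for n in cluster])  # the shut-down node is gone
```

Because removal only happens after consecutive failed refreshes, a node that is merely slow to respond stays in the map and the workload keeps flowing to the surviving nodes in the meantime.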

Observe the final cluster state

The cluster is now shown as a stable 3 node cluster with a significantly lower disk usage percentage as a result of the vertical expansion. Throughout the operation the workload ran steadily without disruption, and the replication factor was maintained at all times.

Namespaces
==========

   test
   ====
   1.   Devices            :  Total 0, per-node 0
   2.   Memory             :  Total 12.000 GB, 0.00% used (0.000 B), 100.00% available (12.000 GB)
   3.   Replication Factor :  2
   4.   Rack-aware         :  False
   5.   Master Objects     :  0.000  
   6.   Usage (Unique Data):  0.000 B  in-memory, 0.000 B  on-disk

   bar
   ===
   1.   Devices            :  Total 3, per-node 1
   2.   Memory             :  Total 12.000 GB, 22.33% used (2.680 GB), 77.67% available (9.320 GB)
   3.   Disk               :  Total 9.000 GB, 24.97% used (2.247 GB), 68.00% available contiguous space (6.120 GB)
   4.   Replication Factor :  2
   5.   Rack-aware         :  False
   6.   Master Objects     :  15.100 M
   7.   Usage (Unique Data):  0.000 B  in-memory, 547.218 MB on-disk

Admin> info
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2019-03-08 16:15:03 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             Node               Node                 Ip       Build   Cluster   Migrations        Cluster     Cluster         Principal   Client     Uptime   
                .                 Id                  .           .      Size            .            Key   Integrity                 .    Conns          .   
172.17.0.10:3000    BB90A0011AC4202    172.17.0.10:3000   E-4.5.1.5         3      0.000     E9316F7FBC5B   True        BB90C0011AC4202        2   24:51:08   
172.17.0.12:3000    *BB90C0011AC4202   172.17.0.12:3000   E-4.5.1.5         3      0.000     E9316F7FBC5B   True        BB90C0011AC4202        3   24:51:08   
2de7168d260b:3000   BB90B0011AC4202    172.17.0.11:3000   E-4.5.1.5         3      0.000     E9316F7FBC5B   True        BB90C0011AC4202        2   24:51:08   
Number of rows: 3

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Usage Information (2019-03-08 16:15:03 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total   Expirations,Evictions     Stop         Disk    Disk     HWM   Avail%          Mem     Mem    HWM      Stop   
        .                   .    Records                       .   Writes         Used   Used%   Disk%        .         Used   Used%   Mem%   Writes%   
bar         172.17.0.10:3000     9.974 M   (0.000,  0.000)         false    759.900 MB   25      50      68       895.519 MB   22      60     90        
bar         172.17.0.12:3000    10.287 M   (0.000,  0.000)         false    783.761 MB   26      50      67       923.640 MB   23      60     90        
bar         2de7168d260b:3000    9.940 M   (0.000,  0.000)         false    757.334 MB   25      50      69       892.497 MB   22      60     90        
bar                             30.200 M   (0.000,  0.000)                    2.247 GB                              2.648 GB                            
test        172.17.0.10:3000     0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        172.17.0.12:3000     0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test        2de7168d260b:3000    0.000     (0.000,  0.000)         false           N/E   N/E     50      N/E        0.000 B    0       60     90        
test                             0.000     (0.000,  0.000)                    0.000 B                               0.000 B                             
Number of rows: 8

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-03-08 16:15:03 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                Node      Total     Repl                         Objects                   Tombstones             Pending   Rack   
        .                   .    Records   Factor      (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)            Migrates     ID   
        .                   .          .        .                               .                            .             (tx,rx)      .   
bar         172.17.0.10:3000     9.974 M   2        (4.991 M, 4.982 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         172.17.0.12:3000    10.287 M   2        (5.111 M, 5.176 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar         2de7168d260b:3000    9.940 M   2        (4.998 M, 4.941 M, 0.000)       (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
bar                             30.200 M            (15.100 M, 15.100 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)            
test        172.17.0.10:3000     0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        172.17.0.12:3000     0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test        2de7168d260b:3000    0.000     2        (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)     0      
test                             0.000              (0.000,  0.000,  0.000)         (0.000,  0.000,  0.000)      (0.000,  0.000)            
Number of rows: 8

Admin> 
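The drop in disk usage percentage is simple arithmetic: the unique data is unchanged, only the total capacity grew. Taking the figures from the summaries above (6 nodes with 1 GB devices before, 3 nodes with 3 GB devices after, 2.247 GB used throughout):

```python
# Sanity-check on the disk numbers reported above.
used_gb = 2.247            # same data before and after the swap
before_total_gb = 6 * 1.0  # six nodes, one 1 GB device each
after_total_gb = 3 * 3.0   # three nodes, one 3 GB device each

before_pct = 100 * used_gb / before_total_gb
after_pct = 100 * used_gb / after_total_gb
print(f"{before_pct:.2f}% -> {after_pct:.2f}%")  # 37.45% -> 24.97%
```

These match the 37.45% reported in the initial summary and the 24.97% in the final one.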

Notes

  • The quiesce command is not available in versions prior to Aerospike 4.3.1.3. In that case, one possible approach to expanding a cluster vertically is to use rack awareness: place all of the new nodes in a single rack, so that this rack holds a complete copy of every partition. The smaller nodes can then be shut down, after which the larger nodes are given different rack-id values to trigger migrations and restore the replication factor. Note that for the duration of the exercise the effective replication factor is reduced to 1.
  • This method can also be used to make application-transparent changes such as switching instance types or re-stacking cloud-based Aerospike cluster nodes.
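The rack-aware alternative above relies on the fact that, with replication factor 2 and two racks, rack-aware placement puts one copy of each partition in each rack. The following sketch illustrates that property only; it is not Aerospike's actual partition distribution algorithm, and the node names, rack layout, and round-robin placement are all hypothetical.

```python
# Illustrative only: with two racks and replication factor 2, rack-aware
# placement gives each rack one full copy of every partition.
NUM_PARTITIONS = 4096
old_rack = ["small1", "small2", "small3"]
new_rack = ["large1", "large2", "large3"]

def place(pid):
    # Pick one replica from each rack (round-robin here; the real server
    # derives placement from a hashed node succession list).
    return [old_rack[pid % len(old_rack)], new_rack[pid % len(new_rack)]]

copies_in_new_rack = sum(
    1 for pid in range(NUM_PARTITIONS)
    if any(node in new_rack for node in place(pid))
)
print(copies_in_new_rack)  # every partition has a copy in the new rack
```

Since the new rack holds a full copy of the data, the old nodes can be shut down together, at the cost of running with a single copy until migrations restore the replication factor.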

Keywords

EXPAND VERTICAL NODES QUIESCE CHANGE INSTANCE RACK_ID RACK AWARE INSTANCE TYPE AEROSPIKE

Timestamp

3/8/19