Cluster imbalance when adding a node


#1

I have a three-node cluster running Aerospike 3.6.0. Last week I was upgrading the memory on each node, using the following process:

  1. Stop aerospike on one node
  2. Speed up migrations on the other nodes
  3. As soon as migrations have finished, start the aerospike service again
  4. Slow migrations back down to the default settings
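For reference, steps 2 and 4 look roughly like this (assuming the 3.x-era `migrate-threads` service parameter; the value 4 is just an example, and exact parameter names may differ between versions):

```shell
# Step 2: speed migrations up on the remaining nodes.
asinfo -v "set-config:context=service;migrate-threads=4"

# ...wait for migrations to finish, then restart the stopped node (step 3)...

# Step 4: slow migrations back down to the default.
asinfo -v "set-config:context=service;migrate-threads=1"
```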

Each time the service started, the cluster had 1-2 minutes of downtime. This repeated for every node I upgraded.

Here is the namespace info:

Before the node under maintenance starts

[ 2016-07-20 09:56:57 'info namespace' sleep: 5.0s iteration: 2928 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          Node   Namespace   Avail%   Evictions      Master     Replica     Repl     Stop         Disk    Disk     HWM         Mem     Mem    HWM      Stop
             .           .        .           .     Objects     Objects   Factor   Writes         Used   Used%   Disk%        Used   Used%   Mem%   Writes%
<node ip addr>   N/E            N/E         N/E         N/E         N/E      N/E   N/E             N/E     N/E     N/E         N/E     N/E    N/E       N/E
v-5              ssd0            24   146319233   682.371 M   726.184 M        2   false    244.619 GB      66      50   83.956 GB      58     60        90
v-6              ssd0            26    56304299   726.151 M   682.376 M        2   false    244.615 GB      66      50   83.955 GB      54     60        90

when the node has started

[ 2016-07-20 09:57:02 'info namespace' sleep: 5.0s iteration: 2929 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          Node   Namespace   Avail%   Evictions      Master     Replica     Repl     Stop         Disk    Disk     HWM         Mem     Mem    HWM      Stop
             .           .        .           .     Objects     Objects   Factor   Writes         Used   Used%   Disk%        Used   Used%   Mem%   Writes%
<node ip addr>   ssd0            29           0   430.650 M     0.000          2   false    118.855 GB      32      50   40.712 GB      27     60        90
v-5              ssd0            24   146319233   682.360 M   726.184 M        2   false    244.617 GB      66      50   83.956 GB      58     60        90
v-6              ssd0            26    56304299   726.151 M   682.364 M        2   false    244.613 GB      66      50   83.954 GB      54     60        90

when the downtime has started

[ 2016-07-20 09:57:08 'info namespace' sleep: 5.0s iteration: 2930 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Node   Namespace   Avail%   Evictions      Master   Replica     Repl     Stop         Disk    Disk     HWM         Mem     Mem    HWM      Stop
       .           .        .           .     Objects   Objects   Factor   Writes         Used   Used%   Disk%        Used   Used%   Mem%   Writes%
v-5        ssd0           N/E         N/E         N/E       N/E      N/E   N/E             N/E     N/E     N/E         N/E     N/E    N/E       N/E
v-6-07-2   ssd0           N/E         N/E         N/E       N/E      N/E   N/E             N/E     N/E     N/E         N/E     N/E    N/E       N/E
v-6-07-4   ssd0            29           0   430.650 M   0.000          2   false    118.855 GB      32      50   40.712 GB      27     60        90

during downtime

[ 2016-07-20 09:57:16 'info namespace' sleep: 5.0s iteration: 2931 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Node   Namespace   Avail%   Evictions      Master   Replica     Repl     Stop         Disk    Disk     HWM         Mem     Mem    HWM      Stop
       .           .        .           .     Objects   Objects   Factor   Writes         Used   Used%   Disk%        Used   Used%   Mem%   Writes%
v-5        N/E            N/E         N/E         N/E       N/E      N/E   N/E             N/E     N/E     N/E         N/E     N/E    N/E       N/E
v-6-07-2   N/E            N/E         N/E         N/E       N/E      N/E   N/E             N/E     N/E     N/E         N/E     N/E    N/E       N/E
v-6-07-4   ssd0            29           0   430.650 M   0.000          2   false    118.855 GB      32      50   40.712 GB      27     60        90

when the cluster has been restored

[ 2016-07-20 09:59:39 'info namespace' sleep: 5.0s iteration: 2949 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Node   Namespace   Avail%   Evictions      Master     Replica     Repl     Stop         Disk    Disk     HWM         Mem     Mem    HWM      Stop
       .           .        .           .     Objects     Objects   Factor   Writes         Used   Used%   Disk%        Used   Used%   Mem%   Writes%
v-5        ssd0            28   146319233   463.015 M   245.189 M        2   false    161.356 GB      44      50   55.234 GB      38     60        90
v-6-07-2   ssd0            27    56304299   492.549 M   222.326 M        2   false    164.149 GB      45      50   56.534 GB      37     60        90
v-6-07-4   ssd0            31           0   430.658 M     0.000          2   false    118.857 GB      32      50   40.713 GB      27     60        90

Here is the service info:

Before:

[ 2016-07-20 09:56:59 'info service' sleep: 5.0s iteration: 2897 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          Node   Build   Cluster      Cluster     Cluster    Free   Free   Migrates   Principal   Objects      Uptime
             .       .      Size   Visibility   Integrity   Disk%   Mem%          .           .         .           .
<node ip addr>   N/E         N/E   N/E          N/E           N/E    N/E   N/E        N/E             N/E   N/E
v-5              3.6.0         2   True         True           34     42   (0,0)      v-5         1.432 G   145:58:53
v-6              3.6.0         2   True         True           34     45   (0,0)      v-5         1.432 G   49:04:24
Number of rows: 3

when the node has started

[ 2016-07-20 09:57:04 'info service' sleep: 5.0s iteration: 2898 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          Node   Build   Cluster      Cluster     Cluster    Free   Free   Migrates         Principal     Objects      Uptime
             .       .      Size   Visibility   Integrity   Disk%   Mem%          .                 .           .           .
<node ip addr>   3.6.0         3   False        True           67     73   (2798,0)   BB9C5910842A844   701.995 M   01:12:39
v-5              3.6.0         3   True         True           35     42   (0,0)      BB9C5910842A844     1.424 G   145:58:58
v-6              3.6.0         3   True         True           35     46   (0,0)      BB9C5910842A844     1.425 G   49:04:29
Number of rows: 3

during downtime

[ 2016-07-20 09:57:15 'info service' sleep: 5.0s iteration: 2899 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Node   Build   Cluster      Cluster     Cluster    Free   Free   Migrates   Principal     Objects     Uptime
       .       .      Size   Visibility   Integrity   Disk%   Mem%          .           .           .          .
v-5        N/E         N/E   N/E          N/E           N/E    N/E   N/E        N/E               N/E   N/E
v-6-07-2   N/E         N/E   N/E          N/E           N/E    N/E   N/E        N/E               N/E   N/E
v-6-07-4   3.6.0         3   True         True           67     73   (2808,2)   v-6-07-4    701.995 M   01:12:45
Number of rows: 3

after restore

[ 2016-07-20 09:59:49 'info service' sleep: 5.0s iteration: 2912 ]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Service Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Node   Build   Cluster      Cluster     Cluster    Free   Free   Migrates   Principal     Objects      Uptime
       .       .      Size   Visibility   Integrity   Disk%   Mem%          .           .           .           .
v-5        3.6.0         3   False        True           57     61   (1208,1)   v-6-07-4    942.102 M   146:01:43
v-6-07-2   3.6.0         3   False        True           56     63   (1160,0)   v-6-07-4    964.572 M   49:07:15
v-6-07-4   3.6.0         3   True         True           67     73   (3123,2)   v-6-07-4    702.005 M   01:15:24
Number of rows: 3


My questions are:

  1. What’s going wrong with my cluster?
  2. What should I do to improve cluster availability when adding nodes?

#2

Object counts during migrations are a common source of confusion. Basically, the counts are underestimates while migrations are in progress: partitions that haven't reached their final state are not counted.
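One way to make "migrations finished" concrete, rather than eyeballing object counts, is to poll the node statistics and treat the cluster as settled only when the pending-migration counters reach zero. A minimal sketch (the `migrate_progress_send` / `migrate_progress_recv` statistic names are the 3.x-era ones, and `migrations_done` is a hypothetical helper, not part of any Aerospike tool):

```shell
# Returns success (0) when no migrations are pending, given the
# semicolon-separated output of `asinfo -v statistics` as $1.
migrations_done() {
  local send recv
  send=$(echo "$1" | tr ';' '\n' | awk -F= '/^migrate_progress_send=/ {print $2}')
  recv=$(echo "$1" | tr ';' '\n' | awk -F= '/^migrate_progress_recv=/ {print $2}')
  [ "${send:-0}" -eq 0 ] && [ "${recv:-0}" -eq 0 ]
}

# Usage against a live node:
# while ! migrations_done "$(asinfo -v statistics)"; do sleep 5; done
```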

I would also recommend upgrading to the latest version; several issues related to rebalancing have been resolved since 3.6.0.