Aerospike routing traffic

Hello. I have a 2-node cluster, and I need to send requests to only one of them. I set a single IP in the client configuration, but I can see that my client also connects to the second node. Is it possible that the client somehow gets the node list from the server, or that the server routes traffic to the other node? Is it possible to make the client send requests only to the first node?

The rack-aware feature of Aerospike Enterprise allows you to partition your cluster into multiple racks. With rack-aware enabled, you can then set the prefer-rack policy in the client to prefer a particular rack. This is often used in cloud environments to drastically reduce cross-availability-zone latency and data-transfer costs.
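As a rough sketch, the client-side half looks like the config below. The key names follow the shape of the Aerospike Python client's connection config; the exact names and the replica-policy constant (shown symbolically here, in the real client it is `aerospike.POLICY_REPLICA_PREFER_RACK`) vary by client language and version, so treat this as illustrative only:

```python
# Hypothetical host/rack values; "rack_id" tells the client which rack it
# "lives" in, "rack_aware" enables the feature, and the read policy asks
# for replicas in the same rack when possible.
config = {
    "hosts": [("10.0.0.1", 3000)],  # placeholder seed node
    "rack_id": 1,
    "rack_aware": True,
    "policies": {
        "read": {
            # stand-in for the client's prefer-rack replica constant
            "replica": "PREFER_RACK",
        }
    },
}
```

Note that this only biases *reads* toward the local rack; writes still go to the master for each partition.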

Thank you for your answer. So do I understand correctly that the Aerospike client gets the node list from the server, and that even if I configure the client with only one server, the client will send data to all of the servers? And there is no way to use only one server in a cluster with the Community Edition?

Yes, that is correct. What’s the reason you want to do this?

Because of replication lag. We have two applications: the first writes to Aerospike, the second reads from Aerospike. Quite often, when the second one reads from the other Aerospike node, it can't find the key because of replication lag. The delay between the write and the read can be 0.1-0.5 sec.

That shouldn’t be an issue by default. The default read policy reads from the master replica for a given partition; only if the master is unreachable will it try a replica. Writes always go to the master replica for a given partition.

There is something else going on in your environment. Could you grep the logs for ‘rebalanced’?

Also, what version are the Aerospike servers, and what language and client version are you using?

Hm. It’s strange. Thank you for your help.

root@aerospike1 ~ # grep -ic rebalanced /var/log/aerospike/aerospike.log
root@aerospike1 ~ # head -1 /var/log/aerospike/aerospike.log
Apr 30 2019 04:27:44 GMT: INFO (info): (ticker.c:171) NODE-ID bb9d4512311b36c CLUSTER-SIZE 2

The Aerospike server version is . I’m using the PHP and Go client libraries.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-07-23 18:03:07 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace                              Node       Total     Repl                           Objects                   Tombstones             Pending   Rack
        .                                 .     Records   Factor        (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)            Migrates     ID
        .                                 .           .        .                                 .                            .             (tx,rx)      .
ns1   675.382 M   2        (35.237 M, 640.145 M, 0.000)      (0.000,  0.000,  0.000)      (0.000,  0.000)     0
ns1   675.402 M   2        (640.144 M, 35.258 M, 0.000)      (0.000,  0.000,  0.000)      (0.000,  0.000)     0
ns1                                          1.351 B            (675.381 M, 675.403 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)


# asadm -e "show pmap"
Seed:        [('', 3000, None)]
Config_file: /root/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Partition Map Analysis (2019-07-23 22:17:49 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Cluster   Namespace                              Node      Primary    Secondary         Dead   Unavailable
     Key           .                                 .   Partitions   Partitions   Partitions    Partitions
930430C64D1D   ns1         2078         2018            0             0
930430C64D1D   ns1         2018         2078            0             0

Maybe that means aerospike2 is now the master? Is it possible to manually switch the “master” for partitions?

# asadm -e "asinfo -v 'replicas-all'"
Seed:        [('', 3000, None)]
Config_file: /root/.aerospike/astools.conf, /etc/aerospike/astools.conf ( returned:
ns1:2,7QBQtEaaS7jdftJCZ9bfP19mI4s90p5618HZe0ypMy4ndrY52zx5of0J98fl5y+aDNCTHp1liO4GC5SlcjCiFwN4c3HE22H9okAbW4+cp6ZglCjKeQC/DMer/q/MreEvmTgZAhQxGJboN1Xk3vJO8xvFsp93fhimTc88guKEc7cYB1V6RZ8zWCcCrdQd2PtWkkEgA7p/nRe/+TrvD49PQ2jsMsg4aapuJ8PLRs+d/JSCFxPilH2E790rZtd4Uniuyh3nfFgkesJqzWbvBxDn+05l7Skbm97zqf2WMt2Erycj71UDG7uW8BTcoH4moGa5SSfPGNwZXhehofPBCeK64m4SWGkJFW3NgzzHv+L27v+UG6qxcQAAZ5wkACsP26TPHgn1VpBBJKJ+Skp2IT6T8lhMwpyREDeiPVfL1/wjB0XcT9R9RXDiMyaXIu+ID/ue5s62z1XWYe/nE7Og/aPY+8t1Om7HkoWDRAfzJgnC0DVUae5RBk/XKpydW8sBkB7/qQvRNRpdK+9s9EIThQLs/+rvhYSaC6N6sPb+ZLktG2n6ImKwrsR4WEH1pxPJrcRbNXARUTgjeWD5YxRV9qUO6xF0gI6QuawApmddiGKtsAYsudq7xwve2g9uGpm3cNTA1YO9ODjJ/MXQrnDdLdPVlcHjeskNwKAMQEXGAZuLYx4=,Ev+vS7lltEcigS29mCkgwKCZ3HTCLWGFKD4mhLNWzNHYiUnGJMOGXgL2CDgaGNBl8y9s4WKadxH59Gtajc9d6PyHjI47JJ4CXb/kpHBjWFmfa9c1hv9A8zhUAVAzUh7QZsfm/evO52kXyKobIQ2xDOQ6TWCIgedZsjDDfR17jEjn+KqFumDMp9j9UiviJwSpbb7f/EWAYuhABsUQ8HCwvJcTzTfHllWR2Dw0uTBiA2t96Owda4J7ECLUmSiHrYdRNeIYg6fbhT2VMpkQ+O8YBLGaEtbkZCEMVgJpzSJ7UNjcEKr85ERpD+sjX4HZX5lGttgw5yPmoeheXgw+9h1FHZHtp5b26pIyfMM4QB0JEQBr5FVOjv//mGPb/9TwJFsw4fYKqW++212BtbWJ3sFsDaezPWNu78hdwqg0KAPc+LojsCuCuo8dzNlo3RB38ARhGTFJMKopnhAY7ExfAlwnBDSKxZE4bXp8u/gM2fY9L8qrlhGu+bAo1WNipDT+b+EAVvQuyuWi1BCTC73sev0TABUQentl9FyFTwkBm0bS5JYF3Z1PUTuHp74KWOw2Ujukyo/ursfchp8GnOuqCVrxFO6Lf3FvRlP/WZiid51ST/nTRiVEOPQhJfCR5WZIjys/KnxCx8c2AzovUY8i0iwqaj4chTbyP1/zv7o5/mR0nOE=;ns2:1,//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////8= (yy.yy.yy.yy) returned:

Masters are selected for each partition using a deterministic algorithm based on the current cluster membership.
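The idea can be sketched in a few lines. Aerospike hashes each key with RIPEMD-160 and maps the digest into a fixed space of 4096 partitions; every node then derives the same master for each partition from the membership list alone, with no coordination. The sketch below uses sha256 as a stand-in hash and a hypothetical ranking rule, so it illustrates the determinism, not Aerospike's actual algorithm:

```python
import hashlib

N_PARTITIONS = 4096  # Aerospike uses a fixed 4096-partition space

def partition_id(key: bytes) -> int:
    # Aerospike derives this from a RIPEMD-160 digest of the key;
    # sha256 here is purely an illustrative stand-in.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS

def master_for(pid: int, nodes: list[str]) -> str:
    # Illustrative deterministic choice: every node computes the same
    # ranking from (partition-id, node-id), so all nodes agree on the
    # master without talking to each other.
    return max(nodes, key=lambda n: hashlib.sha256(f"{pid}:{n}".encode()).digest())

nodes = ["BB9D4512311B36C", "BB9BC512311B36C"]  # node IDs from the thread
pid = partition_id(b"user:42")
print(pid, master_for(pid, nodes))
```

Because the mapping is a pure function of the key and the membership, it only changes when a node joins or leaves, which is why a stable 2-node cluster keeps the same master split.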

You can see from the output you provided that one node has 2078 master (aka primary) partitions and 2018 replica (aka secondary) partitions.

The cluster seems stable, since there hasn’t been a cluster disruption since the beginning of the logs you grepped. Are you sure the reads for these records arrive after a successful write? Is it possible that a read arrives before the data was ever written, or while the transaction is still in progress?

What I find odd is the huge disparity in the distribution of total records between masters and replicas, e.g. node 1 has ~35 M masters and ~640 M replicas. I would check the config file on each node (/etc/aerospike/aerospike.conf): are the storage-engine size allocations the same on both nodes? Also compare the high-water marks, and grep for evictions in the logs.

Ah, good point @pgupta. The older eviction algorithm could cause these imbalances in 2-node clusters.

@Ivan44785372, to confirm, could you provide the output for:

asadm -e "info"

We would expect to see the vast majority of evictions taking place on the node with fewer master objects (aerospike1).

If this is the case you have two options:

  1. Adding a third node will allow the eviction algorithm to normalize.
  2. Upgrade to Aerospike or later, which addresses this issue with:

    [AER-6000] - (KVS) Redesigned namespace supervisor (nsup), featuring expiration and eviction without transactions, and per-namespace control.

    # asadm -e info
    Seed:        [('', 3000, None)]
    Config_file: /root/.aerospike/astools.conf, /etc/aerospike/astools.conf
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2019-07-24 05:00:29 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                               Node               Node                   Ip       Build   Cluster   Migrations        Cluster     Cluster         Principal   Client       Uptime
                                  .                 Id                    .           .      Size            .            Key   Integrity                 .    Conns            .
       *BB9D4512311B36C   xx.xx.xx.xx:3000   C-   2   0.000   930430C64D1D   True   BB9D4512311B36C   5793   3335:45:16
        BB9BC512311B36C   yy.yy.yy.yy:3000   C-   2   0.000   930430C64D1D   True   BB9D4512311B36C   2717   2557:45:49
    Number of rows: 2

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Usage Information (2019-07-24 05:00:29 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Namespace                              Node       Total    Expirations,Evictions     Stop       Disk    Disk     HWM   Avail%          Mem     Mem    HWM      Stop
            .                                 .     Records                        .   Writes       Used   Used%   Disk%        .         Used   Used%   Mem%   Writes%
    ns1   654.490 M   (5.265 B, 15.957 M)      false    1.173 TB   30      80      55       108.174 GB   57      94     96
    ns1   654.510 M   (356.946 M, 0.000)       false    1.173 TB   21      80      69       107.878 GB   34      90     90
    ns1                                        1.309 B   (5.622 B, 15.957 M)               2.345 TB                            216.052 GB
    ns2     6.607 M   (146.546 M, 86.522 M)    false         N/E   N/E     50      N/E        2.243 GB   38      96     98
    ns2                                         6.607 M   (146.546 M, 86.522 M)             0.000 B                               2.243 GB
    ns3     0.000     (0.000,  0.000)          false         N/E   N/E     50      N/E        0.000 B    0       60     90
    ns3                                        0.000     (0.000,  0.000)                   0.000 B                               0.000 B
    Number of rows: 7

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Object Information (2019-07-24 05:00:29 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Namespace                              Node       Total     Repl                           Objects                   Tombstones             Pending   Rack
            .                                 .     Records   Factor        (Master,Prole,Non-Replica)   (Master,Prole,Non-Replica)            Migrates     ID
            .                                 .           .        .                                 .                            .             (tx,rx)      .
    ns1   654.490 M   2        (35.136 M, 619.354 M, 0.000)      (0.000,  0.000,  0.000)      (0.000,  0.000)     0
    ns1   654.510 M   2        (619.353 M, 35.157 M, 0.000)      (0.000,  0.000,  0.000)      (0.000,  0.000)     0
    ns1                                        1.309 B            (654.489 M, 654.511 M, 0.000)     (0.000,  0.000,  0.000)      (0.000,  0.000)
    ns2     6.608 M   1        (6.608 M, 0.000,  0.000)          (0.000,  0.000,  0.000)      (0.000,  0.000)     0
    ns2                                         6.608 M            (6.608 M, 0.000,  0.000)          (0.000,  0.000,  0.000)      (0.000,  0.000)
    ns3     0.000     1        (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)     0
    ns3                                        0.000              (0.000,  0.000,  0.000)           (0.000,  0.000,  0.000)      (0.000,  0.000)
    Number of rows: 7

The configuration is a little bit different between the nodes: aerospike1 has less disk space and a smaller memory-size than aerospike2, but as far as I can see, aerospike1 doesn’t have the stop-writes flag set, and it has enough disk and memory space now.

This will make this type of issue worse. You should have homogeneous hardware within your cluster if possible. Alternatively, you should configure all nodes to logically have the same amounts of memory and disk resources.
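Concretely, that means the namespace stanza in /etc/aerospike/aerospike.conf should carry identical sizes and thresholds on every node. The values below are placeholders, not a recommendation; the point is only that they must match across aerospike1 and aerospike2:

```
# Illustrative namespace stanza (placeholder sizes) - keep these
# identical on every node in the cluster.
namespace ns1 {
    replication-factor 2
    memory-size 128G              # same on aerospike1 and aerospike2
    high-water-memory-pct 60     # eviction threshold - must match
    high-water-disk-pct 50
    storage-engine device {
        file /opt/aerospike/ns1.dat
        filesize 1500G            # same allocation on every node
    }
}
```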

These are the expirations and evictions from aerospike1; notice that this node has evicted 15.957 million records.

While aerospike2 has evicted 0 records. This is causing the master-record imbalance between these nodes: because aerospike1 is full of replica objects, it has no room for new master objects. What you are seeing from the client is a write that succeeds on aerospike1 and is immediately evicted, so subsequent reads cannot find the record since it has been removed.
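This failure mode can be reproduced with a toy model: a node at its capacity limit evicts the soonest-to-expire records to make room, so a freshly written record can vanish before the reader arrives. This is a deliberately simplified sketch (a dict with a size cap), not Aerospike's actual nsup logic:

```python
# Toy node: evicts the record with the lowest TTL whenever it is over capacity.
class TinyNode:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = {}  # key -> ttl (seconds)

    def write(self, key: str, ttl: int) -> None:
        self.store[key] = ttl
        while len(self.store) > self.capacity:
            victim = min(self.store, key=self.store.get)  # soonest to expire
            del self.store[victim]

    def read(self, key: str) -> bool:
        return key in self.store

node = TinyNode(capacity=2)
node.write("a", ttl=100)
node.write("b", ttl=100)
node.write("c", ttl=5)   # the write itself "succeeds"...
print(node.read("c"))    # ...but the record was evicted to make room: False
```

The client sees exactly what was described in the thread: no write error, yet the subsequent read misses.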

You need to run homogeneous hardware if possible, or make the nodes homogeneous through configuration, and upgrade to at least the version mentioned above to resolve this issue.

Another suggestion: if you download CE or a later version and register with the portal with your email, you will receive an email with a link to Aerospike Academy and login credentials. That gives you access to a free online intro course, a few hours long, which will give you the background needed to understand these configuration issues better.

Thank you very much. BTW, is it possible to have replication between and ?

Yes. Keep in mind the problem here is eviction, not replication.

Upgrade aerospike1 first, since it is the unhealthy node.