The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.
Node Not Found For Partition using AQL with Strong Consistency
Problem Description
When a namespace is configured for strong consistency, a test insert into that namespace fails, while an insert from the same client into an AP namespace succeeds.
Admin> show config like strong-consistency
~~~~~~~~~~~~~~~~~~~~~~~~~~~test Namespace Configuration (2020-03-31 17:41:24 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : 10342564f2bd:3000 172.17.0.4:3000 172.17.0.5:3000 172.17.0.6:3000
strong-consistency : false false false false
strong-consistency-allow-expunge: false false false false
~~~~~~~~~~~~~~~~~~~~~~~~~~~bar Namespace Configuration (2020-03-31 17:41:24 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~
NODE : 10342564f2bd:3000 172.17.0.4:3000 172.17.0.5:3000 172.17.0.6:3000
strong-consistency : true true true true
strong-consistency-allow-expunge: false false false false
Admin>
...
root@10342564f2bd:/# aql
Seed: 127.0.0.1
User: None
Config File: /etc/aerospike/astools.conf /root/.aerospike/astools.conf
Aerospike Query Client
Version 3.23.0
C Client Version 4.6.9
Copyright 2012-2019 Aerospike. All rights reserved.
aql> insert into test.testset (PK,value1) values(1,'value1')
OK, 1 record affected.
aql> insert into bar.testset (PK,value1) values(1,'value1')
Error: (-8) Node not found for partition bar:501
aql>
Explanation
This issue occurs when the roster has not been set for the strongly consistent namespace. If the roster is not set, the cluster cannot know which nodes to distribute the namespace data across, and therefore partitions cannot be assigned. For the example cluster above, the partition map shows that no partitions are mapped to nodes for the strongly consistent namespace (bar).
Admin> show pmap
~~~~~~~~~~~~~~~~~~~~~~~~~Partition Map Analysis (2020-03-31 17:43:04 UTC)~~~~~~~~~~~~~~~~~~~~~~~~
Cluster Namespace Node Primary Secondary Dead Unavailable
Key . . Partitions Partitions Partitions Partitions
179FDC193C39 bar 10342564f2bd:3000 0 0 0 0
179FDC193C39 bar 172.17.0.5:3000 0 0 0 0
179FDC193C39 bar 172.17.0.4:3000 0 0 0 0
179FDC193C39 bar 172.17.0.6:3000 0 0 0 0
179FDC193C39 test 10342564f2bd:3000 1024 1024 0 0
179FDC193C39 test 172.17.0.5:3000 1024 1024 0 0
179FDC193C39 test 172.17.0.4:3000 1024 1024 0 0
179FDC193C39 test 172.17.0.6:3000 1024 1024 0 0
Number of rows: 8
Admin>
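The check described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the `show pmap` rows have been captured as plain text with the seven-column layout shown in this article (asadm output may differ between versions), and the function name `namespaces_without_partitions` is hypothetical.

```python
def namespaces_without_partitions(pmap_rows):
    """Return namespaces whose nodes report zero primary partitions in total."""
    totals = {}
    for row in pmap_rows:
        cols = row.split()
        # Data rows have 7 columns with a numeric primary-partition count;
        # header and separator lines are skipped.
        if len(cols) != 7 or not cols[3].isdigit():
            continue
        namespace, primary = cols[1], int(cols[3])
        totals[namespace] = totals.get(namespace, 0) + primary
    return sorted(ns for ns, total in totals.items() if total == 0)


# Rows copied from the 'show pmap' output in this article.
rows = [
    "Cluster Namespace Node Primary Secondary Dead Unavailable",
    "179FDC193C39 bar 10342564f2bd:3000 0 0 0 0",
    "179FDC193C39 bar 172.17.0.5:3000 0 0 0 0",
    "179FDC193C39 test 10342564f2bd:3000 1024 1024 0 0",
    "179FDC193C39 test 172.17.0.5:3000 1024 1024 0 0",
]
print(namespaces_without_partitions(rows))
```

A namespace returned by this function has no partitions assigned anywhere in the cluster, which is the symptom seen for bar.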
The roster defines the cluster in its normal state in terms of node membership. Without this node list it is not possible to create a partition map because, unlike in AP mode, the cluster will not simply map partitions to all the nodes it can see. If the cluster were to do this, consistency in the face of a network partition could not be assured. For this reason, the roster is key. To confirm that the roster is the issue, check it using the roster asinfo command.
Admin> asinfo -v 'roster:namespace=bar'
10342564f2bd:3000 (172.17.0.3) returned:
roster=null:pending_roster=null:observed_nodes=BB9060011AC4202,BB9050011AC4202,BB9040011AC4202,BB9030011AC4202
172.17.0.5:3000 (172.17.0.5) returned:
roster=null:pending_roster=null:observed_nodes=BB9060011AC4202,BB9050011AC4202,BB9040011AC4202,BB9030011AC4202
172.17.0.4:3000 (172.17.0.4) returned:
roster=null:pending_roster=null:observed_nodes=BB9060011AC4202,BB9050011AC4202,BB9040011AC4202,BB9030011AC4202
172.17.0.6:3000 (172.17.0.6) returned:
roster=null:pending_roster=null:observed_nodes=BB9060011AC4202,BB9050011AC4202,BB9040011AC4202,BB9030011AC4202
Admin>
The output above confirms that while all cluster nodes are visible or observed, none are present in the roster.
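As a rough illustration, the same check can be automated by parsing the response string. The sketch below assumes the colon-separated `key=value` layout shown in the output above, with `null` marking an unset field; the helper names are hypothetical and are not part of any Aerospike tool or client.

```python
def parse_roster_response(response):
    """Parse a 'roster:' info response into a dict of node-ID lists."""
    fields = {}
    for part in response.strip().split(":"):
        key, _, value = part.partition("=")
        # An unset roster is reported as the literal string 'null'.
        fields[key] = [] if value == "null" else value.split(",")
    return fields


def roster_is_unset(response):
    """True when nodes are observed but none are present in the roster."""
    fields = parse_roster_response(response)
    return not fields.get("roster") and bool(fields.get("observed_nodes"))


# Response copied from the asinfo output in this article.
sample = ("roster=null:pending_roster=null:"
          "observed_nodes=BB9060011AC4202,BB9050011AC4202,"
          "BB9040011AC4202,BB9030011AC4202")
print(roster_is_unset(sample))
```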
Solution
To resolve this issue the roster should be set. This is done using the roster-set info command followed by the recluster info command. Only the principal node is expected to respond to the recluster command and other nodes will ignore it.
Admin> asinfo -v 'roster-set:namespace=bar;nodes=BB9060011AC4202,BB9050011AC4202,BB9040011AC4202,BB9030011AC4202'
10342564f2bd:3000 (172.17.0.3) returned:
ok
172.17.0.5:3000 (172.17.0.5) returned:
ok
172.17.0.4:3000 (172.17.0.4) returned:
ok
172.17.0.6:3000 (172.17.0.6) returned:
ok
Admin> asinfo -v 'recluster:namespace=bar'
10342564f2bd:3000 (172.17.0.3) returned:
ignored-by-non-principal
172.17.0.5:3000 (172.17.0.5) returned:
ignored-by-non-principal
172.17.0.4:3000 (172.17.0.4) returned:
ignored-by-non-principal
172.17.0.6:3000 (172.17.0.6) returned:
ok
Admin>
Partitions for namespace bar are now mapped across the 4 nodes in the roster as expected:
Admin> show pmap
~~~~~~~~~~~~~~~~~~~~~~~~~Partition Map Analysis (2020-03-31 17:52:36 UTC)~~~~~~~~~~~~~~~~~~~~~~~~
Cluster Namespace Node Primary Secondary Dead Unavailable
Key . . Partitions Partitions Partitions Partitions
B84A449E41CF bar 10342564f2bd:3000 1024 1024 0 0
B84A449E41CF bar 172.17.0.5:3000 1024 1024 0 0
B84A449E41CF bar 172.17.0.4:3000 1024 1024 0 0
B84A449E41CF bar 172.17.0.6:3000 1024 1024 0 0
The AQL command now completes properly.
aql> insert into bar.testset (PK,value1) values(1,'value1')
OK, 1 record affected.
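The roster-set command used in the solution above can be assembled from the observed_nodes field of the roster response. The following is a minimal sketch, assuming every observed node should join the roster (which is not mandatory) and that the response has the format shown in this article; `build_roster_set` is a hypothetical helper name.

```python
def build_roster_set(namespace, roster_response):
    """Build a roster-set info command from a 'roster:' info response."""
    fields = {}
    for part in roster_response.strip().split(":"):
        key, _, value = part.partition("=")
        fields[key] = value
    observed = fields.get("observed_nodes", "null")
    if not observed or observed == "null":
        raise ValueError("no observed nodes reported for namespace " + namespace)
    # Assumption: all observed nodes are added to the roster.
    return "roster-set:namespace={};nodes={}".format(namespace, observed)


# Response copied from the asinfo output in this article.
response = ("roster=null:pending_roster=null:"
            "observed_nodes=BB9060011AC4202,BB9050011AC4202,"
            "BB9040011AC4202,BB9030011AC4202")
print(build_roster_set("bar", response))
```

The resulting string would be passed to asinfo -v, followed by the recluster command as shown above. Review the node list before applying it, since the roster defines the intended cluster membership.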
Notes
- The Node Not Found For Partition error could also indicate a tending error.
- It is not mandatory to include every observed node in the roster; however, if any node is left out, the reasoning should be well understood.
Keywords
NODE NOT FOUND FOR PARTITION AQL CLIENT STRONG CONSISTENCY ROSTER
Timestamp
March 2020