Inconsistency in aero: throw a warning when client isn't able to reach all the nodes in a cluster


#1

Hello again,

We are currently fighting a new scan bug. It seems that scan, AMC and the Java client somehow get out of sync. To be honest, I don't have any clue what we did to create this issue, and we were not able to reproduce the steps that led to it. Anyway, the issue seems to be persistent with our current setup, so maybe the devs can reproduce it a bit better or have a clue what's up.

We are using scans more or less successfully in our pre-production environment. We got them running and know how to handle them. However, under circumstances we cannot reproduce (at least we can't provide example code, as we normally do), aero ended up in an inconsistent state.

The issue: We implemented our own hash map in Java to persist key-value pairs in aero. All key-value entries are written to namespace “maps”, with the prefix “AeroMap_” plus a suffix for each key (single-bin). To iterate over all keys of our hash map, we run a scan job.

So in our current environment we have the following entries in aero:

AeroMap_a -> 1

If we now run a scan job for this set, aero doesn't return any records.

If we instead ask aero to read “AeroMap_a”, we get 1 back, as expected.

If we add additional values and have a set like this:

AeroMap_a -> 1 
AeroMap_b -> 2 
AeroMap_c -> 3

The scan returns records B and C. Record A is still missing, even though AMC reports 3 records for this set. The interesting part is that if we shut down one node, the scan job gives us the full set of keys.

Our setup: 2 nodes (3.5.9 Enterprise), latest AMC

Aerospike.conf for maps:

namespace maps {
        replication-factor 2
        memory-size 1G
        default-ttl 0
        single-bin true

        storage-engine device {
                file /opt/aerospike/data/maps.dat
                filesize 10G
                cold-start-empty true
        } 
}

We also took a backup via asbackup: here

And the binary (maps.dat) for both nodes: 1 and 2


#2

Sorry for the late reply, this is definitely an unexpected situation. I will re-raise this issue locally.


#3

We figured out that the cause is how we tunnel Aerospike client connections (over SSH). When the client is connected to only one node, the scan returns only the master records of that node, not the replicated ones. That is why we got the full set when we took one node offline. When connected to all nodes, the entire set is retrieved. This is undocumented behavior, maybe a bug, but definitely confusing and unexpected.

It should be possible to connect only to a local Aerospike node and retrieve the full set via a scan (not just the master records) for extra speed (as the network would not be involved for Consistency.ONE).


#4

It seems your client is not able to reach all the nodes of the cluster because of the way you tunnel the connections. We expect that clients/applications can connect to any node in the cluster. Given a seed node, our client driver discovers the rest of the nodes in the cluster. That is why it is important that the client is able to reach all the nodes of the cluster.

Some operations, like the KVS operations (get/put), can be proxied from one node to its master. But operations like scans, secondary index queries, and aggregations need to run on all the nodes of the cluster and return data from those nodes.
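To illustrate why a scan through a single reachable node comes back incomplete, here is a minimal toy model in plain Java. It does not use the Aerospike client at all; the class `ScanCoverageSketch`, its `Node` type, and the assignment of which node is master for which record are assumptions made up for this sketch. The only behavior it models is the one described above: each node answers a scan with its own master records, and the client sees the union over the nodes it can reach.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: no Aerospike classes are involved.
final class ScanCoverageSketch {

    /** A node that answers a scan with the records it is master for. */
    static final class Node {
        final String name;
        final Map<String, Integer> masterRecords = new TreeMap<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Union of the master records of every node the client can reach. */
    static Map<String, Integer> scan(Collection<Node> reachableNodes) {
        Map<String, Integer> result = new TreeMap<>();
        for (Node node : reachableNodes) {
            result.putAll(node.masterRecords);
        }
        return result;
    }

    /** Two nodes; the master assignment below is hypothetical. */
    static List<Node> twoNodeCluster() {
        Node node1 = new Node("node1");
        Node node2 = new Node("node2");
        node1.masterRecords.put("AeroMap_b", 2);
        node1.masterRecords.put("AeroMap_c", 3);
        node2.masterRecords.put("AeroMap_a", 1);
        return List.of(node1, node2);
    }

    public static void main(String[] args) {
        List<Node> cluster = twoNodeCluster();
        // Client tunneled to node1 only: AeroMap_a is missing.
        System.out.println(scan(cluster.subList(0, 1)));
        // Client reaches both nodes: all three records are returned.
        System.out.println(scan(cluster));
    }
}
```

With only `node1` reachable, the sketch returns two of the three records, which matches the symptom reported in the first post. A single get for “AeroMap_a” would still work in the real cluster because, as noted above, get/put can be proxied.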


#5

Exactly, that's what I was talking about! So the fault was definitely on our side. Anyway, it would be great if the Java client threw a warning, something like “wasn't able to reach all the nodes in the cluster”, instead of just going ahead and behaving as if there were no problem. If there had been a warning, we would have noticed that the issue was on our side.
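Until the client offers something like this, an application could run a check of its own before starting a scan. The following is only a sketch in plain Java: `ClusterReachabilityCheck`, `missingNodes` and `warning` are hypothetical names, and the list of expected nodes is assumed to come from the application's own configuration. How the list of reachable nodes is obtained from the client is left open here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical application-side helper, not part of the Aerospike client.
final class ClusterReachabilityCheck {

    /** Nodes that are expected (from configuration) but not reachable, sorted by name. */
    static List<String> missingNodes(Collection<String> expected,
                                     Collection<String> reachable) {
        Set<String> missing = new TreeSet<>(expected);
        missing.removeAll(reachable);
        return new ArrayList<>(missing);
    }

    /** A warning message if any expected node is unreachable, otherwise empty. */
    static Optional<String> warning(Collection<String> expected,
                                    Collection<String> reachable) {
        List<String> missing = missingNodes(expected, reachable);
        if (missing.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(
            "WARNING: wasn't able to reach all the nodes in the cluster, missing: " + missing);
    }

    public static void main(String[] args) {
        List<String> expected = List.of("node1", "node2");
        List<String> reachable = List.of("node1");
        // Print the warning to stderr if a node is missing.
        warning(expected, reachable).ifPresent(System.err::println);
    }
}
```

The drawback of this approach is that the application has to know the expected cluster membership up front, which a client that discovers nodes from a seed node normally does not require.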


#6

Agreed with your suggestion for a warning in such scenarios. We are discussing it internally. Please stand by.


#7

Please make sure the logging subsystem is turned on at INFO level. Node connectivity failures will be logged, such as:

Log.info("Node " + node + " refresh failed: " + Util.getErrorMessage(e));
Log.info("Add node " + node);
Log.info("Remove node " + node);

Thanks.


#8