In the EXPLAIN feature, what does the STATUS value mean?


#1

Can someone clarify what the STATUS value means when using the EXPLAIN feature? I have seen value of 0 and value of 2 returning in some of my runs. What those values mean? are there any other values possible for it? Also, what are the other possible values of POLICY, under what conditions? From documentation it only shows “AS_POLICY_REPLICA_MASTER”. What are the possible values of KEY_POLICY? From documentation it only shows “AS_POLICY_KEY_SEND”

Thanks, -wilson


#2

Hi Wilson,

Let me go through each of these points in turn:

Can someone clarify what the STATUS value means when using the EXPLAIN feature? I have seen value of 0 and value of 2 returning in some of my runs. What do these values mean? are there any other values possible for it?

When explain runs, it executes a GET on the record in question. The STATUS value is the result of that get. So if I have a database with one record in, with a PK of 1 and explain that record, I will get:

aql> explain select * from test.testset where pk = 1 [ { “SET”: “testset”, “DIGEST”: “F5 91 24 98 6E 96 AD 17 5B 37 4C 94 87 94 5B BC AD 53 7B 74”, “NAMESPACE”: “test”, “PARTITION”: 501, “STATUS”: 0, “UDF”: “FALSE”, “TIMEOUT”: 1000, “NODE”: “BB9A9EAAB270008”, “POLICY”: “AS_POLICY_REPLICA_MASTER”, “KEY_POLICY”: “AS_POLICY_KEY_SEND” } ]

The status of 0 is AEROSPIKE_OK. If I try to explain a record that doesn’t exist, I will get a status of 2 (AEROSPIKE_ERR_RECORD_NOT_FOUND):

aql> explain select * from test.testset where pk = 10 [ { “SET”: “testset”, “DIGEST”: “E0 CE E9 56 07 A3 F8 B2 CE 5E 87 70 70 7F 61 87 37 CD F6 2E”, “NAMESPACE”: “test”, “PARTITION”: 3808, “STATUS”: 2, “UDF”: “FALSE”, “TIMEOUT”: 1000, “NODE”: “BB9A9EAAB270008”, “POLICY”: “AS_POLICY_REPLICA_MASTER”, “KEY_POLICY”: “AS_POLICY_KEY_SEND” } ]

A full set of statuses can be found in the client code (for example in the ResultCode class in the Java client).

Also, what are the other possible values of POLICY, under what conditions? From documentation it only shows “AS_POLICY_REPLICA_MASTER”.

This is the read policy. There are two possible values that explain prints today: AS_POLICY_REPLICA_MASTER and AS_POLICY_REPLICA_ANY. Default is AS_POLICY_REPLICA_MASTER. You can set aql to go do operation roundrobin on all the replica (in case it is read, write always goes to master) by doing:

aql> set replica_any true aql> explain select * from test.testset where pk = 1 [ { “SET”: “testset”, “DIGEST”: “F5 91 24 98 6E 96 AD 17 5B 37 4C 94 87 94 5B BC AD 53 7B 74”, “NAMESPACE”: “test”, “PARTITION”: 501, “STATUS”: 0, “UDF”: “FALSE”, “TIMEOUT”: 1000, “NODE”: “BB9A9EAAB270008”, “POLICY”: “AS_POLICY_REPLICA_ANY”, “KEY_POLICY”: “AS_POLICY_KEY_SEND” } ]

What are the possible values of KEY_POLICY? From documentation it only shows “AS_POLICY_KEY_SEND”

This is whether the key or the digest was sent. If sending the primary key, it shows AS_POLICY_KEY_SEND, if sending the digest it shows AS_POLICY_KEY_DIGEST:

aql> explain select * from test.testset where digest=‘F59124986E96AD175B374C9487945BBCAD537B74’ [ { “SET”: “testset”, “DIGEST”: “F5 91 24 98 6E 96 AD 17 5B 37 4C 94 87 94 5B BC AD 53 7B 74”, “NAMESPACE”: “test”, “PARTITION”: 501, “STATUS”: 0, “UDF”: “FALSE”, “TIMEOUT”: 1000, “NODE”: “BB9A9EAAB270008”, “POLICY”: “AS_POLICY_REPLICA_ANY”, “KEY_POLICY”: “AS_POLICY_KEY_DIGEST” } ]

Hope this helps,

Tim


#3

Thanks for the answers Tim, appreciate it. Question about this:

Does it mean that if I set that “set replica_any true” in aql, will I get answers from either master or replica? Is there a way to return the NodeId of where the replica lives? as I understand the NodeId in this answer only relates to the Master object, reason why it does not change no matter how many iterations of the same statement we send in aql


#4

Hi Wilson,

Yes, correct. If you invoke “set replica_any true”, your read will sometimes be executed against the master and sometimes against the replica. The node id that is printed is the one on which the read occurred, not necessarily the Master object.

Try placing multiple explains in a file such as test.aql as well as the “set replica_any true” and running this file with aql -f test.aql. It may be necessary to run this file multiple times in different processes.

Hope this helps,

Tim


#5

Thanks Tim. I will try this. In any case, is there a script or function we can use to determine where the replica of a given master object lives? just for troubleshooting purposes. In our evaluation of A^ we want to exercise different scenarios that for sure will trigger migrations and we want to track some sample data and see how master and replicas move around. Do you have a sample code or function that could determine where both pieces are based on a pk? Something simple like “aql select” only on the node where it is executed would suffice, this way we can query each node of the cluster until getting both success answers.


#6

Hi Wilson,

There isn’t anything built into Aerospike to do this, but given the client knows where the data lives, and the client is open source, it’s pretty easy to put something together. I’ve put a Java function down below which will print out node information and partition information. (Note that this is NOT production ready code!) You can get the partition id from the AQL explain command discussed above

This code runs fine on 3.6.2, but I’m not sure how it will run on older versions. I’ve got a hard-coded limit of 2 replicas in there but you can change this to your site’s actual replication factor. Sample output is:

*** Nodes ***
	Node BB9E7F1FD270008
		Address: /172.28.128.4:4000
		Host: 172.28.128.4:4000
	Node BB9633441270008
		Address: /172.28.128.5:4000
		Host: 172.28.128.5:4000
	Node BB92E78C6270008
		Address: /172.28.128.3:4000
		Host: 172.28.128.3:4000

*** Partitions ***
	0: BB9633441270008 BB9E7F1FD270008 
	1: BB9633441270008 BB92E78C6270008 
	2: BB9633441270008 BB9E7F1FD270008 
	3: BB92E78C6270008 BB9E7F1FD270008 
	4: BB92E78C6270008 BB9633441270008 
	5: BB9E7F1FD270008 BB92E78C6270008 
	6: BB9633441270008 BB92E78C6270008 
        ...

Hope this helps,

Tim

private void dumpClusterInfo() {
	try {
		// Assume a replication count of 2 for the sake of this exercise
		int REPLICATION_COUNT = 2;
		String[][] fullMap = new String[4096][REPLICATION_COUNT];
		
		System.out.println("*** Nodes ***");
		for (Node node : client.getNodes()) {
			System.out.println("\tNode " + node.getName());
			System.out.println("\t\tAddress: " + node.getAddress());
			System.out.println("\t\tHost: " + node.getHost());

			com.aerospike.client.cluster.Connection conn = node.getConnection(120);
			boolean requestProleReplicas = true;
			HashMap<String, AtomicReferenceArray<Node>[]> map = new HashMap<String,AtomicReferenceArray<Node>[]>();
			
			PartitionParser parser = new PartitionParser(conn, node, map, Node.PARTITIONS, requestProleReplicas);	
			HashMap<String,AtomicReferenceArray<Node>[]> partitionMap = parser.getPartitionMap();
			
			for (AtomicReferenceArray<Node>[] value : partitionMap.values()) {
				int j = 0;
				for (AtomicReferenceArray<Node> ref : value) {
					for (int i = 0; i < ref.length(); i++) {
						Node thisNode = ref.get(i);
						if (thisNode != null) {
							fullMap[i][j] = thisNode.getName();
						}
					}
					j++;
				}
			}
		}
		System.out.println("\n*** Partitions ***");
		for (int i = 0; i< 4096;i++) {
			String partNodes = "\t" + i + ": ";
			for (int j = 0; j < REPLICATION_COUNT; j++) {
				if (fullMap[i][j] == null) {
					partNodes += "? ";
				}
				else {
					partNodes += fullMap[i][j] + " ";
				}
			}
			System.out.println(partNodes);
		}

	} catch (Exception e) {
		// TODO Auto-generated catch block
		e.printStackTrace();
	}

}

#7

Thanks a lot Tim. I will give it a try.