Amazon EC2 unicast/mesh cluster problem


#1

Originally posted by srknc on Mon Aug 04, 2014 10:07 am

Hi,

I’m exploring an Aerospike cluster on Amazon EC2 and having some trouble with durability. Because AWS EC2 doesn’t support multicast, I’ve configured the instances to use unicast (mesh). As I understand from the limited documentation, there are two methods to configure unicast:

  • single seed configuration
  • ring seed configuration.

To avoid a possible problem when removing the first seed node, I’ve decided to use the ring seed method. Configuration below (heartbeat sub-stanza of the network stanza):

[server1]
heartbeat {
    mode mesh
    port 3002
    mesh-address [server2 address]
    mesh-port 3002
    interval 150
    timeout 50
}

[server2]
heartbeat {
    mode mesh
    port 3002
    mesh-address [server3 address]
    mesh-port 3002
    interval 150
    timeout 50
}

[server3]
heartbeat {
    mode mesh
    port 3002
    mesh-address [server4 address]
    mesh-port 3002
    interval 150
    timeout 50
}

[server4]
heartbeat {
    mode mesh
    port 3002
    mesh-address [server1 address]
    mesh-port 3002
    interval 150
    timeout 50
}
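As an aside: depending on the server version, the heartbeat stanza can list several seed nodes via repeated mesh-seed-address-port entries instead of a single mesh-address, which avoids both the single-seed fragility and the ring dependency, since any one reachable seed is enough to join. A sketch with placeholder addresses (verify the directive name and syntax against the docs for your release):

```
heartbeat {
    mode mesh
    port 3002
    # list every other node as a seed; any reachable one is enough to join
    mesh-seed-address-port [server1 address] 3002
    mesh-seed-address-port [server2 address] 3002
    mesh-seed-address-port [server3 address] 3002
    mesh-seed-address-port [server4 address] 3002
    interval 150
    timeout 10
}
```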

With this configuration, I started the services one by one (starting from server1), and it appears to work as expected.

I configured the PHP driver and developed some basic code to create a constant stream of write requests against server3.

When I stopped the Aerospike service on server2, the Aerospike service on server3 (the one the PHP code was running on) refused connections for ~5 seconds. All servers detected the new cluster structure (I saw a CLUSTER SIZE = 3 line in the log file) and then server3 started accepting connections again. So I lost data for ~5 seconds.

I decided to run another test and simply restarted the Aerospike service on server4; the server2 Aerospike log entries are below:

Aug 04 2014 16:33:20 GMT: INFO (partition): (partition.c::2834) CLUSTER SIZE = 4
Aug 04 2014 16:33:35 GMT: INFO (paxos): (paxos.c::2598) SINGLE NODE CLUSTER!!!
Aug 04 2014 16:33:35 GMT: INFO (partition): (partition.c::2834) CLUSTER SIZE = 1
Aug 04 2014 16:33:49 GMT: INFO (partition): (partition.c::2834) CLUSTER SIZE = 2
Aug 04 2014 16:33:51 GMT: INFO (partition): (partition.c::2834) CLUSTER SIZE = 3

I didn’t check data consistency, but I’m sure I’ve lost lots of records.

Now there are no read or write requests; I restarted asd on server4 3 minutes ago. The server2 logs and errors are below:

WARNING (hb): (hb.c::1500) cf_socket_sendto() failed 2
INFO (cf:socket): (socket.c::176) sendto() failed: 11 Resource temporarily unavailable

To work around the problem I restarted the instance behind server2, and now the nodes report an integrity error:

Aug 04 2014 16:50:54 GMT: INFO (paxos): (paxos.c::2207) CLUSTER INTEGRITY FAULT

Network usage during the problem:

server1: 21 Mbps out, 52 Mbps in
server2: 155 Mbps out, 26 Mbps in
server3: 17 Mbps out, 65 Mbps in
server4: 20 Mbps out, 58 Mbps in

In summary, I was just testing durability with a basic configuration, and the only thing I did was restart the services while giving them time to repair the cluster. In my opinion the mesh (unicast) heartbeat method is not production ready. If I’m missing a point, or if you have a suggestion to make the cluster more durable, please share your opinions, because I really would like to use the product.

Thank you.


#2

Hi,

It looks like your heartbeat configuration is responsible for the 5+ second delay.

You currently have the following settings in the heartbeat stanza:

interval 150
timeout 50

  • interval = the interval, in milliseconds, at which heartbeats are sent. Default is 100 ms.
  • timeout = the number of missed heartbeats after which the remote node will be declared dead. This setting provides some tolerance for momentary network glitches. Default is 10.

So in this case, for server3 to learn that server2 is down, it will take 150 ms × 50 = 7.5 seconds. You may be able to tweak the settings to obtain behavior that satisfies your requirements. The application should also be designed with this timing in mind.
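The arithmetic above is simply interval × timeout. A quick sketch, using the thread’s settings and the defaults quoted in this reply (check the documentation for the defaults in your server version):

```shell
# Detection time (ms) = interval (ms) × timeout (missed heartbeats allowed)

# The thread's settings: 150 ms interval, timeout of 50 beats
echo $((150 * 50))   # prints 7500, i.e. 7.5 s before the node is declared dead

# The defaults quoted above: 100 ms interval, timeout of 10 beats
echo $((100 * 10))   # prints 1000, i.e. 1 s
```

Lowering either value detects failures faster but makes the cluster more sensitive to momentary network glitches, which matters on shared EC2 networking.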

Hope this helps.


#3

Does this configuration work regardless of cluster size? What about if we have more nodes, let’s say 50 …

Which configuration would you prefer? The instance type is Amazon EC2 i2.xlarge.

Does RAM or disk usage, or load on the boxes, have any effect on this configuration?


#4

It all depends on network reliability and on how quickly you would like the cluster to detect a node being down, versus riding out temporary network instabilities during which you would prefer the cluster to stay formed.

The only thing the number of nodes would impact, especially in environments where the network may be less reliable/robust, is the increased chance of hitting network reliability issues.


#5

Is there any way I can change the value of the mesh heartbeat timeout dynamically? Currently we are using the default values.

asinfo -v 'get-config:context=network'

This one does not work, whereas this command does:

asinfo -v 'get-config:context=service'


#6

The default for paxos-max-cluster-size is 32. Increase this when going beyond 32 nodes.
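For example, for the 50-node case asked about above, the setting goes in the service stanza. A sketch (the value 64 is an illustrative assumption, not a recommendation; this setting is static, so it must be set consistently on all nodes before the cluster forms):

```
service {
    paxos-max-cluster-size 64   # example value for a ~50-node cluster
    # ... other service settings ...
}
```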


#7

Good point; it is important to increase paxos-max-cluster-size when going beyond 32 nodes.

Regarding the heartbeat settings, the timeout and interval can be changed dynamically. The context for those is actually network.heartbeat:

$ asinfo -v "get-config:context=network.heartbeat" -l 
heartbeat-mode=mesh
heartbeat-protocol=v2
heartbeat-address=172.31.31.64
heartbeat-port=3002
heartbeat-interval=150
heartbeat-timeout=20
$ asinfo -v "get-config:" -l  | grep heart
heartbeat-mode=mesh
heartbeat-protocol=v2
heartbeat-address=172.31.31.64
heartbeat-port=3002
heartbeat-interval=150
heartbeat-timeout=20
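For completeness, a dynamic change would look something like the line below. This is a sketch only: set-config is a real asinfo command, but the exact context and parameter spelling vary across server releases (the names here mirror the get-config output above), so verify against the documentation for your version before relying on it.

```
$ asinfo -v "set-config:context=network.heartbeat;heartbeat-timeout=20"
```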