Problems Configuring Clustering on AWS EC2 with 3 DB Instances


#1

I am encountering some problems when configuring my database cluster. I have three 16-core instances with 4 NICs each. They are clustered in a ring network: node 1 -> node 2 -> node 3 -> node 1.

Below are the configurations:

# [server 1]
heartbeat {
    mode mesh
    port 3002 # Heartbeat port for this node.

    # List one or more other nodes, one ip-address & port per line:
    mesh-seed-address-port 172.31.59.220 3002
    interval 250
    timeout 20
}

# [server 2]
heartbeat {
    mode mesh
    port 3002 # Heartbeat port for this node.

    # List one or more other nodes, one ip-address & port per line:
    mesh-seed-address-port 172.31.59.230 3002
    interval 250
    timeout 20
}

# [server 3]
heartbeat {
    mode mesh
    port 3002 # Heartbeat port for this node.

    # List one or more other nodes, one ip-address & port per line:
    mesh-seed-address-port 172.31.59.210 3002
    interval 250
    timeout 20
}

I configured each server with the network optimizations suggested in http://www.aerospike.com/docs/deploy_guides/aws/tune/

Next, I started the servers in order: node 1 -> node 2 -> node 3. However, I do not see a 3-node cluster when monitoring Aerospike with the asmonitor info command.

I would like to know if I have configured anything incorrectly…

The network cards and the nodes they belong to are listed below:

[node 1] 172.31.59.210-217 (8 cards)
[node 2] 172.31.59.220-223 (4 cards)
[node 3] 172.31.59.230-233 (4 cards)

Lastly, are there any special configurations that need to be made on the (querying) clients to scale out the reads? My intention is to scale reads by adding nodes to the cluster. When clustered, are clients free to query through any of the interfaces (16 in my case) associated with Aerospike? How does Aerospike scale reads?

Any help or explanation is greatly appreciated!


#2

Hi Lzuwei,

For the point about the nodes that were not forming a cluster, I’m sure you have already done this, but would you verify that IPtables are disabled on all nodes in the cluster?

http://www.aerospike.com/docs/operations/troubleshoot/cluster/#iptable-configuration

I only mention it because IPtables have caused problems for me in the past.

I would recommend that each node in the cluster contains mesh-seed-address-port entries for each other node in the cluster: Node 1 knows about Nodes 2 and 3. Node 2 knows about Nodes 1 and 3. Node 3 knows about Nodes 1 and 2. This way, if Node 1 is starting, and Node 2 is down, Node 1 can find Node 3.
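As a rough sketch of that recommendation, the heartbeat stanzas could look like the following. It reuses the port, interval, and timeout values from the original post and assumes each node's heartbeat is reachable on its first listed address (172.31.59.210, 172.31.59.220, 172.31.59.230); adjust the addresses if a different NIC carries heartbeat traffic.

```
# [server 1] seeds: node 2 and node 3 (addresses assumed from the original post)
heartbeat {
    mode mesh
    port 3002
    mesh-seed-address-port 172.31.59.220 3002
    mesh-seed-address-port 172.31.59.230 3002
    interval 250
    timeout 20
}

# [server 2] seeds: node 1 and node 3
heartbeat {
    mode mesh
    port 3002
    mesh-seed-address-port 172.31.59.210 3002
    mesh-seed-address-port 172.31.59.230 3002
    interval 250
    timeout 20
}

# [server 3] seeds: node 1 and node 2
heartbeat {
    mode mesh
    port 3002
    mesh-seed-address-port 172.31.59.210 3002
    mesh-seed-address-port 172.31.59.220 3002
    interval 250
    timeout 20
}
```

With two seeds per node, the start order should matter less, since a node that cannot reach one seed can still try the other.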

Are there any special client configurations required for querying clients? Provided that the clients can reach the nodes in the cluster, no special client configuration is required.

Each client connects to the cluster. Each client uses Aerospike Smart Partitions to maintain a local map of each of the 4096 partitions in the cluster. When the client reads data from the cluster, it consults the partition map to find out which node holds that record. The client then connects to that node to perform the query.

Aerospike does not have a preference for the NIC that the client uses to connect to the cluster. I would recommend, however, that each client know about each node in the cluster. If the client only has one address, it will connect to that one node and then learn about the other nodes in the cluster. But if that one node is not available when the client tries to connect, the client has no other address to try.

I’m sure you have already seen this, but just in case, the following document provides an overview of the client architecture:

http://www.aerospike.com/docs/architecture/clients.html

Would you let me know if this resolves your questions?

Thank you for your time, and I hope this helps,

-DM


#3

Hi Dave,

Many thanks for the detailed explanation!

I managed to get the cluster up, but I had to make a few modifications to the configuration above. In short, I configured two things, one under the service stanza and one under the heartbeat stanza.

I added access-address in the service stanza to indicate the address exposed to clients, and under the heartbeat stanza I added address to indicate the address on which to send and receive heartbeats. After making these changes and restarting the service, I was able to form the cluster.

I will post my updated configuration below in case anyone wants to refer to it.

# [server 1]
service {
    address any                     # IP of the NIC on which the service is listening.
    port 3000                       # Port on which the service is listening.
    access-address 172.31.59.210    # IP address exported to clients that access the service.
}

heartbeat {
    mode mesh
    address 172.31.59.210           # IP of the NIC on which this node listens for heartbeats.
    port 3002                       # Heartbeat port for this node.

    # List one or more other nodes, one ip-address & port per line:
    mesh-seed-address-port 172.31.59.220 3002    # Next seed node in the cluster.
    interval 250
    timeout 20
}

# similar configuration for the other nodes
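
For illustration, server 2's stanzas might look like the sketch below. It assumes that 172.31.59.220 is the NIC used for both client access and heartbeat on node 2, and that node 2 keeps node 3 (172.31.59.230) as its seed, as in the first post; neither choice is confirmed in this thread.

```
# [server 2] (sketch; addresses are assumptions based on the NIC list in post #1)
service {
    address any
    port 3000
    access-address 172.31.59.220    # Assumed client-facing address of node 2.
}

heartbeat {
    mode mesh
    address 172.31.59.220           # Assumed heartbeat address of node 2.
    port 3002

    mesh-seed-address-port 172.31.59.230 3002    # Node 3 as the seed.
    interval 250
    timeout 20
}
```

Server 3 would follow the same pattern with its own address and with node 1 as its seed.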