Hello, I’m trying to set up clustering (multiple nodes) in order to ensure more resiliency and reliability (replication).
Beyond the clustering docs I have found, I can’t seem to find any step-by-step guide to setting up multiple nodes.
Thanks
Hi Nicholas,
Glad to hear that you’re trying to set up multiple nodes. To answer your question, it depends on whether you’re using Multicast or Mesh.
Multicast Example
All nodes have to have the same multicast IP address and port to form the cluster.
Node 1:
heartbeat {
    mode multicast
    address 239.1.99.222
    port 9918
    interval 150
    timeout 10
    ...
Node 2:
heartbeat {
    mode multicast
    address 239.1.99.222
    port 9918
    interval 150
    timeout 10
    ...
Node 3:
heartbeat {
    mode multicast
    address 239.1.99.222
    port 9918
    interval 150
    timeout 10
    ...
Mesh Example
Set address to the IP of the local interface intended for intracluster communication. This setting also controls the interface that fabric will use.
Set mesh-seed-address-port to the IP address and heartbeat port of a node in the cluster.
Node 1:
heartbeat {
    mode mesh
    port 3002
    address 192.168.1.10
    mesh-seed-address-port 192.168.1.10 3002 # OK to have local node
    mesh-seed-address-port 192.168.1.11 3002
    mesh-seed-address-port 192.168.1.12 3002
    interval 150
    timeout 10
    ...
Node 2:
heartbeat {
    mode mesh
    port 3002
    address 192.168.1.11
    mesh-seed-address-port 192.168.1.10 3002
    mesh-seed-address-port 192.168.1.11 3002
    mesh-seed-address-port 192.168.1.12 3002
    interval 150
    timeout 10
    ...
Node 3:
heartbeat {
    mode mesh
    port 3002
    address 192.168.1.12
    mesh-seed-address-port 192.168.1.10 3002
    mesh-seed-address-port 192.168.1.11 3002
    mesh-seed-address-port 192.168.1.12 3002
    interval 150
    timeout 10
    ...
With the default settings, a node will be aware of another node leaving the cluster within 1.5 seconds (interval 150 * timeout 10 = 1500 ms, i.e. 1.5 seconds).
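For example, keeping the same timeout of 10 but raising interval to 250 would stretch detection to roughly 250 * 10 = 2500 ms, or 2.5 seconds.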
Clustering: https://www.aerospike.com/docs/architecture/clustering.html#heartbeat
Network Heartbeat Configuration: https://www.aerospike.com/docs/operations/configure/network/heartbeat/#network-heartbeat-configuration
Let me know if that helps!
Ah! Thanks. After reading, I have a few recommendations and questions (please excuse me ahead of time, I haven’t set up clustering for anything before):
Thanks for your help.
For a non-cloud environment, multicast is normally the easiest to set up. Since most of our customers run with multicast, it is the most hardened option. That said, it has been some time since I last saw a stability issue with mesh. Mesh has improved significantly since 3.3.26, so at this point I am comfortable recommending either. Also, for many cloud environments, short of deploying your own network overlay, mesh is the only option. Our docs haven’t caught up with mesh no longer being treated as a second-class citizen.
You can run with 1, 2, or more nodes, depending on your needs.
It is optional, but recommended, to supply your client with a list of all of the hosts in the cluster. Otherwise, if the single seed node provided to the client were to fail and the client were restarted, it wouldn’t be able to discover the cluster, since the only node it bootstraps from is down.
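For example, a minimal Go sketch along those lines, assuming the aerospike-client-go package, the mesh example’s IPs, and the default service port 3000 (the client talks to the service port, not the heartbeat port 3002):

package main

import (
    "log"

    as "github.com/aerospike/aerospike-client-go"
)

func main() {
    // Seed the client with every node in the cluster so it can still
    // bootstrap even if one seed node happens to be down at startup.
    policy := as.NewClientPolicy()
    client, err := as.NewClientWithPolicyAndHost(policy,
        as.NewHost("192.168.1.10", 3000),
        as.NewHost("192.168.1.11", 3000),
        as.NewHost("192.168.1.12", 3000),
    )
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // The client discovers the rest of the cluster from any reachable seed.
    log.Println("cluster nodes:", client.GetNodeNames())
}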
The client takes care of the load balancing for you. The cluster creates a partition table and shares that information with the client. The client determines which node a write should go to by inspecting the partition table for the node that is master for a given partition, and then sends the write to that node. By default the policy is the same for reads, but some of our clients support reading from the replica, aka prole, node. If using the option to read from replicas, the client will read from the master or any replica for a given partition in a round-robin fashion.
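As a rough sketch of the read side in Go (the ReplicaPolicy field and MASTER_PROLES constant assume a recent aerospike-client-go release and may be named differently or absent in older versions; the namespace, set, and key here are made up purely for illustration):

package main

import (
    "log"

    as "github.com/aerospike/aerospike-client-go"
)

func main() {
    client, err := as.NewClientWithPolicyAndHost(as.NewClientPolicy(),
        as.NewHost("192.168.1.10", 3000))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Default: reads go to the partition's master, just like writes.
    readPolicy := as.NewPolicy()

    // Optional: also read from replicas (round-robin over master and proles).
    // Assumption: field/constant names follow a recent aerospike-client-go.
    readPolicy.ReplicaPolicy = as.MASTER_PROLES

    // Hypothetical namespace/set/key, purely for illustration.
    key, err := as.NewKey("test", "demo", "user-1")
    if err != nil {
        log.Fatal(err)
    }
    record, err := client.Get(readPolicy, key)
    if err != nil {
        log.Fatal(err)
    }
    log.Println("bins:", record.Bins)
}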
Got it. So I just configure my client via NewClientWithPolicyAndHost.
Ok, so I understand now that the client will load balance for me. But you mentioned some detail about the use of partitions. Are you just elaborating on how the data is replicated across nodes, or are you saying I have some additional configuration options to consider? To be clear on my use case, I simply want additional nodes for redundancy (if one goes down, all other nodes have the same full copy of all my data).
I was just elaborating.