Not sure that I understand this question. Typically the hb port is the same on all nodes in a cluster, irrespective of multicast/unicast. The common exception is when multiple nodes are homed on a single server.
What is being configured? The listening port or the sending port or both?
Just wanted to know if I needed to set each system to have a unique port number?
The listening port.
As long as the cluster-name setting isn’t working for you, each cluster will need to use a distinct heartbeat port.
Since I’m effectively trying to disable the heartbeat, I tried setting the heartbeat interval to a larger number to cut down on the multicast traffic, but that makes the db take a long time to start when I do a restart. That should be a background task.
Even though the service says it is Active, if I try to run “aql” it fails to find the db for a while. The length of time is related to the heartbeat interval.
When you tried to use the cluster-name setting to split the cluster, can you confirm that the heartbeat protocol was set to v3? And can you confirm that your server version is 3.10.x? Here is an example:

    heartbeat {
        mode multicast
        multicast-group 239.1.99.222
        port 9918
        address eth1
        protocol v3
        # To use unicast-mesh heartbeats, remove the 3 lines above, and see
        # aerospike_mesh.conf for alternative.
        interval 150
        timeout 10
    }

Hi Lucien, The version is 3.10.0.1, but the aerospike.conf did not have a protocol or address entry. My aerospike.conf only had namespace entries modified from the installed default conf.
-Dick
Is there any way to slow down or even prevent heartbeat traffic, with either multicast or mesh? When I set the interval to longer than 150 and do a restart of the service, it takes the db a long time to respond. Starting “aql” will fail to open aerospike for a period of time proportional to the interval value. Setting it to 500 means 20-30 seconds of delay before aql can open the db.
Heartbeat frequency is controlled by the following two settings: Interval and Timeout
What are your values for each?
Here are the default settings:
The delay is most likely due to cold-starting the database and loading the index from disk. Can you confirm what type of storage-engine you are using? The community edition of Aerospike would only cold start unless storage is in memory with no persistence.
Would you be able to post the output from the logs for the first 100 lines after a restart?
Community version. Storage engine memory. No persistence wanted. Yes, coldstart or service restart. The delay is only a few seconds when the interval is 150. But if the interval is 500, then the startup delay is 20-30 seconds. And if I set it to 10000, then the startup takes almost forever. Since I do not want a heartbeat between nodes, I would like to reduce the traffic as much as possible.
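As a rough sketch, this is how the two settings look in a heartbeat section, using the values from the example config earlier in this thread (interval 150, timeout 10). The comments work out how long a silent node is tolerated before it is timed out, based on the meaning of the two settings. The larger interval values are the ones tried in this thread, not recommendations.

```
heartbeat {
    mode multicast
    multicast-group 239.1.99.222
    port 9918
    interval 150   # milliseconds between heartbeats
    timeout 10     # number of missed intervals before a node is timed out
    # Time before a silent node is timed out = interval x timeout:
    #   interval 150   -> 150 ms x 10   = 1.5 s
    #   interval 500   -> 500 ms x 10   = 5 s
    #   interval 10000 -> 10000 ms x 10 = 100 s
}
```

Raising the interval therefore stretches every heartbeat-driven timing by the same factor, which is one reason it is a blunt tool for reducing traffic.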
If all you need are separate 1-node cluster instances, just change the multicast-group IP and the port (mode multicast) in the heartbeat section to make sure your nodes will not find each other.
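A minimal sketch of that approach, assuming two independent single-node instances. The second multicast-group address and port (239.1.99.223, 9919) are made-up values for illustration; any pair that differs from the first node's should do.

```
# aerospike.conf on node A (heartbeat section only)
heartbeat {
    mode multicast
    multicast-group 239.1.99.222
    port 9918
    interval 150
    timeout 10
}

# aerospike.conf on node B (heartbeat section only)
heartbeat {
    mode multicast
    multicast-group 239.1.99.223   # hypothetical: differs from node A
    port 9919                      # hypothetical: differs from node A
    interval 150
    timeout 10
}
```

Each node still sends multicast heartbeats with this setup; it just never sees a peer on its own group and port.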
You can also use the cluster-name as Kevin suggested, but as Lucien pointed out, you would then need to start the node with protocol v3 in your heartbeat section.
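A sketch of the cluster-name variant, assuming server 3.10.x and assuming cluster-name is set in the service context as in the 3.10 configuration reference. The name cluster_a is a hypothetical placeholder, and only the relevant lines of each section are shown.

```
service {
    cluster-name cluster_a   # hypothetical name; use a different one per cluster
}

network {
    heartbeat {
        mode multicast
        multicast-group 239.1.99.222
        port 9918
        protocol v3          # required for cluster-name, per the note above
        interval 150
        timeout 10
    }
}
```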
You don’t want to change the heartbeat interval/timeout for preventing nodes from joining in a cluster.
If all I want are single-nodes, why would I want heartbeats at all? Why would I want any heartbeat network traffic?
This isn’t a common request.
You may be able to set the heartbeat.protocol to none in your configuration, though starting in this mode isn’t part of our test matrix.
Sure, then simply change to heartbeat mode mesh and have only the node itself as seed node. You shouldn’t have any heartbeat traffic in that case. You may even be able to not specify any mesh-seed-address-port but you can just specify the node itself.

    heartbeat {
        mode mesh                                # Send heartbeats using the mesh (unicast) protocol
        port 3002                                # Port on which this node listens for heartbeats
        mesh-seed-address-port aa.bb.cc.dd 3002  # IP address of the seed node in the cluster;
                                                 # this IP happens to be the local node
        interval 150                             # Number of milliseconds between heartbeats
        timeout 10                               # Number of heartbeat intervals to wait before
                                                 # timing out a node
    }