Hello, all traffic (service + fabric) is going through eth0, even though fabric is configured to bind to and use eth1.
service {
    address any
    access-address 10.XXX.XXX.51
    port 3000
}
fabric {
    address 10.XXX.XXX.40
    port 3001
}
Could you please explain?
I’m happy to provide any kind of information; just ask, so we can make our troubleshooting effective.
Currently, fabric binds to the same interface that heartbeat is configured to use. See our Heartbeat Configuration documentation, specifically the interface-address parameter.
interface-address 192.168.1.100 # IP of the NIC to use to send out heartbeat
                                # and bind fabric ports
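As a sketch (using the addresses from this thread, purely illustrative), a mesh heartbeat stanza with the NIC pinned would look like:
heartbeat {
    mode mesh
    address 10.8.40.53
    port 3002
    mesh-seed-address-port 10.8.40.54 3002
    interval 150
    timeout 3
    interface-address 10.8.40.53 # NIC to send heartbeats from and bind fabric ports to
}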
Without the interface-address parameter, this heartbeat configuration:
heartbeat {
    mode mesh
    address 10.8.40.53
    port 3002
    mesh-seed-address-port 10.8.40.54 3002
    interval 150
    timeout 3
}
performs the following bind operations for port 3002:
tcp 0 0 10.8.40.53:3002 0.0.0.0:* LISTEN 22698/asd
tcp 0 0 10.8.40.53:3002 10.8.40.52:39183 ESTABLISHED 22698/asd
With the mentioned parameter it does the same. Without heartbeat's address parameter, it binds port 3002 to the address defined by service:
tcp 0 0 10.8.40.51:3002 0.0.0.0:* LISTEN 25063/asd
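For reference, listings like the ones above come from something along the lines of (netstat from net-tools; ss -tanp works similarly):
# -a all sockets, -n numeric addresses, -t TCP only, -p owning process
netstat -antp | grep 3002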
Result:
Data is still going through eth0 (RX 609.1 GB; TX 2.3 TB), while eth1 (RX 3.4 MB; TX 492.0 B) is completely untouched, even though the docs say that port is used for intra-cluster communication (migrates, replication, etc.).
Test case: Node A is under load and Node B is down; once Node A has all data loaded, Node B is brought back up to test replication.
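For reference, per-interface RX/TX counters like the ones quoted above can be read with, for example:
# Byte/packet counters per NIC, to see which interface actually carries the data
ip -s link show eth0
ip -s link show eth1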
Hm, that is strange.
Could you do an ip route get <ip> for both the remote private and public IPs?
Also, could you grep the logs for “cf:misc”? You should see a couple of lines like the following:
Apr 13 2015 22:19:57 GMT: INFO (cf:misc): (id.c::119) Node ip: 172.16.245.128
Apr 13 2015 22:19:57 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 172.16.245.128
When using an interface-address that is different from the address in service, these shouldn’t be the same.
First node
eth1
ip route get 10.8.40.53
local 10.8.40.53 dev lo src 10.8.40.53
cache <local>
eth0
ip route get 10.8.40.51
local 10.8.40.51 dev lo src 10.8.40.51
cache <local>
Apr 15 2015 04:46:43 GMT: INFO (cf:misc): (id.c::119) Node ip: 10.8.40.51
Apr 15 2015 04:46:43 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 10.8.40.53
Second node
eth0
ip route get 10.8.40.52
local 10.8.40.52 dev lo src 10.8.40.52
cache <local>
eth1
ip route get 10.8.40.54
local 10.8.40.54 dev lo src 10.8.40.54
cache <local>
Apr 15 2015 05:05:40 GMT: INFO (cf:misc): (id.c::119) Node ip: 10.8.40.52
Apr 15 2015 05:05:40 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 10.8.40.54
Ok, so this helps a bit, though I guess I wasn’t very clear on how to run the ip route get commands.
These lines tell me that the Aerospike process is binding correctly to both NICs. The client interface is bound to 10.8.40.51 and the cluster interface to 10.8.40.53 in this case.
Even though we are bound to these interfaces, we cannot change which interface Linux will use to route traffic, and it appears that all of your interfaces are on the same subnet.
If, from the first node, you run:
ip route get 10.8.40.52
ip route get 10.8.40.54
I think you will see that Linux will route both of these through the same interface.
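For illustration (hypothetical output, assuming eth0 holds the first matching route for the subnet), both commands would resolve via the same device:
10.8.40.52 dev eth0 src 10.8.40.51
    cache
10.8.40.54 dev eth0 src 10.8.40.51
    cache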
Typically if you are trying to separate client and cluster traffic you would have the client facing NIC on a network that the clients can route to and the cluster facing NIC on a network that the client facing NIC cannot route to.
Alternatively you could configure static routes.
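A minimal sketch of such a static route with iproute2, pinning the peer's cluster address to eth1 (addresses taken from this thread; adjust as needed):
# Host route: send traffic for the peer's heartbeat/fabric address out eth1,
# sourcing it from eth1's own address
ip route add 10.8.40.54/32 dev eth1 src 10.8.40.53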
Thank you for your time. Your previous answer gave me the nudge to check the routing table and see what you describe now. I was playing with static routes yesterday, but I finally decided to use a second node (server) from a different VLAN to make it easier. From my point of view, it could be very useful to mention something about routing in your docs, as a small nudge for other people.
Could you describe your environment and/or why you were in this situation?
Not sure how I would work such an issue into the current docs without understanding the reason for your situation.
That’s simple. I’m testing Aerospike in one of our testing environments, and every environment has its own VLAN for isolation. Therefore both interfaces are in the same VLAN, and the OS automatically routes traffic through the default gateway, i.e. eth0.
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.8.40.254     0.0.0.0         UG    100    0        0 eth0
10.8.40.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
10.8.40.0       0.0.0.0         255.255.255.0   U     0      0        0 eth1
At first I did not realize it; my mistake. I tried static routes defined in /etc/sysconfig/static-routes, like:
eth1 net 10.8.40.54 netmask 255.255.255.0 gw 10.8.40.254
but without success; it results in the same as above. I finally asked the network team to allow communication/cabling between the different VLANs. I will test as soon as possible and let you know, of course.