Fabric listener on different NIC


#1

Hello, all traffic (service + fabric) is going through eth0, even though fabric is configured to bind to and use eth1.

service {
        address any
        access-address 10.XXX.XXX.51
        port 3000
}
fabric {
        address 10.XXX.XXX.40
        port 3001
}

Could you please explain? I’m happy to provide any information you need; just ask, so that we can troubleshoot this effectively :wink:


Replication over a specific network interface
#2

Currently, fabric binds to the same interface that heartbeat is configured to use. See our Heartbeat Configuration documentation, specifically the interface-address parameter.

interface-address 192.168.1.100 # IP of the NIC to use to send out heartbeat
                                # and bind fabric ports

#3

Without the interface-address parameter, this configuration

heartbeat {
        mode mesh
        address 10.8.40.53
        port 3002
        mesh-seed-address-port 10.8.40.54 3002
        interval 150
        timeout 3
}

performs the following bind operations for port 3002:

tcp        0      0 10.8.40.53:3002         0.0.0.0:*               LISTEN      22698/asd       
tcp        0      0 10.8.40.53:3002         10.8.40.52:39183        ESTABLISHED 22698/asd

With the interface-address parameter it does the same. Without address, it binds port 3002 on the service IP instead:

tcp        0      0 10.8.40.51:3002         0.0.0.0:*               LISTEN      25063/asd

Result: Data is still going through eth0 (RX 609.1 GB; TX 2.3 TB), while eth1 (RX 3.4 MB; TX 492.0 B) is completely untouched, even though the docs say that this port is used for intra-cluster communication (migrates, replication, etc.).

Test case: Node A is under load while Node B is down. Once Node A has all the data loaded, Node B is brought up for the replication test.


#4

Hm, that is strange.

Could you run ip route get <ip> for both the private and the public IP of the remote node?

Also, could you grep the log for “cf:misc”? You should see a couple of lines like the following:

Apr 13 2015 22:19:57 GMT: INFO (cf:misc): (id.c::119) Node ip: 172.16.245.128
Apr 13 2015 22:19:57 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 172.16.245.128

When using an interface-address that is different from the address in service, these shouldn’t be the same.
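As a rough illustration of that comparison, the sketch below pulls the two addresses out of cf:misc lines and checks whether they match. The function name `cluster_addresses` and the regular expression are hypothetical helpers written for this example; they only cover the two message shapes quoted above, and the log format may differ between server versions.

```python
import re

# Sample lines copied from the log excerpt above.
LOG = """\
Apr 13 2015 22:19:57 GMT: INFO (cf:misc): (id.c::119) Node ip: 172.16.245.128
Apr 13 2015 22:19:57 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 172.16.245.128
"""

# Hypothetical pattern: matches only the two cf:misc messages shown above.
PATTERN = re.compile(r"\(cf:misc\).*?(Node ip|Heartbeat address for mesh): (\S+)")


def cluster_addresses(log_text):
    """Map each cf:misc label to the address reported on that line."""
    found = {}
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found


addrs = cluster_addresses(LOG)
same = addrs.get("Node ip") == addrs.get("Heartbeat address for mesh")
print(addrs, "-> same address" if same else "-> different addresses")
```

In this sample both lines report the same address, so the check reports a match; with a separate heartbeat NIC the two values would be expected to differ.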


#5

First node

eth1

ip route get 10.8.40.53
local 10.8.40.53 dev lo  src 10.8.40.53 
    cache <local> 

eth0

ip route get 10.8.40.51
local 10.8.40.51 dev lo  src 10.8.40.51 
    cache <local>

Apr 15 2015 04:46:43 GMT: INFO (cf:misc): (id.c::119) Node ip: 10.8.40.51
Apr 15 2015 04:46:43 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 10.8.40.53

Second node

eth0

ip route get 10.8.40.52
local 10.8.40.52 dev lo  src 10.8.40.52 
    cache <local> 

eth1

ip route get 10.8.40.54
local 10.8.40.54 dev lo  src 10.8.40.54 
    cache <local> 



Apr 15 2015 05:05:40 GMT: INFO (cf:misc): (id.c::119) Node ip: 10.8.40.52
Apr 15 2015 05:05:40 GMT: INFO (cf:misc): (id.c::265) Heartbeat address for mesh: 10.8.40.54

#6

OK, so this helps a bit, though I guess I wasn’t very clear on how to run the ip route get commands.

These lines tell me that the Aerospike process is binding correctly to both NICs. The client interface is bound to 10.8.40.51 and the cluster interface to 10.8.40.53 in this case.

Even though we are bound to these interfaces, we cannot change which interface Linux will use to route traffic, and it appears that all of your interfaces are on the same subnet.
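A quick way to see the same-subnet point is to compute the network each NIC address belongs to. This is only a sketch: the addresses are the ones from the first node in this thread, but the /24 prefix length is an assumption on my part, and `subnet_of` is a helper made up for the example.

```python
import ipaddress

# First node's addresses as posted earlier in the thread.
ADDRS = {"eth0": "10.8.40.51", "eth1": "10.8.40.53"}
PREFIX = 24  # ASSUMPTION: a /24 netmask on both NICs


def subnet_of(ip, prefix=PREFIX):
    """Return the network that an address with the given prefix length sits in."""
    return ipaddress.ip_interface(f"{ip}/{prefix}").network


networks = {nic: subnet_of(ip) for nic, ip in ADDRS.items()}
shared = len(set(networks.values())) == 1
print(networks, "-> same subnet" if shared else "-> different subnets")
```

If both NICs land in the same network, the kernel has no address-based reason to prefer one over the other for a given peer.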

If from the First node you run:

ip route get 10.8.40.52
ip route get 10.8.40.54

I think you will see that Linux will route both of these through the same interface.
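If it is easier to script, roughly the same question can be asked from Python. The sketch below uses a connected UDP socket, which sends no packets but makes the kernel perform a route lookup and choose a local source address. Note that it reports the source address, not the interface name, so it is a complement to ip route get rather than a replacement. The function name `local_source_for` is made up for this example, and the default port 3001 is simply the fabric port from this thread.

```python
import socket


def local_source_for(dest_ip, port=3001):
    """Return the local address the kernel would use to reach dest_ip.

    connect() on a UDP socket transmits nothing; it only triggers a route
    lookup and fixes the socket's source address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((dest_ip, port))
        return sock.getsockname()[0]


# Loopback is used here so the demo runs anywhere; on the first node you
# would pass 10.8.40.52 and 10.8.40.54 instead.
print("127.0.0.1 ->", local_source_for("127.0.0.1"))
```

If both remote addresses come back with the same local source address, the traffic to them is most likely leaving through the same interface.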

Typically, if you are trying to separate client and cluster traffic, you would put the client-facing NIC on a network that the clients can route to and the cluster-facing NIC on a network that the client-facing NIC cannot route to.

Alternatively, you could configure static routes.


#7

Thank you for your time. Your previous answer nudged me to check the routing table, where I saw what you describe now. I was playing with static routes yesterday, but I finally decided to use a second node (server) in a different VLAN to make it easier. From my point of view, it would be very useful to mention something about routing in your docs, as a small hint for other people :wink:


#8

Could you describe your environment and/or why you were in this situation?

Not sure how I would work such an issue into the current docs without understanding the reason for your situation.


#9

That’s simple. I’m testing Aerospike in one of our testing environments, and every environment has its own VLAN for isolation. Therefore both interfaces are in the same VLAN, and the OS automatically routes traffic through the default gateway, i.e. eth0.

0.0.0.0         10.8.40.254     0.0.0.0         UG    100    0        0 eth0
10.8.40.0       0.0.0.0         255.255.255.0   U     0      0        0 eth0
10.8.40.0       0.0.0.0         255.255.255.0   U     0      0        0 eth1

At first I did not realize it, my mistake. I tried static routes by defining them in /etc/sysconfig/static-routes, like this:

eth1 net 10.8.40.54 netmask 255.255.255.0 gw 10.8.40.254

but without success; the result was the same as above. Finally, I asked the network team to allow communication/cabling between different VLANs. I will test as soon as possible and let you know, of course.
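To make the routing behaviour in this post concrete, here is a simplified model of route selection applied to the three-entry routing table quoted above. It is a sketch, not the kernel's actual algorithm: it prefers the longest matching prefix, then the lowest metric, and it assumes that a remaining tie goes to the entry listed first. The function name `pick_interface` is made up for the example.

```python
import ipaddress

# The three entries from the routing table in this post, rewritten in CIDR
# form as (destination network, metric, interface), in the listed order.
ROUTES = [
    ("0.0.0.0/0", 100, "eth0"),
    ("10.8.40.0/24", 0, "eth0"),
    ("10.8.40.0/24", 0, "eth1"),
]


def pick_interface(dest_ip, routes=ROUTES):
    """Pick an outgoing interface for dest_ip under the simplified model."""
    dest = ipaddress.ip_address(dest_ip)
    candidates = []
    for order, (cidr, metric, iface) in enumerate(routes):
        network = ipaddress.ip_network(cidr)
        if dest in network:
            # Sort key: longest prefix, then lowest metric, then table order
            # (ASSUMPTION: ties go to the entry listed first).
            candidates.append((-network.prefixlen, metric, order, iface))
    return min(candidates)[3] if candidates else None


for dest in ("10.8.40.52", "10.8.40.54"):
    print(dest, "->", pick_interface(dest))
```

Under those assumptions both peer addresses resolve to eth0, which is consistent with the traffic counters reported earlier in the thread.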