Using multiple network cards


#1

How do I use multiple network cards with Aerospike?

In Amazon EC2, Aerospike runs best on Hardware Virtual Machine (HVM) instances that declare support for Enhanced Networking. Instances that use the Intel 82599 Virtual Function interface will have only one network card with a single network queue. The use of a single network queue may impact your software interrupts (SI) distribution per CPU core, as one network queue would be using a single CPU core. Bare-metal machines and Amazon EC2 instances that come with an Elastic Network Adapter (ENA) are multiqueue, but may still benefit from using additional network cards.
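To check whether a given interface is single-queue or multiqueue, one option is to count the rx-* and tx-* entries that Linux exposes under /sys/class/net/<iface>/queues. The following is a minimal sketch under that assumption; the helper name count_queues and its sysfs_root parameter are illustrative and not part of any Aerospike tool.

```python
import os


def count_queues(iface, sysfs_root="/sys/class/net"):
    """Return (rx, tx) queue counts for iface, read from the sysfs queues directory."""
    queues_dir = os.path.join(sysfs_root, iface, "queues")
    names = os.listdir(queues_dir)
    # Each receive queue appears as rx-N and each transmit queue as tx-N.
    rx = sum(1 for name in names if name.startswith("rx-"))
    tx = sum(1 for name in names if name.startswith("tx-"))
    return rx, tx
```

On a Linux node, count_queues("eth0") reports the queue counts for eth0; a single-queue interface would be expected to show one receive and one transmit queue.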

We can improve Aerospike network performance by separating different types of the Aerospike traffic by their respective ports, and binding that traffic to a specific network card. The goal would be to isolate client traffic from XDR traffic using different network cards over port 3000, and also separate mesh heartbeat traffic over port 3002 and fabric network traffic over port 3001. Each of these types of network traffic would use a specific network card of the instance.

In versions pre-3.10, you would need to add/modify access-address, alternate-address, heartbeat interface-address, and heartbeat address settings to use the different network interfaces depending on what you intend to do. Fabric and mesh heartbeat traffic must be on different ports of the same network card.

In versions 3.10+, you would need to add/modify access-address, alternate-access-address, heartbeat address, and fabric address settings. With heartbeat protocol v3 you would be able to use a separate NIC for each type of Aerospike traffic, including fabric and mesh heartbeat.

Network Traffic Isolation

Assume three (pre-3.10) or four (3.10+) network cards per node, and two clusters, A and B.

  • Network cards in nodes on cluster A labeled A_eth{0-3}
  • Network cards in nodes on cluster B labeled B_eth{0-3}

Traffic Isolation Table:

In version 3.10+:

|  NIC |          Setting         | Traffic Type | Port |
|------|--------------------------|--------------|------|
|A_eth0| access-address           | Client       | 3000 |
|A_eth1| alternate-access-address | XDR          | 3000 |
|A_eth2| heartbeat > address      | Heartbeat    | 3002 |
|A_eth3| fabric > address         | Fabric       | 3001 |
|B_eth0| access-address           | Client       | 3000 |
|B_eth1| alternate-access-address | XDR          | 3000 |
|B_eth2| heartbeat > address      | Heartbeat    | 3002 |
|B_eth3| fabric > address         | Fabric       | 3001 |

In pre-3.10:

|  NIC |       Setting     | Traffic Type | Port |
|------|-------------------|--------------|------|
|A_eth0| access-address    | Client       | 3000 |
|A_eth1| alternate-address | XDR          | 3000 |
|A_eth2| interface-address | Heartbeat    | 3002 |
|A_eth2| address           | Fabric       | 3001 |
|B_eth0| access-address    | Client       | 3000 |
|B_eth1| alternate-address | XDR          | 3000 |
|B_eth2| interface-address | Heartbeat    | 3002 |
|B_eth2| address           | Fabric       | 3001 |

How do I add another NIC on a running Aerospike cluster on an Amazon EC2 deployment?

The following is a step-by-step example for adding a new network interface to an EC2 instance. If you are on a node that already has multiple NICs, skip to the configuration section.

1) Get metadata from your EC2 instance (use commandline or AWS Console):

a) Check current network card stats on instance:

[ec2-user@ip-10-0-0-182 ~]$ ifconfig
eth0      Link encap:Ethernet  HWaddr 0E:F9:00:58:88:62  
          inet addr:10.0.0.182  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::cf9:ff:fe58:8862/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:56729 errors:0 dropped:0 overruns:0 frame:0
          TX packets:44089 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:25110351 (23.9 MiB)  TX bytes:5487096 (5.2 MiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:217 errors:0 dropped:0 overruns:0 frame:0
          TX packets:217 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:44332 (43.2 KiB)  TX bytes:44332 (43.2 KiB)

b) Retrieve the instance metadata:

[ec2-user@ip-10-0-0-182 ~]$ curl http://169.254.169.254/latest/dynamic/instance-identity/document
{
  "devpayProductCodes" : null,
  "privateIp" : "10.0.0.182",
  "availabilityZone" : "us-east-1a",
  "version" : "2010-08-31",
  "region" : "us-east-1",
  "instanceId" : "i-074831c705d8eb129",
  "billingProducts" : null,
  "instanceType" : "c4.xlarge",
  "accountId" : "268841430234",
  "architecture" : "x86_64",
  "kernelId" : null,
  "ramdiskId" : null,
  "imageId" : "ami-b239daa4",
  "pendingTime" : "2017-01-23T23:33:05Z"
}

c) Retrieve the security group:

[ec2-user@ip-10-0-0-182 ~]$ curl http://169.254.169.254/latest/meta-data/security-groups
AWSMPMyVPCCluster-71H99AZ60CK6

2) Use AWS Console to create new network card:

a) Click on NETWORK & SECURITY -> Network Interfaces

b) Click [Create Network Interface] button

c) Select the availability-zone matching the one your instance is using

d) Select the security group for your network

e) Click [Yes Create]

f) Select the newly created Network Interface and click [Attach]

g) Select InstanceID for your running instance and click [Attach]

Create other network cards as needed, to match the traffic isolation table above.

3) Create an Elastic IP for the XDR-dedicated network card:

a) Click on NETWORK & SECURITY -> Elastic IPs

b) Click [Allocate new address] button

c) Click on [Allocate] and note the address of the new Elastic IP

d) Click on [Associate Address] after selecting the new Elastic IP

e) Select Resource Type [Network interface], choose the Network Interface of the eth1 NIC of the instance, and its IP as the Private IP.

Repeat with the other nodes of the cluster.

4) Verify that new network cards have been added:

[ec2-user@ip-10-0-0-182 ~]$ ifconfig
eth0      Link encap:Ethernet  HWaddr 0E:F9:00:58:88:62  
          inet addr:10.0.0.182  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::cf9:ff:fe58:8862/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:57474 errors:0 dropped:0 overruns:0 frame:0
          TX packets:44785 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:25312145 (24.1 MiB)  TX bytes:5579362 (5.3 MiB)

eth1      Link encap:Ethernet  HWaddr 0E:0A:CC:FB:AD:8A  
          inet addr:10.0.0.29  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::c0a:ccff:fefb:ad8a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:9 errors:0 dropped:0 overruns:0 frame:0
          TX packets:27 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1128 (1.1 KiB)  TX bytes:2766 (2.7 KiB)

eth2      Link encap:Ethernet  HWaddr 0E:53:3F:CD:BB:02  
          inet addr:10.0.0.210  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::c53:3fff:fecd:bb02/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:830 (830.0 b)  TX bytes:2214 (2.1 KiB)

eth3      Link encap:Ethernet  HWaddr 0E:82:D3:7D:D4:BE  
          inet addr:10.0.0.87  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::c82:d3ff:fe7d:d4be/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:3 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:768 (768.0 b)  TX bytes:2058 (2.0 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:217 errors:0 dropped:0 overruns:0 frame:0
          TX packets:217 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:44332 (43.2 KiB)  TX bytes:44332 (43.2 KiB)

5) If Aerospike was already running, you can verify which IP addresses are listening on ports 3001 and 3002:

[ec2-user@ip-10-0-0-182 ~]$ netstat -anpt
(No info could be read for "-p": geteuid()=500 but you should be root.)
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:3000                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:3001                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:3002                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:3003                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:36300               0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:8082                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 10.0.0.182:44996            10.0.0.112:3001             ESTABLISHED -                   
tcp        0   1016 10.0.0.182:22               204.153.194.234:32501       ESTABLISHED -                   
tcp        0      0 10.0.0.182:3002             10.0.0.112:46318            ESTABLISHED -                   
tcp        0      0 10.0.0.182:3001             10.0.0.112:57388            ESTABLISHED -                   
tcp        0      0 10.0.0.182:3001             10.0.0.112:57390            ESTABLISHED -                   
tcp        0      0 :::22                       :::*                        LISTEN      -                   
tcp        0      0 :::34438                    :::*                        LISTEN      -                   
tcp        0      0 :::111                      :::*                        LISTEN      -                   
tcp        0      0 :::8081                     :::*                        LISTEN      -   

Configuration

Modify your aerospike.conf to use the multiple network cards available to your nodes.

In version 3.10+:

Add/modify the config parameters access-address, alternate-access-address, heartbeat address, and fabric address:

[ec2-user@ip-10-0-0-182 ~]$ sudo vi /etc/aerospike/aerospike.conf

network {
  service {
    address any # listen on any available interface for service traffic
    port 3000
    access-address 10.0.0.182 # eth0 for service calls
    alternate-access-address 34.197.253.11 # eth1 Elastic IP value for XDR
  }

  heartbeat {
    mode mesh
    protocol v3 # required for using separate NICs for mesh heartbeat and fabric
    address 10.0.0.210 # eth2 for mesh heartbeat
    port 3002

    mesh-seed-address-port 10.0.0.210 3002 # eth2 of this node
    mesh-seed-address-port 10.0.0.246 3002 # eth2 on node B

    interval 150 # number of milliseconds between heartbeats
    timeout 15   # number of heartbeats failing in a row before timing out
  }

  fabric {
    address 10.0.0.87 # eth3 for fabric communication
    port 3001
  }

  info {
    address 127.0.0.1
    port 3003
  }
}
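Before restarting, it can help to confirm that each traffic type in the edited file points at the address you intended. The following is a rough sketch, assuming the brace-delimited layout shown above; network_addresses is a hypothetical helper written for this example, not an Aerospike utility, and it only understands well-formed files with one setting per line.

```python
def network_addresses(conf_text):
    """Map 'context.setting' to its value for address settings in the network stanza."""
    wanted = {"address", "access-address", "alternate-access-address"}
    stack, found = [], {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        tokens = line.split()
        if tokens[-1] == "{":
            stack.append(" ".join(tokens[:-1]))  # entering a context
        elif tokens[0] == "}":
            stack.pop()  # leaving a context
        elif stack and stack[0] == "network" and tokens[0] in wanted and len(tokens) > 1:
            found[".".join(stack[1:] + [tokens[0]])] = tokens[1]
    return found


# Abbreviated sample using the example addresses from this article.
SAMPLE = """
network {
  service {
    address any
    access-address 10.0.0.182 # eth0
  }
  heartbeat {
    address 10.0.0.210 # eth2
  }
  fabric {
    address 10.0.0.87 # eth3
  }
}
"""

addrs = network_addresses(SAMPLE)
# Heartbeat and fabric should sit on different NICs in this layout.
print(addrs["heartbeat.address"] != addrs["fabric.address"])
```

In practice you would pass in the contents of /etc/aerospike/aerospike.conf instead of SAMPLE.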

In pre-3.10:

Add/modify access-address, alternate-address, heartbeat address, and interface-address, and ensure that every mesh-seed-address-port entry uses the new NIC.

Note that access-address, heartbeat address, and interface-address are static configuration parameters that can be updated with a rolling restart. Changing mesh-seed-address-port requires a full cluster shutdown. alternate-address would need to be set on the destination cluster configuration and bind to the public Elastic IP and network card.

[ec2-user@ip-10-0-0-182 ~]$ sudo vi /etc/aerospike/aerospike.conf


network {
    service {
        address any
        port 3000
        access-address 10.0.0.182 # eth0 of this node
        alternate-address 34.197.253.11 # eth1 Elastic IP value for XDR
    }

    heartbeat {
        mode mesh
        address 10.0.0.210 # eth2
        port 3002
        interface-address 10.0.0.210 # eth2 of this node
        mesh-seed-address-port 10.0.0.246 3002 # eth2 on node B
        interval 150
        timeout 15
    }

    fabric {
        port 3001
    }

    info {
        port 3003
    }
}

Post-Configuration Steps

1) Restart Aerospike:

[ec2-user@ip-10-0-0-182 ~]$ sudo service aerospike stop
Stopping aerospike:                                        [  OK  ]
[ec2-user@ip-10-0-0-182 ~]$ sudo service aerospike start
Starting and checking aerospike:                           [  OK  ]

2) Verify the new network interface IPs used by ports 3001 and 3002:

[ec2-user@ip-10-0-0-182 ~]$ netstat -anpt
(No info could be read for "-p": geteuid()=500 but you should be root.)
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:3000                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 10.0.0.87:3001              0.0.0.0:*                   LISTEN      -                   
tcp        0      0 10.0.0.210:3002             0.0.0.0:*                   LISTEN      -                   
tcp        0      0 127.0.0.1:3003              0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:36300               0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      -                   
tcp        0      0 0.0.0.0:8082                0.0.0.0:*                   LISTEN      -                   
tcp        0      0 10.0.0.182:55220            10.0.0.110:3001             ESTABLISHED -                   
tcp        0      0 10.0.0.87:3001              10.0.0.112:55970            ESTABLISHED -                   
tcp        0   1016 10.0.0.182:22               204.153.194.234:32501       ESTABLISHED -                   
tcp        0      0 10.0.0.210:3002             10.0.0.112:48740            ESTABLISHED -                   
tcp        0      0 10.0.0.87:3001              10.0.0.112:55972            ESTABLISHED -                   
tcp        0      0 :::22                       :::*                        LISTEN      -                   
tcp        0      0 :::34438                    :::*                        LISTEN      -                   
tcp        0      0 :::111                      :::*                        LISTEN      -                   
tcp        0      0 :::8081                     :::*                        LISTEN      -                   
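Rather than reading the netstat output by eye, the check can be scripted. The following is a minimal sketch, assuming the column layout shown above (protocol, queues, local address, foreign address, state); listening_addresses is an illustrative helper name and the default port list simply mirrors the ports used in this article.

```python
def listening_addresses(netstat_text, ports=(3000, 3001, 3002, 3003)):
    """Return {port: sorted local IPs in LISTEN state} parsed from `netstat -ant` text."""
    result = {port: set() for port in ports}
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, warnings, and anything that is not a listening TCP socket.
        if len(fields) < 6 or fields[0] not in ("tcp", "tcp6") or fields[5] != "LISTEN":
            continue
        host, _, port = fields[3].rpartition(":")
        if port.isdigit() and int(port) in result:
            result[int(port)].add(host)
    return {port: sorted(hosts) for port, hosts in result.items()}
```

Feeding it the text captured from netstat -ant on a node should show the fabric and heartbeat ports bound to the dedicated NIC addresses instead of 0.0.0.0 once the new configuration is active.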

3) Verify the IP addresses broadcast to Aerospike clients:

[ec2-user@ip-10-0-0-182 ~]$ asinfo -v service
10.0.0.182:3000
[ec2-user@ip-10-0-0-182 ~]$ asinfo -v services
10.0.0.112:3000
[ec2-user@ip-10-0-0-182 ~]$ asinfo -v services-alternate
34.197.64.209:3000

#2

Why not bond them together? Sorry to revive old thread


#3

Not every environment allows you to bond the network cards. In Amazon EC2 it’s simple to add ENIs and associate them with specific types of traffic on a given NIC and port.


#4

But if you had the option, would it be better to just bond them together and use the bond?


#5

It really depends on your traffic and usage pattern. Here’s another knowledge base article about Aerospike and Network Bonding. Not making a judgement call, both are valid decisions.


#6

FYI, link aggregation progress in Amazon EC2: https://aws.amazon.com/about-aws/whats-new/2017/02/aws-direct-connect-enables-link-aggregation-group-for-additional-aws-regions/


#7

Hi all, from what I know, link aggregation (and network bonding) does not really allow for bandwidth aggregation: every network connection will be able to use only one network link, thus the bandwidth for a single connection will be limited to that of a single network link. The bandwidth aggregation is then achieved by having multiple network connections spread across the aggregated links BUT, as far as I know, this happens according to the IP of the connection destination, i.e. if you have a 2-node cluster you won’t be able to aggregate bandwidth, because all the connections from one node to the other will be on the same network link.
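A toy model of that behaviour, as a sketch only: it assumes a transmit hash that looks at the source and destination IP addresses, and compares it with one that also mixes in the TCP ports. The XOR-modulo hash, the function names, and the sample connections are illustrative assumptions, not the exact algorithm of any particular switch or bonding driver.

```python
import ipaddress


def _ip_xor(src_ip, dst_ip):
    """XOR the two addresses as integers (illustrative hash input only)."""
    return int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))


def link_by_ip(src_ip, dst_ip, n_links):
    """Toy address-only hash: every connection between two hosts gets the same link."""
    return _ip_xor(src_ip, dst_ip) % n_links


def link_by_ip_and_port(src_ip, src_port, dst_ip, dst_port, n_links):
    """Toy per-connection hash: the ports also feed the link choice."""
    return (_ip_xor(src_ip, dst_ip) ^ src_port ^ dst_port) % n_links


# Hypothetical fabric connections from one node to its only peer on port 3001.
connections = [("10.0.0.182", port, "10.0.0.112", 3001) for port in (50000, 50001, 50002, 50003)]

ip_only = {link_by_ip(src, dst, 2) for src, _, dst, _ in connections}
with_ports = {link_by_ip_and_port(src, sp, dst, dp, 2) for src, sp, dst, dp in connections}
print("links used, address-only hash:", sorted(ip_only))
print("links used, address+port hash:", sorted(with_ports))
```

With an address-only hash, a 2-node cluster has a single address pair, so every connection lands on one link no matter how many connections exist.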

I now have a bandwidth problem in a 2-node cluster where the 1Gbit bandwidth gets saturated quite often. I already have 2 NICs in a network bond on each host. What I was thinking about is this: since the fabric traffic is already spread across multiple connections (a “netstat -antp | grep 3001” shows many connections between the two nodes), if Aerospike was able to use different fabric addresses for the various connections (for example, distributing the connections in a round-robin way) I could split the bonding, assign a different address to each NIC (on different VLANs), and let Aerospike distribute the connections on them (reaching a theoretical 2Gbit bandwidth).

Could this be a possible enhancement?

Thanks.

