Are you setting access-address in your config file?
I was able to improve performance by using --endpoint-mode dnsrr when creating the service. Otherwise, the default load balancer IP (10.0.1.2) gets added to the container’s eth0 network interface and could cause timeouts.
About the latency for the 20 inserts: could you run the following command to get the latency of your transactions and the IP addresses that are published by each node:
Are you running the test for the 100 inserts on a single docker container or still a cluster of 2 nodes?
Your service list still shows 2 published IP addresses: 172.18.0.3 and 10.0.0.x
Could you set the access-address setting in aerospike.conf to use the eth0 interface? Please also set address for both fabric and heartbeat:
network {
    service {
        address any
        port 3000
        # Uncomment the following to set the `access-address` parameter to the
        # IP address of the Docker host. This will allow the server to correctly
        # publish the address which applications and other nodes in the cluster
        # should use when addressing this node.
        access-address eth0
    }
    heartbeat {
        address eth0
        # mesh is used for environments that do not support multicast
        mode mesh
        port 3002
        # use asinfo -v 'tip:host=<ADDR>;port=3002' to inform cluster of
        # other mesh nodes
        interval 150
        timeout 10
    }
    fabric {
        address eth0
        port 3001
    }
    info {
        port 3003
    }
}
The show latency output shows that 20% of your writes take > 64ms. As rguo mentioned, the latency may be coming from your drive. It may be worth testing with SSD storage or in memory.
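To make the "20% of writes > 64ms" figure concrete, here is a minimal Python sketch of how such threshold percentages can be computed from raw per-write latencies. The 1, 8, and 64 ms thresholds are assumed to match the columns of the latency output; the sample values and the function name share_over are made up for illustration and are not taken from your cluster.

```python
def share_over(latencies_ms, thresholds_ms=(1, 8, 64)):
    """Return {threshold: percent of samples strictly above that threshold}."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    total = len(latencies_ms)
    return {
        t: 100.0 * sum(1 for v in latencies_ms if v > t) / total
        for t in thresholds_ms
    }


# Hypothetical latencies (ms) for 20 inserts; 4 of them exceed 64 ms.
samples = [0.4, 0.6, 0.5, 0.9, 2.0, 3.5, 0.7, 0.8, 70.0, 0.5,
           0.6, 90.0, 0.4, 1.5, 0.9, 120.0, 0.7, 0.3, 65.0, 0.8]

for threshold, pct in share_over(samples).items():
    print(f"> {threshold} ms: {pct:.1f}%")
```

With only 20 inserts, a handful of slow writes is enough to reach 20%, so it is worth repeating the measurement with a larger run before drawing conclusions about the drive.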
I was able to replicate the latency when address binding is not specified in the network service stanza and node-id-interface is not specified.
My recommendation would be to override the default setting for address and set node-id-interface, like so:
Sample of aerospike.conf :
service {
    user root
    group root
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    pidfile /var/run/aerospike/asd.pid
    service-threads 4
    transaction-queues 4
    transaction-threads-per-queue 4
    proto-fd-max 15000
    node-id-interface eth0
}
logging {
    # Log file must be an absolute path.
    file /var/log/aerospike/aerospike.log {
        context any info
    }
    # Send log messages to stdout
    console {
        context any info
    }
}
network {
    service {
        address eth0
        port 3000
        # Uncomment the following to set the `access-address` parameter to the
        # IP address of the Docker host. This will allow the server to correctly
        # publish the address which applications and other nodes in the cluster
        # should use when addressing this node.
        # access-address eth0
    }
    heartbeat {
        # address eth0
        # mesh is used for environments that do not support multicast
        mode mesh
        port 3002
        # use asinfo -v 'tip:host=<ADDR>;port=3002' to inform cluster of
        # other mesh nodes
        interval 150
        timeout 10
    }
    fabric {
        # address eth0
        port 3001
    }
    info {
        port 3003
    }
}
The main changes are that address in the network -> service section is set to eth0, and node-id-interface is also set to eth0 within the service stanza. Also remember to wait for migrations to finish before starting your test.
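If you want to double-check a config file for these two settings before restarting, something like the following Python sketch could help. It is only a rough stanza parser written for this post, not an official Aerospike tool; parse_conf, check_binding, and the sample text are hypothetical, and it assumes one setting per line with braces opening and closing stanzas as in the sample above.

```python
def parse_conf(text):
    """Parse brace-delimited stanzas into nested dicts (last value wins)."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            name = line[:-1].strip()
            child = stack[-1].setdefault(name, {})
            stack.append(child)
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        else:
            parts = line.split(None, 1)
            stack[-1][parts[0]] = parts[1] if len(parts) > 1 else ""
    if len(stack) != 1:
        raise ValueError("unclosed stanza")
    return root


def check_binding(conf):
    """Return a list of human-readable problems; empty means both are set."""
    problems = []
    if "node-id-interface" not in conf.get("service", {}):
        problems.append("service: node-id-interface is not set")
    address = conf.get("network", {}).get("service", {}).get("address")
    if address in (None, "any"):
        problems.append("network/service: address is not bound to an interface")
    return problems


# Hypothetical config fragment that still uses the default `address any`.
sample = """
service {
    node-id-interface eth0
}
network {
    service {
        address any   # default binding
        port 3000
    }
}
"""
print(check_binding(parse_conf(sample)))
```

For the sample aerospike.conf above, which sets both values to eth0, the returned list would be empty.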
docker@node2:~$ docker exec -ti 9822 asadm -e info
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Node                                             Node              Ip             Build       Cluster  Cluster       Cluster    Principal        Client  Uptime
.                                                Id                .              .           Size     Key           Integrity  .                Conns   .
9822d2386d22:3000                                BB90201000A4202   10.0.1.2:3000  C-3.14.1.1  2        D925CFD35434  True       BB90301000A4202  6       00:13:25
aerospike.2.2lnyut8p0we7xwgu7rbkfa9dn.prod:3000  *BB90301000A4202  10.0.1.3:3000  C-3.14.1.1  2        D925CFD35434  True       BB90301000A4202  5       00:13:25
Number of rows: 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace  Node                                             Avail%  Evictions  Master                Replica               Repl    Stop    Pending         Disk      Disk   HWM    Mem       Mem    HWM   Stop
.          .                                                .       .          (Objects,Tombstones)  (Objects,Tombstones)  Factor  Writes  Migrates        Used      Used%  Disk%  Used      Used%  Mem%  Writes%
.          .                                                .       .          .                     .                     .       .       (tx,rx)         .         .      .      .         .      .     .
test       9822d2386d22:3000                                99      0.000      (4.402 K, 0.000)      (4.204 K, 0.000)      2       false   (0.000, 0.000)  2.101 MB  1      50     1.092 MB  1      60    90
test       aerospike.2.2lnyut8p0we7xwgu7rbkfa9dn.prod:3000  99      0.000      (4.204 K, 0.000)      (4.402 K, 0.000)      2       false   (0.000, 0.000)  2.101 MB  1      50     1.092 MB  1      60    90
test                                                                0.000      (8.606 K, 0.000)      (8.606 K, 0.000)                      (0.000, 0.000)  4.202 MB                2.183 MB
Number of rows: 3
That said, I think your particular issue is that the 2-node cluster has cluster integrity problems because both nodes are using the same node ID. The changes to address and node-id-interface should help.
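A quick way to confirm whether two nodes ended up with the same node ID is to compare the Node Id column across nodes. Here is a small Python sketch of that check; duplicate_node_ids and the "broken" data are hypothetical, and the node/ID pairs are entered by hand rather than read from asadm.

```python
from collections import defaultdict


def duplicate_node_ids(nodes):
    """Map each node ID shared by more than one node to the nodes using it."""
    by_id = defaultdict(list)
    for name, node_id in nodes:
        # A leading '*' is a display marker in the asadm output, not part of the ID.
        by_id[node_id.lstrip("*")].append(name)
    return {nid: names for nid, names in by_id.items() if len(names) > 1}


# Node names and IDs copied from the asadm output above: no clash.
healthy = [
    ("9822d2386d22:3000", "BB90201000A4202"),
    ("aerospike.2.2lnyut8p0we7xwgu7rbkfa9dn.prod:3000", "*BB90301000A4202"),
]
# Hypothetical broken case: both nodes derived the same ID.
broken = [("node-a:3000", "BB90201000A4202"), ("node-b:3000", "BB90201000A4202")]

print(duplicate_node_ids(healthy))
print(duplicate_node_ids(broken))
```

An empty result for your own cluster would suggest the node-id-interface change took effect.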