Unable to create a 2-node cluster using the Aerospike Docker image

Hi team,

I am currently experimenting with the Aerospike Docker image (aerospike/aerospike-server) and I’m facing difficulties in setting up a simple 2-node cluster on my Mac. I am using Aerospike Community Edition build 6.3.0.2.

To recreate the issue, please follow these commands:

  1. Create a Docker network named “aerospike-network”
     docker network create aerospike-network
  2. Run the first Aerospike node container
     docker run -d --name aerospike-node1 --network aerospike-network aerospike/aerospike-server
  3. Run the second Aerospike node container
     docker run -d --name aerospike-node2 --network aerospike-network aerospike/aerospike-server

Next, you need to modify the configuration of both Docker containers. Here are the updated configurations:

Aerospike Node-1 Configuration:

service {
}

logging {
    console {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        mode mesh
        address any
        port 3002
        mesh-seed-address-port aerospike-node1 3002
        mesh-seed-address-port aerospike-node2 3002
        interval 150
        timeout 10
    }

    fabric {
        address local
        port 3001
    }
}

namespace test {
    replication-factor 2
    memory-size 1G
    default-ttl 30d
    storage-engine device {
        file /opt/aerospike/data/test.dat
        filesize 4G
        data-in-memory false
        write-block-size 128K
    }
}

Aerospike Node-2 Configuration:

service {
}

logging {
    console {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        mode mesh
        address any
        port 3002
        mesh-seed-address-port aerospike-node1 3002
        mesh-seed-address-port aerospike-node2 3002
        interval 150
        timeout 10
    }

    fabric {
        address local
        port 3001
    }
}

namespace test {
    replication-factor 2
    memory-size 1G
    default-ttl 30d
    storage-engine device {
        file /opt/aerospike/data/test.dat
        filesize 4G
        data-in-memory false
        write-block-size 128K
    }
}

After modifying the configurations, please restart both Aerospike containers.

Here is the log information:

May 24 2023 18:30:23 GMT: INFO (as): (as.c:382) initializing services...
May 24 2023 18:30:23 GMT: INFO (service): (service.c:167) starting 10 service threads
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:6793) added new mesh seed aerospike-node1:3002
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:6793) added new mesh seed aerospike-node2:3002
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:791) updated fabric published address list to {127.0.0.1:3001}
May 24 2023 18:30:23 GMT: INFO (partition): (partition_balance.c:203) {test} 4096 partitions: found 0 absent, 4096 stored
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:5523) updated heartbeat published address list to {10.0.4.101:3002}
May 24 2023 18:30:23 GMT: INFO (smd): (smd.c:2342) no file '/opt/aerospike/smd/UDF.smd' - starting empty
May 24 2023 18:30:23 GMT: INFO (batch): (batch.c:814) starting 2 batch-index-threads
May 24 2023 18:30:23 GMT: INFO (health): (health.c:318) starting health monitor thread
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:416) starting 8 fabric send threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 16 fabric rw channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 4 fabric ctrl channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 4 fabric bulk channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 4 fabric meta channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:442) starting fabric accept thread
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:6978) initializing mesh heartbeat socket: 10.0.4.101:3002
May 24 2023 18:30:23 GMT: INFO (fabric): (socket.c:818) Started fabric endpoint 127.0.0.1:3001
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:7008) mtu of the network is 1450
May 24 2023 18:30:23 GMT: INFO (hb): (socket.c:818) Started mesh heartbeat endpoint 10.0.4.101:3002
May 24 2023 18:30:23 GMT: INFO (nsup): (nsup.c:197) starting namespace supervisor threads
May 24 2023 18:30:23 GMT: INFO (service): (service.c:942) starting reaper thread
May 24 2023 18:30:23 GMT: INFO (service): (socket.c:818) Started client endpoint 0.0.0.0:3000
May 24 2023 18:30:23 GMT: INFO (service): (service.c:199) starting accept thread
May 24 2023 18:30:23 GMT: INFO (as): (as.c:421) service ready: soon there will be cake!
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:6344) removing self seed entry host:aerospike-node1 port:3002
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:6832) removed mesh seed host:aerospike-node2 port 3002
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:4376) found redundant connections to same node (bb96504000a4202) - choosing at random
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:8581) node arrived bb96404000a4202
May 24 2023 18:30:24 GMT: INFO (fabric): (fabric.c:2580) fabric: node bb96404000a4202 arrived
May 24 2023 18:30:25 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:25 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:27 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:27 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:29 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:29 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:31 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:31 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:32 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:32 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:168) NODE-ID bb96504000a4202 CLUSTER-SIZE 0 CLUSTER-NAME null
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:242)    cluster-clock: skew-ms 0
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:263)    system: total-cpu-pct 5 user-cpu-pct 3 kernel-cpu-pct 2 free-mem-kbytes 3448904 free-mem-pct 85 thp-mem-kbytes 8192
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:285)    process: cpu-pct 2 threads (9,60,29,29) heap-kbytes (1141700,1142268,1182208) heap-efficiency-pct 100.0
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:295)    in-progress: info-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:319)    fds: proto (0,0,0) heartbeat (1,3,2) fabric (24,24,0)
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:328)    heartbeat-received: self 2 foreign 65
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:354)    fabric-bytes-per-second: bulk (4,4) ctrl (2,2) meta (2,2) rw (19,19)
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:413) {test} objects: all 0 master 0 prole 0 non-replica 0
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:477) {test} migrations: complete
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:504) {test} memory-usage: total-bytes 0 index-bytes 0 set-index-bytes 0 sindex-bytes 0 used-pct 0.00
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:586) {test} device-usage: used-bytes 0 avail-pct 99 cache-read-pct 0.00
May 24 2023 18:30:34 GMT: INFO (hb): (hb.c:4376) (repeated:1) found redundant connections to same node (bb96504000a4202) - choosing at random
May 24 2023 18:30:34 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:34 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:36 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:36 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:38 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:38 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:40 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:40 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)

I also checked the netstat output but did not spot anything wrong:

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:42165        0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:3000            0.0.0.0:*               LISTEN      7/asd
tcp        0      0 127.0.0.1:3001          0.0.0.0:*               LISTEN      7/asd
tcp        0      0 10.0.4.101:3002         0.0.0.0:*               LISTEN      7/asd

Based on this information, could you please assist me in identifying the cause of the issue and guide me on how to successfully form the 2-node cluster? Any suggestions or recommendations would be highly appreciated.

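After several more attempts and a closer look at the logs, I identified the problem. It lay in the fabric section: with “address local”, the node published its fabric address as 127.0.0.1:3001 (see the “updated fabric published address list” log line and the 127.0.0.1:3001 listener in the netstat output above), so the other node could not reach it on port 3001. Heartbeats on port 3002 still got through, which is why each node saw the other arrive but no cluster formed. Once I replaced “local” with an address reachable on the Docker network (here, each container's hostname), the problem was resolved and the cluster formed as expected.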
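
For anyone hitting the same symptom, here is a minimal Python sketch that scans a server log for the published fabric address and flags loopback entries. The function names are hypothetical, and it assumes IPv4 entries and that multiple addresses in the list are comma-separated; the log line format is taken from the log above.

```python
import ipaddress
import re

# Matches the "published address list" log lines for fabric and heartbeat.
PUBLISHED_RE = re.compile(
    r"updated (fabric|heartbeat) published address list to \{([^}]*)\}"
)


def published_addresses(log_text):
    """Return {"fabric": [...], "heartbeat": [...]} with the hosts found in the log."""
    found = {"fabric": [], "heartbeat": []}
    for kind, addr_list in PUBLISHED_RE.findall(log_text):
        # Assumption: several addresses would be separated by commas.
        for entry in addr_list.split(","):
            entry = entry.strip()
            if entry:
                # Drop the trailing ":port" to keep only the host part.
                found[kind].append(entry.rsplit(":", 1)[0])
    return found


def loopback_fabric_addresses(log_text):
    """Return published fabric addresses that only the local container can reach."""
    result = []
    for host in published_addresses(log_text)["fabric"]:
        try:
            if ipaddress.ip_address(host).is_loopback:
                result.append(host)
        except ValueError:
            # Not an IP literal (for example a hostname); skip it.
            pass
    return result


sample = """\
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:791) updated fabric published address list to {127.0.0.1:3001}
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:5523) updated heartbeat published address list to {10.0.4.101:3002}
"""
print(loopback_fabric_addresses(sample))  # prints ['127.0.0.1']
```

A non-empty result means the node is advertising a fabric address that other containers cannot connect to, which matches the failure described above.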

Here is the updated configuration block:

Node-1

    fabric {
        address aerospike-node1
        port 3001
    }

Node-2

    fabric {
        address aerospike-node2
        port 3001
    }
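
Since the two nodes differ only in the fabric address, the stanza can be generated per node. The following is a small Python sketch under that assumption; `fabric_stanza` and the template are hypothetical helpers, not part of any Aerospike tooling.

```python
# Template for the fabric stanza that sits inside the network block.
FABRIC_TEMPLATE = """\
    fabric {{
        address {address}
        port {port}
    }}"""


def fabric_stanza(address, port=3001):
    """Render the fabric stanza for one node."""
    return FABRIC_TEMPLATE.format(address=address, port=port)


# Container names from the docker run commands above.
for node in ("aerospike-node1", "aerospike-node2"):
    print(f"# {node}")
    print(fabric_stanza(node))
```

Each rendered stanza replaces the original “address local” stanza in that node's configuration.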

Got the expected output:

root@fc6f1a797eed:/# asadm
Seed:        [('127.0.0.1', 3000, None)]
Config_file: /root/.aerospike/astools.conf, /etc/aerospike/astools.conf
Aerospike Interactive Shell, version 2.14.0

Found 2 nodes
Online:  10.0.4.108:3000, 10.0.4.107:3000
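
The cluster state can also be read from the server log: the ticker line reported CLUSTER-SIZE 0 while the nodes could not cluster. Below is a minimal Python sketch for pulling that value out of a log; the function name is hypothetical and the line format is taken from the log earlier in this post.

```python
import re

# Matches the periodic ticker line, e.g. "NODE-ID bb96504000a4202 CLUSTER-SIZE 0".
TICKER_RE = re.compile(r"NODE-ID (\S+) CLUSTER-SIZE (\d+)")


def latest_cluster_size(log_text):
    """Return (node_id, cluster_size) from the last ticker line, or None if absent."""
    matches = TICKER_RE.findall(log_text)
    if not matches:
        return None
    node_id, size = matches[-1]
    return node_id, int(size)


line = (
    "May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:168) "
    "NODE-ID bb96504000a4202 CLUSTER-SIZE 0 CLUSTER-NAME null"
)
print(latest_cluster_size(line))  # prints ('bb96504000a4202', 0)
```

On a working 2-node cluster the reported size would be expected to match the node count shown by asadm.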