This is running on two separate servers. If I run docker-compose down --rmi all and start the clusters, they seem to work and join, but they are somehow forever doing migrations (with no data?).
My problems are:
Somehow I can start a cluster with size 0?
Good chance of seeing the following being spammed in the logs: WARNING (hardware): (hardware.c:2262) failed to resolve mounted device /dev/md1: 2 (No such file or directory)
Restarting a single instance essentially breaks the cluster, and the instance no longer rejoins, citing:
skipping forming cluster - cannot form new cluster from pending join requests (empty)
or
join request timed out for principal bb96c64c0902500
or
ctrl ack (14): unexpected source bb96c64c0902500
The only thing which resolves this is completely deleting the image (so far) and trying again - which is not production friendly.
The other issue I see is that migrations take quite literally forever - even though there is no data in the cluster at all.
If you are using network_mode of host, what is the config for each of the Aerospike nodes in the cluster? You may have to configure the service, heartbeat and fabric addresses to use the right interface to communicate with the other server.
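As a rough illustration of what that could look like, here is a minimal sketch of the network stanza in aerospike.conf for one node, assuming mesh heartbeats. The IP addresses are hypothetical placeholders (10.0.0.1 for this server, 10.0.0.2 for the other one) and the ports are the usual defaults; adjust both to your environment, and swap the two addresses on the second node.

```
network {
    service {
        address any
        port 3000
        # Hypothetical: the address clients should use to reach this node
        access-address 10.0.0.1
    }

    heartbeat {
        mode mesh
        # Hypothetical: bind heartbeats to the interface facing the other server
        address 10.0.0.1
        port 3002
        # Hypothetical: the other server's heartbeat address and port
        mesh-seed-address-port 10.0.0.2 3002
        interval 150
        timeout 10
    }

    fabric {
        # Hypothetical: same interface as the heartbeat
        address 10.0.0.1
        port 3001
    }

    info {
        port 3003
    }
}
```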
Please see:
Ok, from within the container, are you able to access /dev/md1? I suspect not; this seems like an issue with hardware.c and docker containers. Out of curiosity, are you running the container with --privileged? If not, could you check whether the issue behaves the same with this flag? That may explain why we haven’t seen this internally.
We collect stats on the device to monitor the device's health; this warning says that we are unable to resolve this device. It should be benign to Aerospike functionality. I’ll discuss this with the maintainers of hardware.c.
Ewan, could you run the following two commands inside the container and share their output?
cat /proc/mounts
ls -l /dev
It does seem like /proc/mounts properly indicates that /opt/aerospike/data resides on /dev/md1. But it also seems like /dev does not contain a device node for md1. Hence I’d like to double-check the contents of /proc/mounts as well as what’s in /dev inside the container.
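The same double-check can be scripted. Below is a minimal Python sketch, not anything Aerospike ships: it looks up which device /proc/mounts reports for a mount point and then tests whether a block device node actually exists at that path. The function names device_for_mount, is_block_device and check_mount are made up for this illustration, and the parsing assumes mount points without spaces.

```python
import os
import stat


def device_for_mount(mounts_text, mount_point):
    """Return the source device /proc/mounts lists for mount_point, or None."""
    device = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Format: <device> <mount point> <fs type> <options> <dump> <pass>
        if len(fields) >= 2 and fields[1] == mount_point:
            device = fields[0]  # a later entry shadows an earlier one
    return device


def is_block_device(path):
    """True if path exists and is a block device node."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        return False


def check_mount(mount_point, mounts_path="/proc/mounts"):
    """Describe whether the device backing mount_point has a node in /dev."""
    with open(mounts_path) as f:
        device = device_for_mount(f.read(), mount_point)
    if device is None:
        return f"{mount_point}: not listed in {mounts_path}"
    if is_block_device(device):
        return f"{mount_point}: backed by {device}, device node present"
    return f"{mount_point}: backed by {device}, but no such block device node"
```

Running print(check_mount("/opt/aerospike/data")) inside the container should report the third case if the situation is as suspected, i.e. the mount is listed but /dev has no node for md1.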