Hi. I have a cluster that had some node restarts, and it has been rebalancing and reindexing for a few hours now. There's only one server left reindexing, and until it completes I can't perform queries.
I've already raised the sindex-builder-threads value to 16 (on an 8 vCPU server), but it hasn't had any noticeable impact. What is the maximum safe value for that variable? And do you have any other recommendations to speed up the process?
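For reference, this is roughly how I changed it; a sketch, assuming a server version that still exposes sindex-builder-threads as a dynamic service parameter:

    # Dynamic change on the running node:
    asinfo -v "set-config:context=service;sindex-builder-threads=16"

    # Static equivalent in aerospike.conf so it survives a restart:
    service {
        sindex-builder-threads 16
    }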
If you allow migrations to happen instead of reading from disk, you can perform queries while migrations are going on, with best-effort results… For your particular problem, though, it's most likely a bottleneck on the disk. What does 'iostat -xky 1 10' look like, and what about your load average ('sar -q')?
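Something along these lines is what I'd run; a sketch, and the exact column names may vary a bit with your sysstat version:

    iostat -xky 1 10   # watch %util and await on the data disk; sustained 100% %util means the disk is saturated
    sar -q 1 10        # runq-sz and ldavg-1 show whether work is piling up behind the IO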
So you're showing high %iowait and %util is at 100% on disk sda… I'm assuming this is the disk Aerospike reads from?
This is happening because you performed a cold start and either don't have your namespace set to cold-start-empty or didn't blank the disk out. In some cases that's desired… but if you have replicated data and only lost one node, or are doing rolling restarts, you could technically empty the drive and let the Aerospike nodes re-replicate the data to that node through migrations. That's what I mean. In some cases I've actually seen replication happen faster than secondary index building, but we had to test that to decide what was best for us… we also don't have a use case for reading from disk, since we can't tolerate zombie records and don't use tombstones. Long story short though, for the problem you're describing and the iostat output, I think your disk IO is simply maxed out. There isn't much that can be done there.
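For reference, cold-start-empty lives in the namespace's storage-engine stanza; a minimal sketch with placeholder device and sizes, and the obvious caveat that it discards whatever is on the disk at cold start and relies entirely on the other replicas:

    namespace ns1 {
        replication-factor 2
        memory-size 16G                  # placeholder
        storage-engine device {
            device /dev/sdb              # placeholder device
            cold-start-empty true        # ignore data on disk at cold start; repopulate via migrations
        }
    }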
I originally had 3 nodes with 3 disks (one disk per node). Then the cluster went down, so I emptied out one disk in order to get the DB back up fast, knowing that with a replication factor of 2 I'd have no data loss as long as I kept the other 2 disks.
Then I added a 4th node for overall capacity, and one of the 2 remaining original disks has already finished building its index.
So basically I'm left with this 3rd disk's indexes being rebuilt, but I guess that to be able to zero it out with no data loss, I'd have to wait until migrations between the 4 nodes are done, right?
This is the info:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace Node Avail% Evictions Master Replica Repl Stop Pending Disk Disk HWM Mem Mem HWM Stop
. (Objects,Tombstones) (Objects,Tombstones) Factor Writes Migrates Used Used% Disk% Used Used% Mem% Writes%
. . . . . . . . (tx,rx) . . . . . . .
ns1 node1:3000 53 0.000 (43.088 M, 0.000) (18.100 M, 0.000) 2 false (2.460 K, 1.669 K) 80.627 GB 41 50 10.302 GB 65 60 90
ns1 node2:3000 56 0.000 (19.014 M, 0.000) (32.490 M, 0.000) 2 false (2.933 K, 1.569 K) 81.017 GB 41 50 7.580 GB 48 60 90
ns1 node3:3000 90 0.000 (9.887 M, 0.000) (8.800 M, 0.000) 2 false (917.000, 2.103 K) 19.721 GB 10 50 2.543 GB 16 60 90
ns1 node4:3000 84 0.000 (22.434 M, 0.000) (9.125 M, 0.000) 2 false (790.000, 1.759 K) 30.366 GB 16 50 3.909 GB 25 60 90
ns1 0.000 (94.424 M, 0.000) (68.515 M, 0.000) (7.100 K, 7.100 K) 211.731 GB 24.333 GB
Number of rows: 5
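To answer my own question about when it's safe to zero out that disk, my plan is to just watch the pending migrates drain to zero; a sketch, assuming these stat names exist on my server version ('ns1' is my namespace):

    asadm -e "info namespace"                        # wait for Pending Migrates (tx,rx) to reach (0.000, 0.000)
    asinfo -v "namespace/ns1" -l | grep -i migrate   # migrate_tx/rx_partitions_remaining should both be 0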
Now that you mention it, I'm running on Docker Swarm, so AS can't access devices directly and I have to use the 'file' storage engine. Would it perform better with the 'device' storage engine, even though it's the same kind of disk?
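In case it matters for the comparison, this is roughly the difference in configuration; paths and sizes are placeholders, and the raw-device variant is hypothetical for me until I can pass a block device into the container:

    # What I have now under Docker Swarm: a file on a mounted volume
    storage-engine device {
        file /opt/aerospike/data/ns1.dat   # placeholder path
        filesize 200G                      # placeholder size
    }

    # What a raw-device setup would look like
    storage-engine device {
        device /dev/sdb                    # placeholder raw device
    }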