How to check and speed up Secondary index creation or re-building?

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Problem Description

This article covers how to check and speed up secondary index creation and re-building.

Explanation

Secondary indices are built by scanning and indexing namespace’s record values based on the definition of the index, which lives in the smd (system metadata) subsystem. When a secondary index is created on a running server, it will be scanning the records (reducing the primary index) and inspect the values of the relevant records in order to create the secondary index. Unless the index is fully built, it will be in Write Only (WO) mode, this can be seen by below command on aql, and after fully built and available for use, it will be in Read Write (RW) mode. Secondary indexes can only be queried when they are in read-write state (RW). If not all sindex are in the RW state and a query is made against the cluster, either an error with no data is returned or inconsistent data can be returned to the caller.

Additionally, an Aerospike service restart on a cluster having secondary indices will trigger a namespace scan on the node to populate the secondary index. This impacts the overall time taken for the service to be up and running (port 3000 open) and for the node to join the cluster (as compared to a node without secondary index). Note that the node does not join the cluster unless both primary and secondary index are completely loaded in memory.

Note:

Starting with versions 4.7.0.22, 4.8.0.18, 4.9.0.15, 5.0.0.16, 5.1.0.13 and 5.2.0.5, when rebuilding secondary indexes at startup, only sets with secondary indices defined will have their records read, as opposed to all records in the namespace. This can greatly improve start up time if, for example, only a few small sets in the namespace have secondary indices defined.

Checking the status of an index via aql:

aql> show indexes
+-------+------+-----------+----------+-------+------------+------+----------+
| ns    | bin  | indextype | set      | state | indexname  | path | type     |
+-------+------+-----------+----------+-------+------------+------+----------+
| "bar" | "b1" | "NONE"    | "barset" | "WO"  | "idx_b1"   | "0"  | "STRING" |
+-------+------+-----------+----------+-------+------------+------+----------+
[10.0.33.101:3000] 1 row in set (0.007 secs)

+-------+------+-----------+----------+-------+------------+------+----------+
| ns    | bin  | indextype | set      | state | indexname  | path | type     |
+-------+------+-----------+----------+-------+------------+------+----------+
| "bar" | "b1" | "NONE"    | "barset" | "WO"  | "idx_b1"   | "0"  | "STRING" |
+-------+------+-----------+----------+-------+------------+------+----------+
[10.0.33.102:3000] 1 row in set (0.009 secs)

In this above example, the command output indicates that:

  • There are two nodes (two tables display) in the cluster.
  • Both nodes show that index creation is in the write-only state (WO).
  • Please refer to Index Monitoring to check index stat, enable/disable performance histograms for secondary indexes.

Checking the stats of an index via asadm:

Admin> asinfo -v 'sindex/bar/idx_b1' -l
aero_1:3000 (10.0.33.101) returned:
keys=2000000
entries=2000001
ibtr_memory_used=90456440
nbtr_memory_used=62000020
si_accounted_memory=152456460
load_pct=100
loadtime=14690
write_success=2000001
write_error=0
delete_success=0
delete_error=0
stat_gc_recs=0
stat_gc_time=0
query_reqs=0
query_avg_rec_count=0
query_avg_record_size=0
query_agg=0
query_agg_avg_rec_count=0
query_agg_avg_record_size=0
query_lookups=0
query_lookup_avg_rec_count=0
query_lookup_avg_record_size=0
histogram=false 

aero_2:3000 (10.0.33.102) returned:
keys=2000000
entries=2000001
ibtr_memory_used=90507872
nbtr_memory_used=62000020
si_accounted_memory=152507892
load_pct=100
loadtime=15578
write_success=2000001
write_error=0
delete_success=0
delete_error=0
stat_gc_recs=0
stat_gc_time=0
query_reqs=0
query_avg_rec_count=0
query_avg_record_size=0
query_agg=0
query_agg_avg_rec_count=0
query_agg_avg_record_size=0
query_lookups=0
query_lookup_avg_rec_count=0
query_lookup_avg_record_size=0
histogram=false

In this above example, the command output shows index stats from both nodes of the cluster:

  • keys are the unique values in the bin that are indexed (in above example there are 2000000 unique values),
  • entries are the total records that were (eligible and) indexed (in above example there are 2000001 such records).
  • keys and entries can be used to evaluate the cardinality of this index.
  • load_pct is the percentage progress of building the index.
  • loadtime is the time taken for this index to be built in milliseconds.
  • write_success is the number of entries successfully added to this index
  • histogram value is false, indicating that currently query histograms are not enabled on this index.
  • This provides a good picture of the secondary index being built/in progress.
  • Details for the above asinfo command can be found here.

Checking server log messages

To determine the time for sindex building completion, you can grep for “Sindex-ticker done” and check for the elapsed time, (similar log messages can be observed when Aerospike service restarts on a node having secondary indices) :

$> tail -f /var/log/aerospike.log |egrep 'Sindex-ticker|builder-thread'
May 15 2018 15:16:20 GMT: INFO (info): (thr_info.c:2181) Changing value of sindex-builder-threads from 1 to 1
May 15 2018 15:16:20 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=500000 si-mem-used=38162297 progress= 24% est-time=7251 ms
May 15 2018 15:16:26 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1000000 si-mem-used=76245361 progress= 49% est-time=5119 ms
May 15 2018 15:16:34 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1500000 si-mem-used=114191945 progress= 74% est-time=2756 ms
May 15 2018 15:16:38 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=2000000 si-mem-used=152536609 progress= 99% est-time=0 ms
May 15 2018 15:16:38 GMT: INFO (sindex): (secondary_index.c:4440) Sindex-ticker done: ns=bar si=idx_b1 si-mem-used=152536671 elapsed=18053 ms
.
.
May 15 2018 15:16:59 GMT: INFO (info): (thr_info.c:2181) Changing value of sindex-builder-threads from 1 to 4
May 15 2018 15:16:59 GMT: INFO (info): (thr_info.c:3163) config-set command completed: params context=service;sindex-builder-threads=4
May 15 2018 15:17:09 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=500000 si-mem-used=38258555 progress= 24% est-time=11583 ms
May 15 2018 15:17:13 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1000000 si-mem-used=76283356 progress= 49% est-time=7704 ms
May 15 2018 15:17:17 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1500000 si-mem-used=114123589 progress= 74% est-time=3889 ms
May 15 2018 15:17:21 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=2000000 si-mem-used=152549729 progress= 99% est-time=0 ms
May 15 2018 15:17:21 GMT: INFO (sindex): (secondary_index.c:4440) Sindex-ticker done: ns=bar si=idx_b1 si-mem-used=152549791 elapsed=15649 ms
.
.
May 15 2018 15:17:47 GMT: INFO (info): (thr_info.c:2181) Changing value of sindex-builder-threads from 4 to 8
May 15 2018 15:17:47 GMT: INFO (info): (thr_info.c:3163) config-set command completed: params context=service;sindex-builder-threads=8
May 15 2018 15:17:56 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=500000 si-mem-used=38180546 progress= 24% est-time=8280 ms
May 15 2018 15:17:59 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1000000 si-mem-used=76371504 progress= 49% est-time=5639 ms
May 15 2018 15:18:03 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1500000 si-mem-used=114332370 progress= 74% est-time=3252 ms
May 15 2018 15:18:06 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=2000000 si-mem-used=152462089 progress= 99% est-time=0 ms
May 15 2018 15:18:06 GMT: INFO (sindex): (secondary_index.c:4440) Sindex-ticker done: ns=bar si=idx_b1 si-mem-used=152462151 elapsed=12681 ms
.
.
May 15 2018 15:18:16 GMT: INFO (info): (thr_info.c:2181) Changing value of sindex-builder-threads from 8 to 16
May 15 2018 15:18:16 GMT: INFO (info): (thr_info.c:3163) config-set command completed: params context=service;sindex-builder-threads=16
May 15 2018 15:18:25 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=500000 si-mem-used=38594857 progress= 24% est-time=8523 ms
May 15 2018 15:18:28 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1000000 si-mem-used=76382013 progress= 49% est-time=5773 ms
May 15 2018 15:18:32 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=1500000 si-mem-used=114862891 progress= 74% est-time=2959 ms
May 15 2018 15:18:35 GMT: INFO (sindex): (secondary_index.c:4418)  Sindex-ticker: ns=bar si=idx_b1 obj-scanned=2000000 si-mem-used=152484881 progress= 99% est-time=0 ms
May 15 2018 15:18:35 GMT: INFO (sindex): (secondary_index.c:4440) Sindex-ticker done: ns=bar si=idx_b1 si-mem-used=152484943 elapsed=12405 ms

Details for the above log messages can be found using - Common Log Messages

Solution

In order to speed up the rebuilding process, you can make the following changes depending on the server version that you are on.

Note that if the index is created on a running server, you can issue the command dynamically (using the asinfo command within the asadm tool). However, if you want to speed up the secondary index building time on a service restart, you can specify the configuration in the Aerospike configuration file (/etc/aerospike/aerospike.conf) under the service stanza for a static change.

For version 3.6.0 and newer:

To accelerate the sindex loading process, you can increase the Secondary index builder threads (default of 4):

asinfo -v 'set-config:context=service;sindex-builder-threads=8'

Or, for versions 3.8.2 and above directly set in the configuration file under the service stanza:

service {
    ...
    sindex-builder-threads 8
    ...
}

For server versions 3.5.15 and earlier:

Increase sindex-populator-scan-priority to accelerate an index loading process (default of 3).

asinfo -v 'set-config:context=service;sindex-populator-scan-priority=5'

Notes

Do monitor the cluster closely and prefer to increase in small increments so you do not impact your latencies too much.

In the above case, elapsed time was reduced from 18 seconds (with 1 thread) to 12 seconds with 16 threads. This was run on a 2 CPU cores system and the iostat were showing 90 %util and over 50K rsec/s without regular traffic. So your mileage may be different. You should use “iostat -x -n 5” to monitor the disk performance. It’s worth noting that in most cases, the sindex rebuild bottleneck will be the disk max IO speed.

References

More information on the above configurations:

To know more about secondary index, see the following:

Keywords

SINDEX BUILDER THREADS SPEEDUP SECONDARY INDEX

Timestamp

04/03/2018

Great read. Two questions.

  1. What is the impact on SI availability in the remaining cluster when 1) a node leaves, and, 2) subsequently when a node rejoins – in terms of records that migrate to other nodes and the SI creation and update on those records/partitions in the active cluster?
  2. Are SIs built on Replica partitions and records simultaneously as well?