Simultaneous scans of different sets from one namespace (3.6.0)


#1

Hi!

Is it possible to have scans of several different sets from one namespace running simultaneously? Maybe it is just a configuration problem on my side. Please take a look.

My situation:

I have a namespace called “ssd” with two sets “1” and “2”. I want to scan all records from ssd.1 in one thread and all records from ssd.2 in another thread using java client and do it in a parallel fashion.

The problem is the “in a parallel fashion” part. When I start both scans, only one of them actually runs and transfers data to the client; the second scan waits for the first to complete and only then starts doing its job.

I have aerospike server 3.3.19 and java client 3.0.31. I’m doing scans with the default scanning policy.
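Roughly, my code is structured like the sketch below: one thread per set, each doing a full scan. The Aerospike calls are replaced with a placeholder so the sketch is self-contained; the real `scanSet` body would be a `client.scanAll(new ScanPolicy(), "ssd", setName, callback)` call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelScans {
    // Placeholder for the real scan of one set. With the Aerospike Java
    // client this would be client.scanAll(new ScanPolicy(), "ssd", setName,
    // callback), with the callback handling each record as it arrives.
    static String scanSet(String setName) {
        return "ssd." + setName;
    }

    // Launch one scan per set on its own thread and wait for all of them.
    static List<String> runScans(String... sets)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(sets.length);
        List<Future<String>> futures = new ArrayList<>();
        for (String set : sets) {
            futures.add(pool.submit(() -> scanSet(set)));
        }
        List<String> scanned = new ArrayList<>();
        for (Future<String> f : futures) {
            scanned.add(f.get());
        }
        pool.shutdown();
        return scanned;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runScans("1", "2"));
    }
}
```

Both threads start their scans at the same time on the client side, but only one of them streams records at any given moment.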

My configuration:

allow-inline-transactions true
auto-dun false
auto-undun false
batch-max-requests 5000
batch-priority 200
batch-threads 4
dump-message-above-size 134217728
enable-fastpath true
enable-security false
enable-xdr false
fabric-port 3001
fb-health-bad-pct 0
fb-health-good-pct 50
fb-health-msg-per-burst 0
fb-health-msg-timeout 200
heartbeat-interval 100
heartbeat-mode multicast
heartbeat-port 9918
heartbeat-protocol v2
heartbeat-timeout 10
info-threads 16
max-msgs-per-type -1
memory-accounting false
microbenchmarks false
migrate-max-num-incoming 256
migrate-priority 40
migrate-read-priority 10
migrate-read-sleep 500
migrate-rx-lifetime-ms 60000
migrate-threads 4
migrate-xmit-hwm 10
migrate-xmit-lwm 5
migrate-xmit-priority 40
migrate-xmit-sleep 500
network-info-port 3003
nsup-period 480
nsup-queue-escape 40
nsup-queue-hwm 10
nsup-queue-lwm 1
nsup-startup-evict true
paxos-max-cluster-size 32
paxos-protocol v3
paxos-recovery-policy manual
paxos-retransmit-period 5
paxos-single-replica-limit 1
pidfile /var/run/aerospike/asd.pid
privilege-refresh-period 300
prole-extra-ttl 0
proto-fd-idle-ms 60000
proto-fd-max 50000
proxy-hist-track-back 1800
proxy-hist-track-slice 10
proxy-hist-track-thresholds 1,4,8,16,32,64,128,256,512,1024,2048,4096,8192
query-batch-size 100
query-bufpool-size 256
query-hist-track-back 1800
query-hist-track-slice 10
query-hist-track-thresholds 1,4,8,16,32,64,128,256,512,1024,2048,4096,8192
query-in-transaction-thread 0
query-job-tracking false
query-long-q-max-size 500
query-priority 10
query-rec-count-bound 4294967295
query-req-in-query-thread 0
query-req-max-inflight 100
query-short-q-max-size 500
query-sleep 1
query-threads 6
query-threshold 10
query-worker-threads 15
query_rec_count-hist-track-back 1800
query_rec_count-hist-track-slice 10
query_rec_count-hist-track-thresholds 1,4,8,16,32,64,128,256,512,1024,2048,4096,8192
reads-hist-track-back 1800
reads-hist-track-slice 10
reads-hist-track-thresholds 1,4,8,16,32,64,128,256,512,1024,2048,4096,8192
replication-fire-and-forget false
report-authentication-sinks 0
report-sys-admin-sinks 0
report-user-admin-sinks 0
report-violation-sinks 0
respond-client-on-master-completion true
reuse-address true
scan-priority 200
scan-sleep 1
service-address 0.0.0.0
service-port 3000
sindex-data-max-memory 18446744073709551616
sindex-populator-scan-priority 3
snub-nodes false
stop-writes-noxdr false
storage-benchmarks false
syslog-local -1
ticker-interval 10
transaction-duplicate-threads 0
transaction-max-ms 1000
transaction-pending-limit 30
transaction-queues 12
transaction-repeatable-read false
transaction-retry-ms 1000
transaction-threads-per-queue 3
udf-hist-track-back 1800
udf-hist-track-slice 10
udf-hist-track-thresholds 1,4,8,16,32,64,128,256,512,1024,2048,4096,8192
udf-runtime-gmax-memory 18446744073709551616
udf-runtime-max-memory 18446744073709551616
use-queue-per-device false
write-duplicate-resolution-disable false
writes_master-hist-track-back 1800
writes_master-hist-track-slice 10
writes_master-hist-track-thresholds 1,4,8,16,32,64,128,256,512,1024,2048,4096,8192
writes_reply-hist-track-back 1800
writes_reply-hist-track-slice 10
writes_reply-hist-track-thresholds 1,4,8,16,32,64,128,256,512,1024,2048,4096,8192
xdr-delete-shipping-enabled true
xdr-nsup-deletes-enabled false

Thanks!


#2

As of now, if scans arrive at exactly the same time, we parallelize them. When a scan comes in, we schedule all of its work up front; once that work is scheduled, any new scan is queued behind the first one. That is why you are seeing this behavior.

I understand the need to support multiple scans in parallel. I have filed an internal ticket to make this change.
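To illustrate, here is a toy model of the scheduling described above, assuming a single FIFO work queue. All names here are hypothetical and this is not the server's actual implementation, just the shape of the behavior:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ScanQueueModel {
    // Toy model: one shared FIFO work queue. Each scan enqueues all of its
    // work units up front, so a scan that arrives second only starts after
    // every unit of the first scan has been processed.
    static List<String> schedule(int unitsPerScan) throws InterruptedException {
        List<String> order = new ArrayList<>();
        ExecutorService queue = Executors.newSingleThreadExecutor();
        for (String scan : new String[] {"scan-1", "scan-2"}) {
            for (int u = 0; u < unitsPerScan; u++) {
                String unit = scan + "/unit-" + u;
                queue.submit(() -> order.add(unit));
            }
        }
        queue.shutdown();
        queue.awaitTermination(10, TimeUnit.SECONDS);
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        // All of scan-1's units drain before any of scan-2's run.
        System.out.println(schedule(3));
    }
}
```

With true parallel scans, units from both scans would instead be interleaved on the queue.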


#3

BTW, a quick update… work on this has started and is in progress. Expect it soon. Keep a watch on the server release notes.


#4

Any update on when this will be available? Thx


#5

No specific ETA yet, but the code just made it into a dev branch… so it is definitely getting there.


#6

Just to let you know, we have just released Aerospike 3.6.0, which includes scan improvements and the ability to run concurrent scan jobs, plus many other fixes and features. You can read more about it in the Aerospike release notes and grab the latest version.