Actual limits of a SortedMap CDT

cdt
map

#1

Hi there.

We used to store very large sets of negative numbers in Large Lists, but due to a known bug we had to migrate to a new SortedMap CDT.

I wrote a testing tool that populated a SortedMap with over 100K items and validated the correctness of its output on rank operations, then proceeded to convert our dataset from LLISTs to SortedMaps. After adding about 25M items to SortedMap bins located in different records of the same set, I started to see i/o timeout responses on put operations. So I set up a clean Aerospike installation and was able to reproduce the issue, which occurred at the same threshold of 25M items.

Is this a known limitation that I’ve hit, or is it a bug?

Best regards, Gregory


#2

25M records? Does each record contain a map-type bin? How many key-value pairs does each map have? Which map policy is used?


#3

There is a single namespace containing one set with plenty of records. On the first server I tried, the set already had plenty of records (no CDT though). The second server, which I used to reproduce the issue, was a brand new Docker installation.

I store one map per record due to record size limitations, and the map is stored under the same bin name every time. The max number of items (key-value pairs) in a map is limited to 100K (to avoid hitting the record limit of ~176K), but this limit was never reached during the test. 25M items were added to maps located in different records in this set (3.3M records were created during the second test on a clean installation).

Map policy: key ordered, write mode is update.

Keys in maps are negative 64-bit integers, values are nil


#4

Single-node cluster? What are the RAM and disk sizes? What is the namespace config? Is the storage engine memory, disk, or both? Are you running out of capacity? Is write-block-size 1MB?


#5

Yes, single node. RAM usage is at 650MB, slightly increases when I try adding more values to a map. Disk usage is at 176MB. CPU is at 100% load when new items are added, but that’s expected.

Namespace config (mostly defaults):

namespace test {
	replication-factor 2
	memory-size 1G
	default-ttl 14d
	ldt-enabled true

	storage-engine device {
		file /opt/aerospike/data/api.dat
		filesize 4G
		data-in-memory true # Store data in memory in addition to file.
	}
}

write-block-size is default (1MB)


#6

Can you post output of:

$ grep thr_nsup /var/log/aerospike/aerospike.log

I suspect you are hitting the RAM high-water mark and evicting records.


#7

I did some more debugging on the issue, and it turned out that I was initially wrong about the cause of these timeouts. It’s not about the total number of key-value pairs across all the maps, but rather about how many items a single map holds. Sorry about all the confusion.

When a map is already big and you try to add more items to it than it can actually fit, the add_items operation fails with a timeout error instead of a more specific error.
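As a rough back-of-the-envelope check of what "fit" might mean here, the sketch below estimates the MessagePack-encoded size of a map with integer keys and nil values and compares it with the 1MB write-block-size from the config above. It assumes the stored map is close to plain MessagePack and ignores record overhead and any index kept for sorted maps, so the numbers are only an illustration. The helper names are hypothetical, and nothing in this thread confirms that this is the actual cause.

```python
WRITE_BLOCK_SIZE = 1024 * 1024  # 1MB default mentioned in the thread


def msgpack_int_size(n: int) -> int:
    """Bytes used by the most compact MessagePack encoding of integer n."""
    if -32 <= n <= 127:
        return 1  # positive / negative fixint
    if -128 <= n <= 255:
        return 2  # int 8 / uint 8
    if -32768 <= n <= 65535:
        return 3  # int 16 / uint 16
    if -2**31 <= n <= 2**32 - 1:
        return 5  # int 32 / uint 32
    return 9      # int 64 / uint 64


def estimated_map_size(keys) -> int:
    """Estimated size of a MessagePack map {key: nil} (assumption: no extra index)."""
    keys = list(keys)
    if len(keys) <= 15:
        header = 1  # fixmap
    elif len(keys) <= 65535:
        header = 3  # map 16
    else:
        header = 5  # map 32
    # Each entry is the encoded key plus one byte for the nil value.
    return header + sum(msgpack_int_size(k) + 1 for k in keys)


def fits_in_write_block(keys, block_size: int = WRITE_BLOCK_SIZE) -> bool:
    return estimated_map_size(keys) <= block_size


def sample_keys(count: int):
    """Hypothetical data: large negative keys that need the 9-byte int 64 form."""
    return [-(10**12 + i) for i in range(count)]


if __name__ == "__main__":
    for count in (100_000, 200_000):
        size = estimated_map_size(sample_keys(count))
        print(count, size, fits_in_write_block(sample_keys(count)))
```

Under these assumptions each entry costs 10 bytes, so 100K such keys stay just under one write block while 200K do not. Keys of smaller magnitude encode more compactly, so the real threshold depends on the data.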

It’s extremely easy to reproduce: create a record with a map bin, use add_items to add 100K items to it, then try adding another 100K items.
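A minimal sketch of those reproduction steps, assuming the Aerospike Python client and its `map_put_items` call (newer client versions may prefer `operate()` with map operation helpers instead). The function takes the client as a parameter so the sketch itself does not depend on a running server. The names `make_batch` and `reproduce`, the bin name, and the key values are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def make_batch(start: int, count: int) -> Dict[int, None]:
    """Build `count` distinct negative integer keys, each mapped to nil (None)."""
    return {-(start + i): None for i in range(1, count + 1)}


def reproduce(client: Any, key: Tuple[str, str, str], bin_name: str,
              map_policy: Dict[str, Any],
              batch_size: int = 100_000) -> List[Tuple[int, str]]:
    """Add two batches of `batch_size` items to one map bin and record each outcome."""
    outcomes = []
    for batch_no in range(2):
        items = make_batch(batch_no * batch_size, batch_size)
        try:
            # Assumed client call: map_put_items(key, bin, items, map_policy)
            client.map_put_items(key, bin_name, items, map_policy)
            outcomes.append((batch_no, "ok"))
        except Exception as exc:  # report whatever error the client raises
            outcomes.append((batch_no, f"{type(exc).__name__}: {exc}"))
    return outcomes


# Usage against a real server (assumed host, namespace, and set):
#   import aerospike
#   client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
#   policy = {"map_order": aerospike.MAP_KEY_ORDERED,
#             "map_write_mode": aerospike.MAP_UPDATE}
#   print(reproduce(client, ("test", "demo", "big-map"), "m", policy))
```

If the behaviour described above holds, the first batch should report "ok" and the second should report the timeout error.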