Record size limit with many LDT Bins


#1

Hi,

We are using Large Data Type (LDT) bins to store strings (user IDs, 36 bytes each) on a daily basis. Each day a new bin is created and users are added to its LList. Example:

20151020 => LList: { user1, user2, user3 .. user n }
20151021 => LList: { user1, user4, user10 .. user n }

Each day can potentially hold millions of users. Our understanding of Large Lists is that they are not bound by their parent record's size limit, so this should not be a problem here. The namespace is configured with a 128K write block size, and the set that stores those users contains another 5 bins, all of which are quite small.
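
For reference, the daily insert is roughly the following (a simplified sketch using the Java client's LargeList API; the namespace, set, key and bin names are placeholders, and exact method signatures may vary by client version):

    // Each calendar day maps to one LDT bin (e.g. "20151020") on the same parent record;
    // user IDs are appended to that day's LList. All names below are placeholders.
    import com.aerospike.client.AerospikeClient;
    import com.aerospike.client.Key;
    import com.aerospike.client.Value;
    import com.aerospike.client.large.LargeList;
    import com.aerospike.client.policy.WritePolicy;

    public class DailyLListInsert {
        public static void main(String[] args) {
            AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
            Key key = new Key("test", "daily_users", "all-users");

            // One LDT bin per day; after ~20 days the parent record carries ~20 LDT bins.
            LargeList todaysList = client.getLargeList(new WritePolicy(), key, "20151020");
            todaysList.add(Value.get("user1"));

            client.close();
        }
    }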

This particular case requires a variable number of LDT bins, but the plan is to store up to 6 months' worth of user data (~180 bins). However, after about 20 days (i.e. 20 LDT bins) we began seeing errors in the logs:

1427: LDT-TOP Record Update Error

We initially thought that Aerospike was failing to update (e.g. add users to) the LList, which turned out to be incorrect: it was actually failing to add a new LDT bin. We then tried to add a normal bin (String), which also failed, but this time with a different error message (AEROSPIKE_ERR_RECORD_TOO_BIG).

Has anyone come across similar issues with LDTs? We are obviously hitting some sort of limit, but it is not clear to us whether our approach is wrong or whether the product has limitations/bugs relating to a large number of LDT bins.

The version we are running is “Aerospike Community Edition build 3.5.15”.


#2

What is your write-block-size (a.k.a. “max record size”) set to? The default should be 128 KB. LDTs can grow indefinitely, but every record is limited to that block size. The general overhead for an LDT bin was around 220 bytes when I last measured it (all the metadata seems to be stored in the parent record). However, there is a ‘compact mode’ for LLISTs (a.k.a. the “root node”) that stores everything within the record itself below a certain threshold, and I think this is what is causing your problem. I believe this is a configurable setting (LDT configuration via UDF), but you would need to take a deep dive into the code to find it, I guess…

However, your design will eventually hit the max record size no matter what. The way to go is to create more records. The easiest option would be 1 record per LDT, but something like 1 record per week (max 7 LDT bins per record) is probably the best fit for the job.
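
Something along these lines, as a rough sketch only (the key scheme, namespace and set names are made up, not taken from your setup):

    // One parent record per ISO week; each weekly record then holds at most 7 daily LDT bins.
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.time.temporal.IsoFields;

    import com.aerospike.client.Key;

    public class WeeklyKeyScheme {
        // e.g. 2015-10-20 -> Key("test", "daily_users", "users-2015-W43")
        static Key weeklyKey(LocalDate day) {
            int year = day.get(IsoFields.WEEK_BASED_YEAR);
            int week = day.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
            return new Key("test", "daily_users", String.format("users-%d-W%02d", year, week));
        }

        // e.g. 2015-10-20 -> "20151020", the name of that day's LDT bin inside the weekly record
        static String dailyBinName(LocalDate day) {
            return day.format(DateTimeFormatter.BASIC_ISO_DATE);
        }
    }

That keeps your per-day bin naming while capping the number of LDT bins per parent record at 7.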

Note: if you have high throughput (peaks of >100 ops/sec on LDT bins), you should measure whether parallel operations on multiple LDTs within the same record interfere with each other (the lock is shared with the parent record). I hope this helps with your LDT setup, because it’s one of the coolest features we’ve encountered so far.

Cheers, Manuel


#3

Thanks, that makes sense given what is outlined here: http://www.aerospike.com/docs/architecture/llist_details.html. We are going to run some tests and see if the problem is fixed by adding a Lua module which updates those settings.

EDIT: Apparently only a limited number of settings can be changed through a UDF, as per the comments in the code (ldt/lib_llist). The Lua example was taken from http://www.aerospike.com/docs/guide/ldt_advanced.html, which by the way fails because the Lua module has apparently been renamed (ldt/settings_llist). From the comments in the code:

The values we expect to see in the configMap will be one or more of the following values. For any value NOT seen in the map, we will use the published default value.

  • MaxObjectSize :: the maximum object size (in bytes).
  • MaxKeySize :: the maximum Key size (in bytes).
  • WriteBlockSize:: The namespace Write Block Size (in bytes)
  • PageSize :: Targetted Page Size (8kb to 1mb)
  • KeyUnique :: If key is unique default: Unique

It seems that the best option to overcome this limitation is to change the primary key and distribute each day's users across multiple records.
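
Roughly what we have in mind (a sketch only; the bucket count, namespace, set and key format are hypothetical):

    // Spread one day's users over a fixed number of records by hashing the user ID into a
    // bucket, so no single parent record has to absorb millions of entries.
    import com.aerospike.client.Key;

    public class DailyUserBuckets {
        static final int BUCKETS = 64; // hypothetical shard count

        // Maps ("20151020", userId) to one of 64 record keys such as "20151020:42".
        static Key bucketKey(String day, String userId) {
            int bucket = Math.floorMod(userId.hashCode(), BUCKETS);
            return new Key("test", "daily_users", day + ":" + bucket);
        }
    }

With a fixed bucket count we can still enumerate all of a day's records deterministically when reading them back.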

Thanks for the help.


#4

In case you are still interested in changing the threshold (I would not go that way): I think it’s possible to set “LS.Threshold” to 0, based on the comments around line 575 of https://github.com/aerospike/aerospike-lua-core/blob/master/src/ldt/lib_llist.lua. Not sure how, though. The value seems to represent an item count, but it would be weird if you were hitting the record max size with just 10 elements per LList: given your 36-byte entries, that is only about 360 bytes of data per bin (worst case), and even with 20 bins of 360 B data + 220 B metadata that adds up to roughly 20 × 580 B ≈ 11.6 KB, far below a 128 KB block.

Aerospike is hiding some undocumented gems in its source code, though keep in mind that those features might be silently changed or buggy, like non-unique keys for LLists. Anyway, reducing the LDT count per record seems like the best fit from what I understood about your deployment. You might also have hit a bug in the LLIST implementation, so it could be worth creating a minimal failing example and asking one of the LDT maintainers to comment…

Cheers, Manuel


#5

@danielwunderlich and @ManuelSchmidt:

Thank you for posting about LDTs in our forum. Please see the LDT Feature Guide for current LDT recommendations and best practices.


#6

@danielwunderlich and @ManuelSchmidt:

Effective immediately, we will no longer actively support the LDT feature and will eventually remove the API. The exact deprecation and removal timeline will depend on customer and community requirements. Instead of LDTs, we advise that you use our newer List and SortedMap APIs, which are now available in all Aerospike-supported clients at the General Availability level. Read our blog post for details.
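
As an illustration of the replacement path, appending a value to an ordinary list bin with the Java client looks roughly like this (namespace, set, key and bin names are placeholders):

    // Append a user ID to a plain list bin instead of an LDT LList.
    import com.aerospike.client.AerospikeClient;
    import com.aerospike.client.Key;
    import com.aerospike.client.Value;
    import com.aerospike.client.cdt.ListOperation;
    import com.aerospike.client.policy.WritePolicy;

    public class ListApiExample {
        public static void main(String[] args) {
            AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
            Key key = new Key("test", "daily_users", "20151020:0");

            // ListOperation.append adds the value server-side without rewriting the whole list.
            client.operate(new WritePolicy(), key,
                    ListOperation.append("users", Value.get("user1")));

            client.close();
        }
    }

Note that an ordinary list lives inside its record, so the write-block-size limit still applies; very large daily sets still need to be split across multiple records, as discussed above.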


#7

Hello @Mnemaudsyne, I have an object with 14 bins. After some research, it seems I can improve performance by using the Map type (grouping the 14 bins into one bin with 14 key-value pairs). But the new bin cannot hold all 14 bins' worth of data; I think the record value has a size limit.

This is my original object:

QoSNetworkSummary { id, country, city, ... } : 14 bins

When I wrote all the object data to Aerospike:

id     country   city   ...
1      VN        HN     ...

After switching to the Map type, I want it to look like this:

data (bin name)
{"id":1, "country":"VN", "city":"HN", ...}

But instead I receive:

data
{"id":1, "country""VN","city

The data is incomplete.
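
This is roughly how I write it (a simplified sketch; only a few of the 14 fields are shown, and the names and values are examples rather than my real data):

    // Write all 14 fields as one map value in a single bin named "data".
    import java.util.HashMap;
    import java.util.Map;

    import com.aerospike.client.AerospikeClient;
    import com.aerospike.client.Bin;
    import com.aerospike.client.Key;
    import com.aerospike.client.policy.WritePolicy;

    public class MapBinExample {
        public static void main(String[] args) {
            AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
            Key key = new Key("test", "qos_summary", 1);

            Map<String, Object> summary = new HashMap<>();
            summary.put("id", 1);
            summary.put("country", "VN");
            summary.put("city", "HN");
            // ... the remaining 11 fields follow the same pattern ...

            // A Java Map passed to Bin is stored as a single Aerospike map value.
            client.put(new WritePolicy(), key, new Bin("data", summary));

            client.close();
        }
    }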

Please tell me how to fix this issue!

Thanks