SubRec Open Failure


#1
Sep 09 2015 20:22:02 GMT: WARNING (udf): (/opt/aerospike/sys/udf/lua/ldt/lib_llist.lua::4083) [ERROR]<ldt_common_2014_12_20.A:openSubRec()> SubRec Open Failure: Digest(77 95 F0 08 24 F8 7C 1F E6 DA EA E7 2E 55 00 00 00 00 00 00) Parent Digest(NULL)

I get other messages with similarly strange digests:

Sep 09 2015 23:30:48 GMT: WARNING (udf): (/opt/aerospike/sys/udf/lua/ldt/lib_llist.lua::3835) [ERROR]<ldt_common_2014_12_20.A:closeSubRecDigestString()> Error closing SubRec: rc(-2) Digest(29 AA DD 26 84 AC 97 1F 7D 9C AE EE DA 58 00 00 00 00 00 00)

Is there a way to tell what this is actually pointing at? And a good way to diagnose these failures?
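
One way to start looking at these messages is to pull the sub-record digest out of each log line and examine its bytes. The sketch below is only an illustration, not an Aerospike tool: the helper names (`parse_digest`, `trailing_zero_bytes`) are hypothetical, and the regular expression assumes the `Digest(XX XX ...)` layout seen in the two log lines above. It shows that both digests are 20 bytes long and end in six zero bytes; it does not by itself explain why.

```python
import re

# Matches "Digest(77 95 ...)" but skips "Parent Digest(...)":
# the negative lookbehind rejects a match preceded by "Parent ".
DIGEST_RE = re.compile(r"(?<!Parent )Digest\(((?:[0-9A-Fa-f]{2} ?)+)\)")


def parse_digest(line):
    """Return the sub-record digest in a log line as bytes, or None if absent."""
    match = DIGEST_RE.search(line)
    if match is None:
        return None
    # bytes.fromhex ignores the spaces between the hex pairs.
    return bytes.fromhex(match.group(1))


def trailing_zero_bytes(digest):
    """Count how many 0x00 bytes sit at the end of the digest."""
    return len(digest) - len(digest.rstrip(b"\x00"))


# Tail of the first log line quoted in this post.
SAMPLE = (
    "SubRec Open Failure: Digest(77 95 F0 08 24 F8 7C 1F E6 DA EA E7 2E 55 "
    "00 00 00 00 00 00) Parent Digest(NULL)"
)

if __name__ == "__main__":
    digest = parse_digest(SAMPLE)
    print(len(digest), trailing_zero_bytes(digest), digest.hex())
```

Running the same two helpers over a whole log file and tallying the results would show whether every failing digest shares the zero-padded tail or only some of them do.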


#2

Dougluce,

What version are you running? And could you tell us a little bit about your setup?

– R


#3

It’s 3.6.0-1 on el6. I’ve a cluster of eight c3.2xlarges.

I’m using llists to queue ads for individual domains on a high-volume bidding server. Items added to the list vary in size from 400 bytes to 15k, lists from 10-10000 items long. The 8-node cluster is hit 50k-80k times per second, insertions maybe 4k-5k/second.

I’m using llist.add(), llist.remove_all() for most manipulations.

What more can I tell you?


#4

dougluce,

Did you load the data directly on 3.6.0-1, or were you running a build older than 3.5.15 and then did a rolling restart to 3.6.0-1?

We have identified a bug in the server that can corrupt the LDT record links in certain cases, if the records are touched while migrations are in a certain phase for the partition to which the record belongs.

– R


#5

The data was loaded directly on 3.6.0-1. Right now, I’m restarting all the servers every 3 hours to mitigate this issue. All the data disappears on restart and is reloaded soon thereafter.

Would taking down the replication-factor from 3 to 1 make any difference?


#6

This is not a function of the replication factor. Are you running with data-in-memory set to true?

I will try this out.

– R


#7

Hi Raj,

Yes, this is with data in memory.

I installed the new 3.6.1 version this morning on the cluster. Unfortunately, these errors are still showing up:

FAILURE/opt/aerospike/sys/udf/lua/ldt/ldt_common.lua:751: 1422:LDT-Sub Record Open Error

#8

@dougluce,

Thank you for posting about LDTs in our forum. Please see the LDT Feature Guide for current LDT recommendations and best practices.


#9

@dougluce,

Effective immediately, we will no longer actively support the LDT feature and will eventually remove the API. The exact deprecation and removal timeline will depend on customer and community requirements. Instead of LDTs, we advise that you use our newer List and SortedMap APIs, which are now available in all Aerospike-supported clients at the General Availability level. Read our blog post for details.