Contents of LDT Not Found

llist
python

#1

I’m seeing the following behavior when using llist in the Python client:

  1. Create an LDT (llist)
  2. Use add() or add_many() to populate it with some data
  3. At this point, the llist object can read the data back.
  4. Create another llist with same key and bin to read the data
  5. Call size() on new LDT.
  6. Size throws “AEROSPIKE_ERR_RECORD_NOT_FOUND”
  7. Data also can’t be read via find_* methods
  8. Some time later (10-60 minutes), repeat step 4. Data can now be read.
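
Steps 1 to 5 can be condensed into a small check, shown here as a minimal sketch. It assumes the client exposes `llist(key, bin)` returning an object with `add_many()` and `size()`, as the steps describe. `FakeClient` and `FakeLList` are hypothetical in-memory stand-ins so the snippet runs without a cluster; they are not the Aerospike API and will not reproduce the delay.

```python
class FakeLList:
    """Hypothetical in-memory stand-in for an llist handle (not the Aerospike API)."""

    def __init__(self, store, key, bin_name):
        # All handles for the same (key, bin) share one underlying list.
        self._items = store.setdefault((key, bin_name), [])

    def add_many(self, values):
        self._items.extend(values)

    def size(self):
        return len(self._items)


class FakeClient:
    """Hypothetical stand-in client; swap in a real connected client to test a cluster."""

    def __init__(self):
        self._store = {}

    def llist(self, key, bin_name):
        return FakeLList(self._store, key, bin_name)


def second_handle_size(client, key, bin_name, values):
    """Write through one llist handle, then read the size through a fresh one."""
    writer = client.llist(key, bin_name)   # steps 1-2: create and populate
    writer.add_many(values)
    reader = client.llist(key, bin_name)   # step 4: new handle, same key and bin
    return reader.size()                   # step 5: expected to equal len(values)


if __name__ == "__main__":
    key = ("my_namespace", "a_set", "example-key")  # example-key is a made-up primary key
    print(second_handle_size(FakeClient(), key, "data", [1, 2, 3]))
```

Against a real cluster, the failure described in step 6 would surface as an exception from the `size()` call on the second handle.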

I’m running a cluster of two machines, under moderate load, using the following policy:

'policies': {
    'timeout' : 1000, # milliseconds
    'commit_level' : aerospike.POLICY_COMMIT_LEVEL_MASTER
}

Due to the commit level, I would anticipate a window where the above scenario happens, but I would expect it to be short, not over ten minutes. Is there something I’m overlooking here?

EDIT: I forgot to mention that doing a standard get on the key does return a record that looks like the following:

(('my_namespace',
  'a_set',
  None,
  bytearray(b'\x8b\x1aX\xc8\x9c\xbbu\x1c\xa8\x99\xf4(\x89\xe1\xf8\xc9@\xb6\x1c(')),
 {'gen': 1, 'ttl': 12743},
 {'LDTCONTROLBIN': None, 'data': None})

#2

Tyler,

Sorry for the late response. If you are still stuck, can you elaborate on what you mean by “Create”?

Aerospike runs with synchronous replication to the replica, and data is always read from the master. So if your second write succeeded (and you are sure that it did), you should always find the record.

– R


#3

Hi Raj,

Thanks for the response! A bug on my side was exacerbating the problem. Still, even after fixing it, I was seeing a few failures every hour. I tracked it down to the following access pattern:

  1. Check if record exists
  2. Remove if it exists
  3. Write new record

Step three was raising intermittent LDTUniqueKeyError exceptions. The exceptions went away when I switched to the following approach:

  1. Remove record
  2. Catch possible RecordNotFound exception
  3. Write new record
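
The second approach can be sketched as below. This is a minimal illustration, not the code from the application: `FakeClient` and the local `RecordNotFound` class are hypothetical in-memory stand-ins so the snippet runs on its own, and `replace_record` is a made-up helper name. The point is that the remove is attempted unconditionally, so no separate existence read sits between the check and the write.

```python
class RecordNotFound(Exception):
    """Hypothetical stand-in for the client's record-not-found exception."""


class FakeClient:
    """Hypothetical in-memory stand-in for a key-value client (not the Aerospike API)."""

    def __init__(self):
        self._records = {}

    def remove(self, key):
        if key not in self._records:
            raise RecordNotFound(key)
        del self._records[key]

    def put(self, key, bins):
        self._records[key] = dict(bins)

    def get(self, key):
        if key not in self._records:
            raise RecordNotFound(key)
        return self._records[key]


def replace_record(client, key, bins):
    """Remove the record if present, ignore a missing record, then write the new one."""
    try:
        client.remove(key)      # step 1: remove without checking first
    except RecordNotFound:
        pass                    # step 2: nothing to remove, which is fine
    client.put(key, bins)       # step 3: write the new record


if __name__ == "__main__":
    client = FakeClient()
    replace_record(client, "k", {"data": [1, 2]})   # record absent: remove is skipped
    replace_record(client, "k", {"data": [3]})      # record present: old one is removed
    print(client.get("k"))
```

With a real client, the `except` clause would need to name that client's own record-not-found exception rather than the stand-in defined here.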

From this, I got the impression that the read for record existence might not be respecting the default read.replica policy. I was seeing some similar errors in the code that reads these records, but I’ve not had a chance to see whether they reproduce after the change above. It’s possible that the record they were looking for just hadn’t been written.

Can you confirm that LDTs use the same read/write policies as other operations?

Thanks! Tyler.


#4

Tyler,

They use the same policy as UDF execute.

– R


#5

Interesting. Any idea why I’d be seeing the inconsistent behavior then?


#6

Sorry for the late response. I actually have no idea why you are observing this behavior. Were you able to get past it?

– R


#7

Yes, using the second approach I outlined above works most of the time.


#8

@Tyler,

Thank you for posting about LDTs in our forum. Please see the LDT Feature Guide for current LDT recommendations and best practices.


#9

@Tyler,

Effective immediately, we will no longer actively support the LDT feature and will eventually remove the API. The exact deprecation and removal timeline will depend on customer and community requirements. Instead of LDTs, we advise that you use our newer List and SortedMap APIs, which are now available in all Aerospike-supported clients at the General Availability level. Read our blog post for details.