I have a record UDF that is supposed to delete only records with a specified number of bins. I have a record that the Python API shows as having two bins, but when I process the record via Lua I get:
record.numbins(rec) → 3
and when I display the values of the bins I see (this is the unexpected third bin):
Sep 15 2022 21:14:20 GMT: DEBUG (udf): (/opt/aerospike/usr/udf/lua/prune_fsegs.lua:7) bin 1 name = N nil
This bin does not get returned by the Python client, and I was under the impression that it was a core Aerospike principle that bins cannot contain "nil". Is there something I am missing, and do I need to iterate over the bins to check whether the count includes empty bins?
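Since the original prune_fsegs.lua is not shown here, the following is only a sketch of the kind of loop that could produce the log line above; the function name and message format are assumptions, not the actual UDF:

```lua
-- Sketch only: the real prune_fsegs.lua is not shown in this thread.
-- Logs each bin's name and value; a bin with no value renders as "nil".
function show_bins(rec)
    local names = record.bin_names(rec)           -- bin names as a Lua list
    for i = 1, list.size(names) do
        local name = names[i]
        debug("bin %d name = %s %s", i, tostring(name), tostring(rec[name]))
    end
    return record.numbins(rec)                    -- may count "empty" bins too
end
```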
There is a question as to whether the record once had an “N” bin and the simple answer is I don’t know!
One possibility is that the states that Python and Lua are seeing are not the same. For example, the Lua has:
- `rec[bin] = nil`
- `n = record.numbins(rec)` (the Lua log shows this in-UDF state)
- (the Python client sees the state after the UDF completes)
As always, a reproducible case will be great to have.
I have a suspicion but I do not have a good way to replicate the case. Fortunately this affects only a small percentage of records, and even more fortunately it leads to "false negatives", i.e., the UDF fails to delete records that we wanted deleted, as opposed to false positives, which would have been a showstopper.
What is odd is that when I have a record in this state the problem is reproducible, in the sense that a "get" issued by AQL or the Python client shows two bins while manually applying the UDF shows three, including the "empty" bin, which rules out a race.
We recently simplified our writes so that we simply overwrite bins, but earlier we were using operations on CDTs that modified the map in place. I had a theory that this possibly left the bin in a state that Lua reads as "present but empty" and the other APIs as non-existent. However, I am not 100% sure I buy this, as it affects our dev environment, which we regularly truncate, so there is a limit to how old any record can be.
Are you using XDR 5.0 feature: bi-directional XDR with bin-convergence?
No, in fact this test cluster does not actually replicate (we have XDR enabled just to be able to test dynamic configuration changes).
In the XDR config, do you have ship-bin-luts set to true? (It may not matter if you are actually shipping to a destination.) What about conflict-resolve-writes true for the namespace?
Okay, it does appear to be XDR-related. I tested 100,000 entries on four clusters: the two that do not have XDR enabled showed no examples, and the two that did had examples. The background of the UDF in question was that it was deciding which records it could delete, and since the "failure" rate was < 0.2% it was not worth tracking down, although I could have tested the bin for nil.
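Given the root cause, a cheap workaround is to count only bins whose value is non-nil before deciding to delete. A hedged sketch (the function name and threshold parameter are invented here for illustration, not taken from the original UDF):

```lua
-- Sketch: skip nil-valued bins when counting, so the record is still
-- deleted even if a lingering bin tombstone inflates numbins().
function prune_if_nbins(rec, wanted)
    local names = record.bin_names(rec)
    local live = 0
    for i = 1, list.size(names) do
        if rec[names[i]] ~= nil then              -- tombstoned bins read as nil
            live = live + 1
        end
    end
    if live == wanted then
        aerospike:remove(rec)                     -- delete the record
    end
end
```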
To give more context, our XDR config does specify bin-policy=changed-and-specified, which implies that there was a bin tombstone that some clients interpreted as no value (e.g., Python, and I am guessing that AQL is based on the C client), whereas the Lua UDF saw the bin as present.
To answer the previous question in the interest of completeness:
"In the XDR config - do you have ship-bin-luts set to true? (It may not matter if you are actually shipping to destination.) what about conflict-resolve-writes true for the namespace?"
No and no.
OK, that makes sense. When you want the server to ship "changed" bins and you set a bin to nil to delete it, the server has to keep a bin tombstone around so it can ship the bin deletion. The tombstone is removed on the next record update after it has been shipped and its default life of one day has elapsed. If there is no subsequent record update after xdr-bin-tombstone-ttl (1 day), the tombstone hangs around in the record.
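To tie the parameters discussed in this thread together, here is an illustrative configuration fragment (the DC and namespace names are placeholders; check your server version's documentation for the exact contexts and defaults):

```
xdr {
    dc dc1 {
        namespace test {
            # ship only changed bins (plus any explicitly specified ones)
            bin-policy changed-and-specified
            ship-bin-luts false
        }
    }
}

namespace test {
    # how long a shipped bin tombstone lingers before it can be reaped
    # on a subsequent record update (default 86400 s = 1 day)
    xdr-bin-tombstone-ttl 86400
}
```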