Error registering Lua UDFs

lua
udf
java
error

#1

Hello,

I am having a problem registering Lua UDFs through aerospike-client-java with the following server setting:

mod-lua { cache-enabled true }

Every time I try to register a large number of Lua files (our system uses a dynamic schema defined in XML files, and the Lua files are generated automatically from them), registration fails close to the end of the process. This is the error log:

Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:181) SIGSEGV received, aborting Aerospike Community Edition build 3.10.1 os el7
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: found 12 frames
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 0: /usr/bin/asd(as_sig_handle_segv+0x35) [0x4a7e65]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 1: /lib64/libc.so.6(+0x35670) [0x7f9eb41a6670]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 2: /usr/bin/asd(lua_pushcclosure+0x9) [0x5834b9]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 3: /usr/bin/asd(luaL_openlibs+0x26) [0x592bf6]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 4: /usr/bin/asd() [0x567dab]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 5: /usr/bin/asd(cache_init+0x11e) [0x569d58]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 6: /usr/bin/asd() [0x56a221]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 7: /usr/bin/asd() [0x4eb8f9]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 8: /usr/bin/asd(udf_cask_smd_accept_fn+0x21) [0x4ebb43]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 9: /usr/bin/asd(as_smd_thr+0x11c5) [0x4be38a]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 10: /lib64/libpthread.so.0(+0x7dc5) [0x7f9eb537cdc5]
Nov 25 2016 17:51:43 GMT: WARNING (as): (signal.c:185) stacktrace: frame 11: /lib64/libc.so.6(clone+0x6d) [0x7f9eb4267ced]

If I disable the Lua cache, the registration works perfectly fine.

I have therefore tried this workaround: disable the cache, perform the registration, stop the Aerospike process, enable the cache, and start the Aerospike process again. This is not good enough. The server usually throws errors the first one or two times I try to restart it, and when it finally starts, some of the Lua files from the previous registration are missing.

If it’s worth anything, I perform this registration in multiple threads to speed up the task.
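Since the registration runs from multiple threads, one experiment would be to funnel all registrations through a single code path that registers one file at a time and waits for each to complete. The following is only a sketch of that idea, not a confirmed fix for the crash. `SerialUdfRegistrar` and the `registerAndWait` hook are hypothetical names; with the Java client the hook could wrap `client.register(policy, clientPath, serverPath, Language.LUA).waitTillComplete()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Registers Lua files strictly one at a time, even when called from several threads. */
class SerialUdfRegistrar {
    // Hypothetical hook: registers one file and blocks until the server has accepted it.
    // Assumption: with the Aerospike Java client this would wrap
    // client.register(policy, clientPath, serverPath, Language.LUA).waitTillComplete().
    private final Consumer<String> registerAndWait;

    SerialUdfRegistrar(Consumer<String> registerAndWait) {
        this.registerAndWait = registerAndWait;
    }

    /**
     * Registers every file in order and returns the files whose registration threw.
     * The method is synchronized so that concurrent callers cannot overlap registrations.
     */
    synchronized List<String> registerAll(List<String> luaFiles) {
        List<String> failed = new ArrayList<>();
        for (String file : luaFiles) {
            try {
                registerAndWait.accept(file);
            } catch (RuntimeException e) {
                // Keep going so that one bad file does not hide the state of the others.
                failed.add(file);
            }
        }
        return failed;
    }
}
```

The returned list makes it possible to retry only the files that failed instead of restarting the whole batch.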

Do you have any ideas of what the problem could be?

Kind Regards, Jose Ignacio Acin Pozo


#2

This is an Aerospike Server issue. Currently, we generally recommend registering UDFs at low frequency. As an alternative to generating UDFs dynamically, could you pass parameters to a static UDF? Also, does the situation improve if all UDF operations are stopped before registration? We will try to look into this issue more.
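To illustrate the static UDF idea: instead of generating one Lua file per schema, the schema details could be packed into an argument and handed to a single pre-registered function. This is a minimal sketch under assumptions. The module name `generic_ops`, the function name `apply_schema`, and the layout of the argument map are all hypothetical; with the Java client the map could be passed as `client.execute(policy, key, MODULE, FUNCTION, Value.get(args))`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds the argument for one static UDF from a schema description. */
class StaticUdfArgs {
    // Hypothetical names of a single module and function registered once.
    static final String MODULE = "generic_ops";
    static final String FUNCTION = "apply_schema";

    /**
     * Packs the entity name and its field definitions into one map.
     * fieldTypes maps field name to type name; iteration order is preserved
     * when the caller supplies an ordered map such as LinkedHashMap.
     */
    static Map<String, Object> fromSchema(String entity, Map<String, String> fieldTypes) {
        Map<String, Object> args = new LinkedHashMap<>();
        args.put("entity", entity);
        args.put("fields", new ArrayList<>(fieldTypes.keySet()));
        args.put("types", new ArrayList<>(fieldTypes.values()));
        return args;
    }
}
```

With this shape, a schema change only alters the arguments sent at call time, so nothing has to be registered again.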


#3

We have a lot of auto-generated Lua files to register (one example has more than 800), so registering them at low frequency would not be feasible for day-to-day use of the system, especially because any change in our dynamic schema means a change in our Lua files.

I will have a look and see whether we could implement some sort of static UDF to help with the problem.

No UDF operations other than the registration tasks are running when the error happens, so I think we can rule out that potential concurrency problem.

Thanks for looking at it :slight_smile:


#4

I have reimplemented our auto-generated files in a different way, and now I generate only a single Lua file.

But I still have issues when I run integration tests against our database. Different integration tests require different generated Lua files, so every now and then the tests remove the previous UDF and register a new one in its place.

The error is now different and has nothing to do with the problem I had before:

org.luaj.vm2.LuaError: aerospike:160 function not found
stack traceback:
[Java]: in ?

Java stacktrace:

org.luaj.vm2.lib.BaseLib$error.call(Unknown Source)
org.luaj.vm2.LuaClosure.execute(Unknown Source)
org.luaj.vm2.LuaClosure.onInvoke(Unknown Source)
org.luaj.vm2.LuaClosure.invoke(Unknown Source)
org.luaj.vm2.LuaValue.invoke(Unknown Source)
com.aerospike.client.lua.LuaInstance.call(LuaInstance.java:130)
com.aerospike.client.query.QueryAggregateExecutor.runThreads(QueryAggregateExecutor.java:104)
com.aerospike.client.query.QueryAggregateExecutor.run(QueryAggregateExecutor.java:77)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

Here are the details I could get for the last stack frame with a known source:

declaringClass = "com.aerospike.client.lua.LuaInstance"
5 = {StackTraceElement@4535} "com.aerospike.client.lua.LuaInstance.call(LuaInstance.java:130)"
methodName = "call"
fileName = "LuaInstance.java"
lineNumber = 130
functionName = "apply_stream"

Do you have any ideas?

Kind Regards, Jose Ignacio Acin Pozo


#5

Never mind, I found the issue and fixed it by adding some logic to the Java client API that lets users of the library clear the loaded Lua packages so they can be reloaded.

These changes will be part of the next release: Lua cache problems
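For anyone hitting the same error, the idea behind the fix can be shown with a generic cache. This is not the client's actual code, just a sketch: `PackageCache` and its `loader` are hypothetical, and the cached value is a plain string standing in for a loaded Lua package. A package that was cached before the UDF was replaced keeps being served until the cache is cleared.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Caches loaded packages by name and allows them to be dropped for reloading. */
class PackageCache {
    private final ConcurrentHashMap<String, String> loaded = new ConcurrentHashMap<>();
    // Hypothetical loader: reads the current source of a package by name.
    private final Function<String, String> loader;

    PackageCache(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Returns the cached package, loading it on first use only. */
    String get(String name) {
        return loaded.computeIfAbsent(name, loader);
    }

    /** Drops every cached package so the next get() loads the current version. */
    void clear() {
        loaded.clear();
    }
}
```

In a test suite that swaps UDFs between tests, the clear step would run right after each removal and re-registration.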