Aerospike Community Edition Crashing (AER-3780) [Released] [Resolved]


#1

I'm attempting to run a UDF scan with the Node.js client and get this error:

{"code":1,"message":"AEROSPIKE_ERR_SERVER","func":"as_scan_parse_records","file":"src/main/aerospike/aerospike_scan.c","line":104}

The aerospike.log has the following:

Jul 15 2015 03:39:05 GMT: INFO (scan): (thr_tscan.c::871) scan job received
Jul 15 2015 03:39:05 GMT: INFO (scan): (thr_tscan.c::922) scan_option 0x0 0x64
Jul 15 2015 03:39:05 GMT: INFO (scan): (thr_tscan.c::979) NO bins specified select all
Jul 15 2015 03:39:05 GMT: INFO (scan): (thr_tscan.c::1013) scan option: Fail if cluster change False
Jul 15 2015 03:39:05 GMT: INFO (scan): (thr_tscan.c::1014) scan option: Background Job False
Jul 15 2015 03:39:05 GMT: INFO (scan): (thr_tscan.c::1015) scan option: priority is 0 n_threads 1 job_type 1
Jul 15 2015 03:39:05 GMT: INFO (scan): (thr_tscan.c::1016) scan option: scan_pct is 100 
Jul 15 2015 03:39:05 GMT: WARNING (scan): (thr_tscan.c::1020) not starting scan 2054883107 because rchash_put() failed with error -4
Jul 15 2015 03:39:05 GMT: INFO (tsvc): (thr_tsvc.c::451) Scan failed with error -2
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::160) SIGSEGV received, aborting Aerospike Community Edition build 3.5.14
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: found 16 frames
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 0: /usr/bin/asd(as_sig_handle_segv+0x59) [0x46e768]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 1: /lib64/libc.so.6(+0x326a0) [0x7f7dbee706a0]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 2: /lib64/libpthread.so.0(pthread_mutex_lock+0) [0x7f7dbfc973a0]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 3: /usr/bin/asd(as_index_get_vlock+0x16) [0x45a40a]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 4: /usr/bin/asd(as_record_get+0xcd) [0x46b8dd]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 5: /usr/bin/asd(udf_record_open+0xd0) [0x4c3f06]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 6: /usr/bin/asd(as_aggr_istream_read+0x1c8) [0x44efab]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 7: /usr/bin/asd() [0x5219d8]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 8: /usr/bin/asd() [0x541da8]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 9: /usr/bin/asd(lua_pcall+0x30) [0x5301b0]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 10: /usr/bin/asd() [0x51b817]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 11: /usr/bin/asd() [0x51c047]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 12: /usr/bin/asd(as_aggr__process+0x271) [0x44f461]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 13: /usr/bin/asd(tscan_partition_thr+0x35e) [0x4bbc75]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 14: /lib64/libpthread.so.0(+0x79d1) [0x7f7dbfc959d1]
Jul 15 2015 03:39:05 GMT: WARNING (as): (signal.c::162) stacktrace: frame 15: /lib64/libc.so.6(clone+0x6d) [0x7f7dbef268fd]

As stated in my previous topic, there appears to be an issue with

asinfo -v "set-config:context=namespace;id=<name space>;set=<set name>;set-delete=true;"

Records get deleted, but a reboot brings them back. So I suspect that function somehow corrupted the dataset?

Here's my Lua function:
function updateUserId(rec,newUserId)
    rec['userId'] = 'test'
    aerospike:update(rec)
end

Nothing too crazy.


#2

I’ve re-initialized the .dat file for the namespace. I had one crash during my UDF tests, and now I can’t replicate it. However, a new issue has popped up.

I run this UDF three times. The first run produces no errors; the second and third each produce an error:

function updateUserId(rec,oldUserId,newUserId)
    rec['userId'] = newUserId
    aerospike:update(rec)
end

This results in a client-side error but not a total crash. Here’s the client-side error:

{"code":1,"message":"AEROSPIKE_ERR_SERVER","func":"as_scan_parse_records","file":"src/main/aerospike/aerospike_scan.c","line":104}

It looks like the first run updates the records properly.
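For logging, an error object like the one above can be condensed to a single line. This is only a minimal sketch: `describeClientError` is a hypothetical helper, not part of the Aerospike client, and it assumes the error always carries the `code`, `message`, `func`, `file`, and `line` fields seen in this thread.

```javascript
// Hypothetical helper (not part of the Aerospike client): turn an error
// object shaped like the one in this thread into a single log line.
function describeClientError(err) {
  if (!err || err.code === 0) {
    return 'ok'; // status code 0 is AEROSPIKE_OK
  }
  return `${err.message} (code ${err.code}) in ${err.func} at ${err.file}:${err.line}`;
}

// The exact error payload quoted above.
const scanError = JSON.parse(
  '{"code":1,"message":"AEROSPIKE_ERR_SERVER","func":"as_scan_parse_records",' +
  '"file":"src/main/aerospike/aerospike_scan.c","line":104}'
);

const summary = describeClientError(scanError);
console.log(summary);
```

The summary keeps the C source location, which is the part worth quoting when reporting a failure like this one.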

Now I’ve added filters to the statement on the client side. There’s no more AEROSPIKE_ERR_SERVER. However, I’m now getting a Node.js error on this function:

queryStream.on('data', function(rec) {
    // Received record with results, we only need one and should have one
    deferred.resolve(rec);
});

TypeError: Cannot read property 'on' of null

So it looks like when the filters match no records for the UDF to process, the Node.js client has a separate bug: it doesn’t return a stream object at all when there are no records.
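Until the client behaves differently, the TypeError above can be avoided by checking the stream before attaching a handler. This is only a sketch: `onFirstRecord` and its callbacks are hypothetical names, and a plain `EventEmitter` stands in for the real query stream so the example runs without a server. It uses `once` rather than `on` because only one record is needed here.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: attach a 'data' handler only if the client actually
// returned a stream; otherwise report "no records" through onEmpty.
function onFirstRecord(queryStream, onRecord, onEmpty) {
  if (queryStream == null) {
    onEmpty();
    return false;
  }
  queryStream.once('data', onRecord);
  return true;
}

// Stand-in for a real query stream (assumption: the real stream emits 'data').
const fakeStream = new EventEmitter();
const seen = [];
onFirstRecord(fakeStream, (rec) => seen.push(rec), () => seen.push('empty'));
fakeStream.emit('data', { userId: 'test' });

// The null case from this post: no TypeError, the empty callback runs instead.
onFirstRecord(null, (rec) => seen.push(rec), () => seen.push('empty'));
```

In the original code, the `onEmpty` callback would be the place to resolve or reject `deferred` so that the caller is not left waiting.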

Sorry, I know it’s off topic. We’re about to deploy our app on Aerospike, and right now I’m very nervous about dealing with these issues and whatever else I might uncover this week.


#3

The crash is a scan aggregation regression we’ve recently discovered. We will be putting out a patch as soon as we can.


#4

@denisbetsi,

We’ve filed JIRA number AER-3780 for this. Please stay tuned for updates on our progress.


#5

Thanks. It has happened again a few times since then. I hope this won’t happen on the live database.


#6

@denisbetsi,

We just released Aerospike Server Community Edition 3.5.15. It’s available for download here.

Among other things, this release fixes AER-3780, the regression of scan aggregations introduced in v3.5.8.

You can view the full release notes of v3.5.15 here.


#7

Would you please update the Vagrant box at https://vagrantcloud.com/aerospike/boxes/centos-6.5 so that installing on a Mac through Vagrant is possible? Currently the install instructions for Mac lead to the older 3.5.14 version.


#8

@denis,

It usually takes a few days for the latest server version to display on Vagrant Cloud. It should be up by end of day Monday.


#9

@denis,

Good news! The 3.5.15 server version is now displaying on Vagrant Cloud.

Have a great weekend!

Maud