SIGSEGV on 3.6.3 and 3.6.4 Community Edition


#1

Aerospike is crashing with SIGSEGV when executing a stream UDF:

Dec 07 2015 13:20:33 GMT: INFO (drv_ssd): (drv_ssd.c::2088) device /opt/aerospike/data/ccr-aws-prod.dat: used 44900782592, contig-free 161778M (161778 wblocks), swb-free 0, w-q 0 w-tot 0 (0.0/s), defrag-q 0 defrag-tot 0 (0.0/s) defrag-w-tot 0 (0.0/s)
Dec 07 2015 13:20:39 GMT: INFO (scan): (scan.c::944) starting aggregation scan job 4881460079378992570 {ccr-aws-prod:productGroupLevel} priority 2
Dec 07 2015 13:20:39 GMT: WARNING (as): (signal.c::161) SIGSEGV received, aborting Aerospike Community Edition build 3.6.4 os el6

This happens regardless of whether we run the aggregation from aql, the Java client, or the C# client.

The problem occurs on Amazon AWS with your image https://aws.amazon.com/marketplace/pp/B00LW9382A/ Manually upgrading to 3.6.4 did not resolve the issue.

The same UDF running on a local server with the same data runs just fine.

Would be great if you could point me to some steps to isolate the source of the issue.

TIA


#2

While browsing through other SIGSEGV issues in the forum, I came across the suggestion to disable UDF cache.

After disabling the UDF cache, we have not hit this problem again, so it appears the cache corruption issue is still not resolved in version 3.6.4.
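
For anyone looking for the setting: as far as I know, the UDF (Lua) cache is controlled by `cache-enabled` in the `mod-lua` context of aerospike.conf. A minimal fragment, assuming the rest of the file stays unchanged (please check it against the configuration reference for your build):

```
mod-lua {
    # Assumption: disables the Lua state cache used by UDFs
    cache-enabled false
}
```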


#3

Running Aerospike under gdb can provide more information about the crash. One immediate thought is a difference in stack size between the two platforms.
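
To compare the stack limit on the two hosts, something like the following could be run against the server's pid on each machine. This is only a sketch: it assumes Linux and the usual /proc/<pid>/limits layout, and the function names are my own, not part of any Aerospike tooling.

```python
import re
import sys


def parse_stack_limit(limits_text):
    """Return (soft, hard) for 'Max stack size' from /proc/<pid>/limits text.

    Returns None if the line is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max stack size"):
            # Columns are separated by runs of spaces: name, soft, hard, units.
            fields = re.split(r"\s{2,}", line.strip())
            return fields[1], fields[2]
    return None


def read_stack_limit(pid):
    """Read the stack limit of a running process (hypothetical helper)."""
    with open("/proc/%d/limits" % pid) as f:
        return parse_stack_limit(f.read())


if __name__ == "__main__":
    # Usage: python stack_limit.py <pid of the server process>
    if len(sys.argv) > 1:
        print(read_stack_limit(int(sys.argv[1])))
```

If the soft limit differs between the AWS instance and the local server, that would be worth ruling out before digging into the UDF itself.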