"Error Code 8: Server memory error"

Hi,

We are using Aerospike:

root@rgsp-datami-as-saopaulo-1a1{~}# asinfo -v version
Aerospike Community Edition build 3.16.0.6

We are seeing the following error:

{“das”:{“key”:{“namespace”:“das”,“setName”:“pkgOverallAcc”,“digest”:“iDcO4tObCyY+20fauXVwvS7O5yw=”,“userKey”:{“object”:“tpe-sig-campaign”,“type”:3}},“operations”:[{“type”:“ADD”,“binName”:“nsp_t”,“value”:{“object”:93157,“type”:1}},{“type”:“ADD”,“binName”:“nsp_d”,“value”:{“object”:62108,“type”:1}},{“type”:“ADD”,“binName”:“sig_t”,“value”:{“object”:93157,“type”:1}},{“type”:“ADD”,“binName”:“sig_d”,“value”:{“object”:62108,“type”:1}},{“type”:“READ”,“binName”:“t”,“value”:{“object”:null,“type”:0}},{“type”:“ADD”,“binName”:“nfg_t”,“value”:{“object”:93157,“type”:1}},{“type”:“ADD”,“binName”:“nfg_d”,“value”:{“object”:62108,“type”:1}},{“type”:“MAP_MODIFY”,“binName”:“nfg_dos”,“value”:{“object”:“AEmTqANhbmRyb2lkzgABa+UA”,“type”:4}},{“type”:“MAP_MODIFY”,“binName”:“nsp_dos”,“value”:{“object”:“AEmTqANhbmRyb2lkzgABa+UA”,“type”:4}},{“type”:“MAP_MODIFY”,“binName”:“nsp_app”,“value”:{“object”:“AEmTuANjb21fYmJ2YV9ueHRfcGVydS83XzVfMc4AAWvlAA==”,“type”:4}}],“resultCode”:8,“exception”:{“cause”:null,“stackTrace”:[{“methodName”:“parseResult”,“fileName”:“ReadCommand.java”,“lineNumber”:111,“className”:“com.aerospike.client.command.ReadCommand”,“nativeMethod”:false},{“methodName”:“execute”,“fileName”:“SyncCommand.java”,“lineNumber”:84,“className”:“com.aerospike.client.command.SyncCommand”,“nativeMethod”:false},{“methodName”:“operate”,“fileName”:“AerospikeClient.java”,“lineNumber”:1268,“className”:“com.aerospike.client.AerospikeClient”,“nativeMethod”:false},{“methodName”:“operate”,“fileName”:“AerospikeRepository.java”,“lineNumber”:120,“className”:“com.datami.repository.AerospikeRepository”,“nativeMethod”:false},{“methodName”:“updateUsage”,“fileName”:“PackageOverallAccountingDAOAerospike.java”,“lineNumber”:40,“className”:“com.datami.dao.zmi.accounting.PackageOverallAccountingDAOAerospike”,“nativeMethod”:false},{“methodName”:“updateAccounting”,“fileName”:“PackageOverallAccountingFacade.java”,“lineNumber”:30,“className”:“com.datami.zmi.accounting.facade.PackageOverallAccountingFacade”,“nativeMethod”:false},{“methodName”:“executeBolt”,“fileName”:“PackageOverallAccountingBolt.java”,“lineNumber”:54,“className”:“com.datami.zmi.accounting.topology.PackageOverallAccountingBolt”,“nativeMethod”:false},{“methodName”:“execute”,“fileName”:“CustomBaseRichBolt.java”,“lineNumber”:47,“className”:“com.datami.storm.topology.CustomBaseRichBolt”,“nativeMethod”:false},{“methodName”:“invoke”,“fileName”:“executor.clj”,“lineNumber”:729,“className”:“org.apache.storm.daemon.executor$fn__5030$tuple_action_fn__5032”,“nativeMethod”:false},{“methodName”:“invoke”,“fileName”:“executor.clj”,“lineNumber”:461,“className”:“org.apache.storm.daemon.executor$mk_task_receiver$fn__4951”,“nativeMethod”:false},{“methodName”:“onEvent”,“fileName”:“disruptor.clj”,“lineNumber”:40,“className”:“org.apache.storm.disruptor$clojure_handler$reify__4465”,“nativeMethod”:false},{“methodName”:“consumeBatchToCursor”,“fileName”:“DisruptorQueue.java”,“lineNumber”:482,“className”:“org.apache.storm.utils.DisruptorQueue”,“nativeMethod”:false},{“methodName”:“consumeBatchWhenAvailable”,“fileName”:“DisruptorQueue.java”,“lineNumber”:460,“className”:“org.apache.storm.utils.DisruptorQueue”,“nativeMethod”:false},{“methodName”:“invoke”,“fileName”:“disruptor.clj”,“lineNumber”:73,“className”:“org.apache.storm.disruptor$consume_batch_when_available”,“nativeMethod”:false},{“methodName”:“invoke”,“fileName”:“executor.clj”,“lineNumber”:848,“className”:“org.apache.storm.daemon.executor$fn__5030$fn__5043$fn__5096”,“nativeMethod”:false},{“methodName”:“invoke”,“fileName”:“util.clj”,“lineNumber”:484,“className”:“org.apache.storm.util$async_loop$fn__557”,“nativeMethod”:false},{“methodName”:“run”,“fileName”:“AFn.java”,“lineNumber”:22,“className”:“clojure.lang.AFn”,“nativeMethod”:false},{“methodName”:“run”,“fileName”:“Thread.java”,“lineNumber”:745,“className”:“java.lang.Thread”,“nativeMethod”:false}],“resultCode”:8,“inDoubt”:false,“message”:“Error Code 8: Server memory error”,“localizedMessage”:“Error Code 8: Server memory error”,“suppressed”:},“level”:“ERROR”,“timestamp”:“2019-11-27T08:37:30Z”,“component”:“das”,“msgType”:“Aerospike exception”,“suppressed”:}}

This is because the low_device_percent value has gone down.
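
One way to confirm this on each node is to pull the storage statistics for the namespace. This is only a sketch: it assumes the value referred to above is the namespace statistic device_available_pct, that device_free_pct and stop_writes are reported alongside it on this server version, and `pick_stats` is a hypothetical helper name.

```shell
# Filter a semicolon-separated info response down to the storage stats
# relevant to "Error Code 8". The stat names are assumptions about what
# this server version reports.
pick_stats() {
  tr ';' '\n' | grep -E '^(device_available_pct|device_free_pct|stop_writes)='
}

# Usage on a node (needs a running server; namespace "das" as in the log):
#   asinfo -v 'namespace/das' | pick_stats
```

Running it on all three nodes should show whether the two nodes that have not recovered are the ones with the lowest available percentage.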

I have increased defrag-lwm-pct from 50 to 55.

root@rgsp-datami-as-saopaulo-1a1{~}# grep defrag /etc/aerospike/aerospike.conf
    defrag-lwm-pct 55
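
A change in the config file only takes effect after a restart, so it may be worth comparing the file with what the running node reports. A rough sketch, where `conf_value` is a hypothetical helper and the asinfo lines assume the namespace is das and that defrag-lwm-pct can be set dynamically on this version:

```shell
# Print the value of a single-valued setting from an aerospike.conf-style file.
conf_value() {
  awk -v key="$1" '$1 == key { print $2 }' "$2"
}

# Usage:
#   conf_value defrag-lwm-pct /etc/aerospike/aerospike.conf
#   asinfo -v 'get-config:context=namespace;id=das' -l | grep defrag-lwm-pct
# Applying the value without a restart (assumption: the setting is dynamic):
#   asinfo -v 'set-config:context=namespace;id=das;defrag-lwm-pct=55'
```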

How can I fix this? We have 3 nodes. One node recovered after an instance restart; the remaining 2 nodes are still throwing this error.

Thanks, Datami DevOps Team.

What troubleshooting have you done? Why are you on 3.16? What does iostat look like? What's your defrag-sleep set to?

Hi Albot,

As mentioned, we are using the Community Edition and we didn't upgrade to 4.x. All three nodes started reporting disrupted writes once low_device_percent dropped below 5%. I have now configured:

    evict-tenths-pct 10
    defrag-lwm-pct 55
    defrag-startup-minimum 5

in aerospike.conf and restarted the service one instance at a time. Now all three instances are in sync.

How can we increase the low_device_percent value?

What’s your defrag-sleep set to? Can you capture a screenshot of iostat -zmxty 60 on an affected system? Can you also grab the last few lines containing defrag-q for each device from the Aerospike log?

Could you run:

asadm -e "info"
asadm -e "summary"
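
For the defrag-q lines and the defrag-sleep value asked about above, something like the following might help. It is a sketch only: `last_defrag_q` is a hypothetical helper, the log path is the usual default and may differ on your nodes, and the namespace is assumed to be das.

```shell
# Print the last N (default 5) log lines that mention defrag-q.
last_defrag_q() {
  grep 'defrag-q' "$1" | tail -n "${2:-5}"
}

# Usage (log path is an assumption; adjust to your logging config):
#   last_defrag_q /var/log/aerospike/aerospike.log 10
# Current defrag-sleep as reported by the running node:
#   asinfo -v 'get-config:context=namespace;id=das' -l | grep defrag-sleep
```

With more than one device per namespace, passing a larger count keeps at least a few lines per device in the output.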