Process Blocked for More Than 120 Seconds

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

Symptom

Customer is finding “INFO: task NAME:PID blocked for more than 120 seconds” in /var/log/messages or dmesg. This message indicates that the process made a blocking request for a resource, such as disk I/O, that was not fulfilled for more than 120 seconds, at which point the kernel's hung-task watchdog logged the warning. There are two known situations where this has been observed with Aerospike (a quick check follows the list below):

  1. Aerospike segfaulted resulting in a very large coredump being flushed to disk.
  2. More than vm.dirty_ratio dirty disk pages exist in RAM and are being flushed to disk.
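To confirm the symptom on a node, search the kernel log directly. A minimal check, assuming the message is still in the kernel ring buffer or has been written to /var/log/messages:

dmesg | grep "blocked for more than 120 seconds"
grep "blocked for more than 120 seconds" /var/log/messages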

For Aerospike Segfault

Check /var/log/aerospike/aerospike.log for log lines originating from signal.c. If a stack trace was not printed, look for other evidence that the server shut down uncleanly, such as the configuration being printed from the (config) context on restart without a preceding shutdown message from signal.c or the (as) context.
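A quick way to search for these lines, assuming the default log location above and that the source file and context names appear verbatim in each log line:

grep "signal.c" /var/log/aerospike/aerospike.log
grep "(config)" /var/log/aerospike/aerospike.log

The first command surfaces any stack trace or shutdown message from signal.c; the second shows configuration lines, which on a restart will appear without a preceding shutdown message if the server died uncleanly.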

For Flushing of Dirty Disk Pages

For this situation the blocked process will typically be “readahead.cron”:

INFO: task readahead.cron:12101 blocked for more than 120 seconds.

By default, Linux will use up to vm.dirty_ratio percent of available memory for dirty file cache pages: pages that, for performance reasons, are not immediately flushed to disk. In CentOS 6 the default is 20. Once this threshold is reached, the file system flushes all outstanding writes to disk and all subsequent I/O becomes synchronous until the flush completes. For modern personal workstations with more than 4GB of RAM this is a problem; for production servers with 128GB or more of RAM it is a horrible problem! At 20% of 128GB, over 25GB of dirty pages can accumulate, and flushing them all at once causes a huge I/O spike that drives up write latencies. In the case described here, the I/O subsystem was not fast enough to flush the data within 120 seconds, which is especially likely on systems with a lot of memory.

This behavior has been improved in recent kernels, but older kernels still exhibit the issue, so the kernel version in use should be checked.

vm.dirty_ratio is 20 for our test nodes.
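The values currently in effect on a node can be read back with sysctl. The output below assumes unmodified CentOS 6 defaults:

sysctl vm.dirty_ratio vm.dirty_background_ratio
vm.dirty_ratio = 20
vm.dirty_background_ratio = 10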

Instead of using vm.dirty_ratio, production boxes should use vm.dirty_bytes = 209715200 and vm.dirty_background_bytes = 104857600, which are 200MiB (200 × 1024 × 1024 bytes) and 100MiB (100 × 1024 × 1024 bytes) respectively. Setting vm.dirty_bytes and vm.dirty_background_bytes automatically sets their ratio counterparts to 0 (disabled).

To configure these changes, open /etc/sysctl.conf, comment out any existing reference to vm.dirty_bytes, vm.dirty_ratio, vm.dirty_background_bytes, and vm.dirty_background_ratio, and then add the recommended values above to the end of the file:

...
vm.dirty_bytes = 209715200
vm.dirty_background_bytes = 104857600

After saving the change, run sudo sysctl -p to load the new values from /etc/sysctl.conf.
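The settings can also be applied to the running kernel immediately, without a reboot. A minimal sketch using sysctl -w (these commands alone do not persist across reboots, hence the /etc/sysctl.conf entries above):

sudo sysctl -w vm.dirty_bytes=209715200
sudo sysctl -w vm.dirty_background_bytes=104857600

Reading the values back should then show the ratio counterparts zeroed:

sysctl vm.dirty_bytes vm.dirty_background_bytes vm.dirty_ratio vm.dirty_background_ratio
vm.dirty_bytes = 209715200
vm.dirty_background_bytes = 104857600
vm.dirty_ratio = 0
vm.dirty_background_ratio = 0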
