The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.
Symptom
Customer is finding “INFO: task NAME:PID blocked for more than 120 seconds” in /var/log/messages or dmesg. This indicates that the process requested a blocking resource, such as a disk, the request was not fulfilled for more than 120 seconds, and the task was subsequently reported as hung by the kernel. There are two known situations where this has been observed with Aerospike.
- Aerospike segfaulted, resulting in a very large coredump being flushed to disk.
- More than vm.dirty_ratio dirty disk pages exist in RAM and are being flushed to disk.
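To confirm which task the kernel flagged, the kernel ring buffer or syslog can be searched for the message. For example (the syslog path may differ by distribution):

dmesg | grep "blocked for more than"
grep "blocked for more than" /var/log/messages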
For Aerospike Segfault
Check /var/log/aerospike/aerospike.log for log lines originating from signal.c. In the event that a stack trace is not printed, look for evidence that the server shut down, such as the configuration being printed from the (config) context without a message indicating shutdown from signal.c or the (as) context.
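As a quick check, the log can be searched for stack trace lines originating from signal.c; the path below is the default log location and may differ on your system:

grep "signal.c" /var/log/aerospike/aerospike.log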
For Flushing of Dirty Disk Pages
In this situation the blocked process will typically be “readahead.cron”, as in the example below.
INFO: task readahead.cron:12101 blocked for more than 120 seconds.
By default Linux will use up to vm.dirty_ratio percent of available memory for dirty file cache pages. These are pages that, for performance reasons, are not immediately flushed to disk. In CentOS 6, the default is 20. Once this threshold is reached, the file system flushes all outstanding writes to disk, causing all subsequent IOs to become synchronous. For modern personal workstations with more than 4GB of RAM this is a problem; for production servers with 128GB or more of RAM it is a severe problem. The resulting burst of writes causes a large IO spike, driving up write latencies. In the case described here, the IO subsystem is not fast enough to flush the data within 120 seconds. This is especially likely on systems with a lot of memory.
This behaviour has been improved in recent kernels, but older kernels still exhibit the issue. On the test nodes used here, vm.dirty_ratio was the default value of 20.
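To inspect the current thresholds and the amount of dirty data waiting to be flushed, standard Linux tooling can be used, for example:

sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_bytes vm.dirty_background_bytes
grep -E 'Dirty|Writeback' /proc/meminfo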
Instead of using vm.dirty_ratio, production boxes should use vm.dirty_bytes = 209715200 and vm.dirty_background_bytes = 104857600, which are 200MiB and 100MiB respectively. Setting vm.dirty_bytes and vm.dirty_background_bytes automatically sets the ratio counterparts to 0 (disabled).
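The values are simply the MiB figures expressed in bytes (200 * 1024 * 1024 = 209715200 and 100 * 1024 * 1024 = 104857600). To apply them immediately at runtime (they will not persist across a reboot until added to /etc/sysctl.conf as described below), the following can be run:

sudo sysctl -w vm.dirty_bytes=209715200
sudo sysctl -w vm.dirty_background_bytes=104857600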
To configure these changes persistently, open /etc/sysctl.conf, comment out any reference to vm.dirty_bytes, vm.dirty_ratio, vm.dirty_background_bytes, and vm.dirty_background_ratio, and then add the recommended values above to the end of the file.
...
vm.dirty_bytes = 209715200
vm.dirty_background_bytes = 104857600
After saving the change, run sudo sysctl -p to apply it.
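To verify that the new values are in effect, the settings can be read back; the ratio counterparts should now report 0:

sysctl vm.dirty_bytes vm.dirty_background_bytes vm.dirty_ratio vm.dirty_background_ratio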