Disk usage appears inconsistent when viewed from df and asinfo when files are used as a storage engine

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

#Issue:

When a file is being used for persistence, the disk usage reported by the operating system command ‘df’ may differ from the disk usage reported by the asinfo command.

As an example, a 540 GB disk may show as 93% full via ‘df -h’ but only 60% used when checked with the ‘asinfo’ command. Adding extra nodes to the cluster does not change the output from ‘df -h’.

#Solution:

The explanation is that Aerospike has its own storage mechanism / file system. When a namespace is configured to be stored within a file, the space for that file is allocated at that time, regardless of whether the namespace contains any data. From the operating system's point of view, the entire allocated space is therefore in use. asinfo gives the correct view of disk usage because it shows how much of that space actually contains data. Unlike the output from df, the output from asinfo changes as nodes are added to the cluster, because the data is redistributed across the nodes.
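
As a rough illustration of how the asinfo figure is derived, the sketch below parses a semicolon-separated `name=value` response of the kind asinfo returns for namespace statistics and computes a used percentage from it. The sample response is made up, and the statistic names `device_used_bytes` and `device_total_bytes` are assumptions for illustration; statistic names vary between Aerospike server versions, so check the output of your own server before relying on them.

```python
def parse_info(response: str) -> dict:
    """Split a 'name=value;name=value' info response into a dict of strings."""
    stats = {}
    for pair in response.strip().split(";"):
        if "=" in pair:
            name, value = pair.split("=", 1)
            stats[name] = value
    return stats


def used_pct(stats: dict,
             used_key: str = "device_used_bytes",    # assumed statistic name
             total_key: str = "device_total_bytes") -> float:  # assumed name
    """Percentage of the allocated storage that actually holds data."""
    total = int(stats[total_key])
    if total == 0:
        return 0.0
    return 100.0 * int(stats[used_key]) / total


# Made-up sample: a 500 GiB storage file of which 300 GiB holds records.
sample = ("objects=1000;"
          "device_total_bytes=536870912000;"
          "device_used_bytes=322122547200")

print(f"{used_pct(parse_info(sample)):.1f}% used")  # prints "60.0% used"
```

The percentage here is relative to the space Aerospike has allocated, not to the size of the underlying file system, which is why it need not match the df figure.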

When raw devices are used for persistence, df shows nothing for them because no file system is mounted.

For this reason, to monitor Aerospike cluster disk usage accurately, Aerospike commands such as asinfo and asadm should be used rather than operating system commands.
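
For comparison, the operating-system view can be approximated with Python's standard library. This is a minimal sketch, not a monitoring tool: the data file path is hypothetical, and `shutil.disk_usage` reports file-system totals, so a preallocated storage file counts as used space whether or not it holds records. The percentage is only roughly what df shows, since df also accounts for reserved blocks.

```python
import os
import shutil


def filesystem_used_pct(path: str) -> float:
    """Used percentage of the file system containing path (roughly df's view)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def file_size_bytes(path: str) -> int:
    """Apparent size of a file, for example a namespace storage file."""
    return os.stat(path).st_size


if __name__ == "__main__":
    # Hypothetical location; substitute the file configured for your namespace.
    data_file = "/opt/aerospike/data/test.dat"
    target = data_file if os.path.exists(data_file) else "."
    print(f"file system holding {target}: {filesystem_used_pct(target):.1f}% used")
    if os.path.exists(data_file):
        print(f"{data_file}: {file_size_bytes(data_file)} bytes")
```

Neither number says how much of the storage file contains records; only the Aerospike tools report that.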

#Note:

To view information on raw devices, the following commands are available:

  • hdparm - included in most Linux distributions; used to describe and configure SATA / IDE devices.
  • parted - supplied by GNU under the GPL; can be used to show unallocated space on a raw device.