Disk usage appears inconsistent when viewed from df and asinfo when files are used as a storage engine


#Issue:

When a file is used for persistence, the disk usage reported by the operating system command ‘df’ may differ from the disk usage reported by the asinfo command.

As an example, a 540 GB disk may show as 93% full via ‘df -h’ but only 60% used when usage is checked with the ‘asinfo’ command. Adding extra nodes to the cluster would not change the output of ‘df -h’.

#Solution:

The explanation is that Aerospike has its own storage mechanism, in effect its own file system. When a namespace is configured to be stored in a file, the space is allocated at that time, regardless of whether the namespace contains data. From the OS point of view, the entire allocated space is therefore in use. asinfo gives a correct view of disk usage because it shows how much of that space actually contains data. Unlike the output of ‘df -h’, the output of asinfo will change to reflect nodes being added to the cluster, since data is then redistributed across the nodes.

When raw devices are used for persistence, nothing is shown by df because no file system is mounted on them.

For this reason, to monitor Aerospike cluster disk usage accurately, Aerospike commands such as asinfo and asadm should be used rather than operating system commands.
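The difference between the two views can be illustrated with a minimal Python sketch. It assumes the namespace statistics have already been captured as a semicolon-separated key=value string (the form returned by a command such as asinfo -v "namespace/<name>"). The stat names device_used_bytes and device_total_bytes are assumptions that vary by server version, and the sample response is hypothetical, chosen to match the 60% figure in the example above.

```python
import shutil

# Hypothetical captured info response, trimmed to the two fields used here.
SAMPLE_RESPONSE = "device_used_bytes=324000000000;device_total_bytes=540000000000"


def parse_info(response: str) -> dict:
    """Parse a semicolon-separated key=value info response into a dict."""
    stats = {}
    for pair in response.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            stats[key] = value
    return stats


def aerospike_used_pct(stats: dict,
                       used_key: str = "device_used_bytes",
                       total_key: str = "device_total_bytes") -> float:
    """Percentage of the allocated storage that actually holds data.

    The default stat names are assumptions; check the names your server reports.
    """
    return 100.0 * int(stats[used_key]) / int(stats[total_key])


def os_used_pct(path: str) -> float:
    """Approximate the Use% column of df for the file system containing path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / (usage.used + usage.free)
```

Calling aerospike_used_pct(parse_info(SAMPLE_RESPONSE)) gives the Aerospike view, while os_used_pct() on the directory holding the storage file gives the OS view. The second figure includes the whole preallocated file, which is why it can be much higher than the first.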

#Note:

To view information on raw devices, the following commands are available:

  • hdparm - included in most Linux distributions; used to describe and configure SATA / IDE devices.
  • parted - supplied by GNU under the GPL; can be used to show unallocated space on a raw device.
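As a complement to these tools, the total size of a raw device can also be read from Linux sysfs. The sketch below is one possible approach and not part of any Aerospike tooling: the function name and the device name passed to it are hypothetical, and it relies on the kernel exposing the device size in 512-byte sectors under /sys/class/block.

```python
from pathlib import Path


def raw_device_size_bytes(device_name: str,
                          sysfs_root: str = "/sys/class/block") -> int:
    """Return the size of a block device in bytes, as reported by sysfs.

    The sysfs 'size' file counts 512-byte sectors, whatever the
    device's own logical block size is.
    """
    size_file = Path(sysfs_root) / device_name / "size"
    sectors = int(size_file.read_text().strip())
    return sectors * 512
```

For example, raw_device_size_bytes("sdb") would return the capacity of /dev/sdb, which can then be compared with the total reported by asinfo for a namespace stored on that device.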