Disk usage per set

Hi,

I have multiple sets per namespace. How can I find out the disk usage per namespace?

Thanks!

@meher can you help here?

Thanks!

I assume you meant ‘disk usage per set’ rather than ‘per namespace’. There is no direct statistic for disk usage per set. However, as of Aerospike version 3.6.1, the objsz histogram is also available at the set level, and it can be used to derive the approximate disk space used.

For example:

asinfo -v "hist-dump:ns=<NAMESPACE>;set=<SET>;hist=objsz"

Details on the objsz histogram are on the asinfo commands page.

The obj-size-hist-max configuration parameter can be adjusted if necessary. If records are showing up in the rightmost bucket, increase obj-size-hist-max so the histogram covers larger records, at the cost of coarser buckets.
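As a rough illustration of the rightmost-bucket check, here is a minimal Python sketch. It assumes the bucket counts from the hist-dump output have already been parsed into a list of integers. The helper name is hypothetical and not part of any Aerospike tool.

```python
def rightmost_bucket_in_use(bucket_counts):
    """Return True if any records fall in the last (rightmost) bucket.

    bucket_counts: list of per-bucket record counts, smallest sizes first.
    Assumption: the list was parsed from the objsz histogram by the caller.
    """
    if not bucket_counts:
        return False
    return bucket_counts[-1] > 0
```

If this returns True for a set, the histogram is probably not covering the largest records in that set.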

Hope this is helpful.

@meher,

Thanks for your reply. Are you saying that I should just do a sum of bucket_size * num_records_in_bucket across all buckets? And do that for all nodes in the cluster (or multiply by num nodes for rough estimate)? How accurate is the output of that (considering it uses buckets and not real sizes)? Does it include replica objects or just masters?

Thanks!

Yes, you would have to do that math. With the default obj-size-hist-max of 100, the histogram gives you exact numbers, since the bucket width is then 1 rblock (128 bytes), which is the smallest size increment used by records. But as you increase obj-size-hist-max, you lose resolution and the result becomes an approximation.

Good question regarding master/replica. This actually only covers masters. So you would have to multiply this by the replication factor, which then makes this an approximation in all cases, even with the smallest / default resolution.
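To make the math concrete, here is a minimal Python sketch of the calculation. It assumes the per-node bucket counts have already been extracted from the hist-dump output, and that bucket i holds records of at most (i + 1) bucket widths, so it is an upper-edge estimate. The function names and that bucket-to-size mapping are assumptions for illustration, not something confirmed by the Aerospike docs.

```python
RBLOCK_BYTES = 128  # rblock size quoted in this thread


def estimate_set_bytes(bucket_counts, bucket_width_rblocks=1, replication_factor=1):
    """Approximate bytes used by one set on one node.

    bucket_counts: master record counts per objsz bucket, smallest first.
    Assumption: bucket i covers records up to (i + 1) * bucket_width_rblocks
    rblocks, so every record is charged the upper edge of its bucket.
    """
    master_bytes = sum(
        count * (index + 1) * bucket_width_rblocks * RBLOCK_BYTES
        for index, count in enumerate(bucket_counts)
    )
    # The histogram only counts masters, so scale by the replication factor.
    return master_bytes * replication_factor


def estimate_cluster_bytes(per_node_bucket_counts, bucket_width_rblocks=1,
                           replication_factor=2):
    """Sum the master estimate over all nodes, then apply replication."""
    master_total = sum(
        estimate_set_bytes(counts, bucket_width_rblocks)
        for counts in per_node_bucket_counts
    )
    return master_total * replication_factor
```

For example, with made-up counts of [0, 10, 5] on a single node, a bucket width of 1 rblock and a replication factor of 2, estimate_set_bytes gives (10 * 2 + 5 * 3) * 128 * 2 = 8960 bytes.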
