 FAQ How to determine storage per set.

Note - This article applies to server versions 4.2.0.2 and above. To calculate disk storage for a set on earlier versions, refer to the `hist-dump` info command. The older histogram differs in the number of buckets (always 100) and in the granularity (configured by obj-size-hist-max).

Context

To determine the approximate storage size for a set on a cluster, sum the bucket counts returned by the `object-size-linear` histogram, each weighted by its bucket's record size.

Storage on Disk - Method for persisted data

Here is the notation to compute the approximate set size (over estimation):

``````       1024
Σ            num_records_in_bucket_n * (bucket_width * n)
n=1
``````

And for the under estimation:

``````       1024
Σ            num_records_in_bucket_n * (bucket_width * (n-1))
n=1
``````
• There are always 1024 buckets.
• The `bucket_width` is computed as the `hist-width` divided by 1024 (number of buckets).
• The `hist-width` is equal to the configured `write-block-size`.
• In the special but typical case of a 1MiB `write-block-size`, the bucket_width is `1,048,576 / 1,024 = 1,024`.
• For a 1MiB `write-block-size`, records in the first bucket are of size between 0 and 1024, in the second bucket, of size between 1025 and 2048, etc…
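
The two sums above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the function name `estimate_set_size` and the 4-bucket toy histogram are assumptions made for the example (a real histogram always has 1024 buckets).

```python
def estimate_set_size(bucket_counts, bucket_width):
    """Return (under_estimate, over_estimate) in bytes for one node.

    bucket_counts[0] is the record count of bucket n=1, and so on.
    """
    # Over estimate: every record is assumed to be as large as the
    # upper bound of its bucket (bucket_width * n).
    over = sum(count * bucket_width * n
               for n, count in enumerate(bucket_counts, start=1))
    # Under estimate: every record is assumed to be as small as the
    # lower bound of its bucket (bucket_width * (n - 1)).
    under = sum(count * bucket_width * (n - 1)
                for n, count in enumerate(bucket_counts, start=1))
    return under, over


# Hypothetical toy histogram with only 4 buckets, for illustration.
toy_counts = [5, 3, 0, 1]
under, over = estimate_set_size(toy_counts, bucket_width=1024)
print(under, over)
```

The gap between the two estimates is always `bucket_width` multiplied by the total number of records, since each record's assumed size differs by exactly one bucket width.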

Perform the calculation for approximating the amount of storage per set on a cluster

The following example provides the steps for calculating the storage per set on a cluster, when using the default `data-in-memory` false.

Note: This calculation accounts for both master and replica records. Therefore, there is no need to account for the replication-factor.

Step 1: Generate the object-size-linear histogram

On a single node issue the following info command:

``````asinfo -v "histogram:namespace=<namespaceName>;type=object-size-linear;set=<setName>;"
``````

Sample Output:

``````$ asinfo -v "histogram:namespace=test;type=object-size-linear;set=demo;"
units=bytes:hist-width=1048576:bucket-width=1024:buckets=281537970,56726976,21515544,11172775,6825716,455
3921,3216092,2351975,1770540,1355049,1054638,830128,660217,530114,429404,350591,288626,237583,199826,1667
39,140463,118291,100674,85691,73716,63303,54493,47219,40791,35587,30958,27093,23996,20805,18376,16452,145
41,12866,11324,10238,9123,8088,7312,6578,6026,5260,4826,4554,4093,3613,3275,3150,2699,2603,2316,2160,1909
,1801,1618,1543,1436,1253,1191,1096,1016,972,901,826,706,704,640,616,535,509,458,467,425,379,369,313,321,
303,260,230,223,198,213,171,197,173,154,154,147,127,109,122,89,104,112,84,82,96,69,72,67,64,46,64,54,53,4
5,43,63,41,38,33,35,38,20,26,31,18,31,23,22,17,18,30,16,22,19,22,9,18,7,10,16,10,10,4,11,16,12,8,10,7,6,1
1,7,6,7,2,11,8,8,6,7,4,3,3,4,3,5,2,2,4,8,4,3,3,7,2,2,1,1,3,1,1,2,3,1,0,3,2,4,2,1,2,2,1,5,0,1,0,0,2,0,1,0,
0,0,2,1,3,0,0,0,0,2,1,0,2,2,0,1,2,0,0,0,1,1,0,0,0,1,2,1,0,0,0,0,0,2,0,1,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1
,0,0,0,1,3,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
``````
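
A response in this format can be split into its fields with a few lines of Python. The following is a sketch only: the function name `parse_histogram` is hypothetical, and it assumes the response has exactly the `key=value` fields shown in the sample output, separated by colons. Line wraps introduced by a terminal capture (as in the sample above) are removed before parsing.

```python
def parse_histogram(raw):
    """Parse an object-size-linear info response into a dict."""
    # Remove all whitespace so that numbers split across wrapped
    # lines are joined back together, then drop any trailing ';'.
    text = "".join(raw.split()).rstrip(";")
    fields = dict(item.split("=", 1) for item in text.split(":"))
    return {
        "units": fields["units"],
        "hist_width": int(fields["hist-width"]),
        "bucket_width": int(fields["bucket-width"]),
        "buckets": [int(count) for count in fields["buckets"].split(",")],
    }


# Shortened, hypothetical response with 4 buckets and one line wrap.
sample = "units=bytes:hist-width=1048576:bucket-width=1024:buckets=10,2\n5,0,1"
hist = parse_histogram(sample)
print(hist["bucket_width"], hist["buckets"])
```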

Step 2: Compute the approximate set size (over estimate)

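Using the output from the object-size-linear histogram, compute the approximate set size.

n goes from 1 to the number of buckets (always 1024), and each bucket count is multiplied by `bucket_width * n` (here `bucket_width` is 1024), which is: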

``````  "num_records_in_bucket_1"(281537970) * (1024*1)
+ "num_records_in_bucket_2"(56726976)  * (1024*2)
+ "num_records_in_bucket_3"(21515544)  * (1024*3)
...
``````

The following is a breakdown of the calculation:

``````281537970*(1024*1)+56726976*(1024*2)+21515544*(1024*3)+11172775*(1024*4)+6825716*(1024*5)+4553921*(1024*6)
+3216092*(1024*7)+2351975*(1024*8)+1770540*(1024*9)+1355049*(1024*10)+1054638*(1024*11)+830128*(1024*12)+
660217*(1024*13)+530114*(1024*14)+429404*(1024*15)+350591*(1024*16)+288626*(1024*17)+237583*(1024*18)+199
826*(1024*19)+166739*(1024*20)+140463*(1024*21)+118291*(1024*22)+100674*(1024*23)+85691*(1024*24)+73716*
(1024*25)+63303*(1024*26)+54493*(1024*27)+47219*(1024*28)+40791*(1024*29)+35587*(1024*30)+30958*(1024*31)
+27093*(1024*32)+23996*(1024*33)+20805*(1024*34)+18376*(1024*35)+16452*(1024*36)+14541*(1024*37)+12866*
(1024*38)+11324*(1024*39)+10238*(1024*40)+9123*(1024*41)+8088*(1024*42)+7312*(1024*43)+6578*(1024*44)+6026
*(1024*45)+5260*(1024*46)+4826*(1024*47)+4554*(1024*48)+4093*(1024*49)+3613*(1024*50)+3275*(1024*51)+3150
*(1024*52)+2699*(1024*53)+2603*(1024*54)+2316*(1024*55)+2160*(1024*56)+1909*(1024*57)+1801*(1024*58)+1618
*(1024*59)+1543*(1024*60)+1436*(1024*61)+1253*(1024*62)+1191*(1024*63)+1096*(1024*64)+1016*(1024*65)+972*
(1024*66)+901*(1024*67)+826*(1024*68)+706*(1024*69)+704*(1024*70)+640*(1024*71)+616*(1024*72)+535*(1024*73)
+509*(1024*74)+458*(1024*75)+467*(1024*76)+425*(1024*77)+379*(1024*78)+369*(1024*79)+313*(1024*80)+321*
(1024*81)+303*(1024*82)+260*(1024*83)+230*(1024*84)+223*(1024*85)+198*(1024*86)+213*(1024*87)+171*(1024*88)
+197*(1024*89)+173*(1024*90)+154*(1024*91)+154*(1024*92)+147*(1024*93)+127*(1024*94)+109*(1024*95)+122*
(1024*96)+89*(1024*97)+104*(1024*98)+112*(1024*99)+84*(1024*100)+82*(1024*101)+96*(1024*102)+69*(1024*103)
+72*(1024*104)+67*(1024*105)+64*(1024*106)+46*(1024*107)+64*(1024*108)+54*(1024*109)+53*(1024*110)+45*
(1024*111)+43*(1024*112)+63*(1024*113)+41*(1024*114)+38*(1024*115)+33*(1024*116)+35*(1024*117)+38*(1024*118)
+20*(1024*119)+26*(1024*120)+31*(1024*121)+18*(1024*122)+31*(1024*123)+23*(1024*124)+22*(1024*125)+17*
(1024*126)+18*(1024*127)+30*(1024*128)+16*(1024*129)+22*(1024*130)+19*(1024*131)+22*(1024*132)+9*(1024*133)
+18*(1024*134)+7*(1024*135)+10*(1024*136)+16*(1024*137)+10*(1024*138)+10*(1024*139)+4*(1024*140)+11*
(1024*141)+16*(1024*142)+12*(1024*143)+8*(1024*144)+10*(1024*145)+7*(1024*146)+6*(1024*147)+11*(1024*148)
+7*(1024*149)+6*(1024*150)+7*(1024*151)+2*(1024*152)+11*(1024*153)+8*(1024*154)+8*(1024*155)+6*(1024*156)+
7*(1024*157)+4*(1024*158)+3*(1024*159)+3*(1024*160)+4*(1024*161)+3*(1024*162)+5*(1024*163)+2*(1024*164)+2*
(1024*165)+4*(1024*166)+8*(1024*167)+4*(1024*168)+3*(1024*169)+3*(1024*170)+7*(1024*171)+2*(1024*172)+2*
(1024*173)+1*(1024*174)+1*(1024*175)+3*(1024*176)+1*(1024*177)+1*(1024*178)+2*(1024*179)+3*(1024*180)+1*
(1024*181)+0*(1024*182)+3*(1024*183)+2*(1024*184)+4*(1024*185)+2*(1024*186)+1*(1024*187)+2*(1024*188)+2*
(1024*189)+1*(1024*190)+5*(1024*191)+0*(1024*192)+1*(1024*193)+0*(1024*194)+0*(1024*195)+2*(1024*196)+0*
(1024*197)+1*(1024*198)+0*(1024*199)+0*(1024*200)+0*(1024*201)+2*(1024*202)+1*(1024*203)+3*(1024*204)+0*
(1024*205)+0*(1024*206)+0*(1024*207)+0*(1024*208)+2*(1024*209)+1*(1024*210)+0*(1024*211)+2*(1024*212)+2*
(1024*213)+0*(1024*214)+1*(1024*215)+2*(1024*216)+0*(1024*217)+0*(1024*218)+0*(1024*219)+1*(1024*220)+1*
(1024*221)+0*(1024*222)+0*(1024*223)+0*(1024*224)+1*(1024*225)+2*(1024*226)+1*(1024*227)+0*(1024*228)+0*
(1024*229)+0*(1024*230)+0*(1024*231)+0*(1024*232)+2*(1024*233)+0*(1024*234)+1*(1024*235)+0*(1024*236)+1*
(1024*237)+0*(1024*238)+0*(1024*239)+0*(1024*240)+0*(1024*241)+1*(1024*242)+0*(1024*243)+0*(1024*244)+0*
(1024*245)+1*(1024*246)+0*(1024*247)+0*(1024*248)+0*(1024*249)+0*(1024*250)+0*(1024*251)+1*(1024*252)+0*
(1024*253)+0*(1024*254)+0*(1024*255)+1*(1024*256)+3*(1024*257)+0*(1024*258)+0*(1024*259)+0*(1024*260)+0*
(1024*261)+0*(1024*262)+0*(1024*263)+0*(1024*264)+0*(1024*265)+0*(1024*266)+0*(1024*267)+0*(1024*268)+0*
(1024*269)+0*(1024*270)+1*(1024*271)+1*(1024*272)+0*(1024*273)+0*(1024*274)+0*(1024*275)+0*(1024*276)+0*
(1024*277)+1*(1024*278)
...
...
...
+0*(1024*1018)+0*(1024*1019)+0*(1024*1020)+0*(1024*1021)+0*(1024*1022)+0*(1024*1023)+0*(1024*1024)
``````

Total: 750,354,164,736 bytes ~= 698.82 GiB
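
The conversion from bytes to GiB in the total above is a division by 1024^3. A short Python check (the helper name `bytes_to_gib` is just for this example):

```python
GIB = 1024 ** 3  # bytes per GiB


def bytes_to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


# Total from the breakdown above.
print(f"{bytes_to_gib(750_354_164_736):.2f} GiB")  # prints "698.82 GiB"
```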

Step 3: Compute the approximate set size for the cluster

Repeat step 1 and step 2 for each node in the cluster, then add the individual node totals to obtain the storage per set for the whole cluster.
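
As a sketch of this step, assuming the bucket counts of each node have already been collected into a dictionary, the cluster total is the sum of the per-node over estimates. The node names and the 4-bucket counts below are hypothetical placeholders, not real output.

```python
def node_over_estimate(bucket_counts, bucket_width):
    """Over estimate in bytes for one node (bucket n weighs bucket_width * n)."""
    return sum(count * bucket_width * n
               for n, count in enumerate(bucket_counts, start=1))


def cluster_over_estimate(buckets_by_node, bucket_width):
    """Sum the per-node over estimates across the cluster."""
    return sum(node_over_estimate(counts, bucket_width)
               for counts in buckets_by_node.values())


# Hypothetical per-node histograms (4 buckets each, for illustration).
buckets_by_node = {
    "node1": [5, 3, 0, 1],
    "node2": [4, 4, 1, 0],
}
cluster_total = cluster_over_estimate(buckets_by_node, bucket_width=1024)
print(cluster_total)
```

Because each node's histogram counts both the master and the replica records it holds, no further multiplication by the replication factor is applied.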

Step 4: Under estimation of the set size

Repeat the process, but shift the bucket multiplier down by 1 so that each record is counted at the lower bound of its bucket (the first term is therefore always 0):

``````  "num_records_in_bucket_1"(281537970) * (1024*0)
+ "num_records_in_bucket_2"(56726976)  * (1024*1)
+ "num_records_in_bucket_3"(21515544)  * (1024*2)
...
``````

Storage in Memory - Method for data-in-memory true

Memory used by a set can be calculated by summing up the following 3 areas of usage:

• Primary Index - each index entry uses 64 bytes in memory as defined in our Capacity Planning document.
• Data Usage - memory used for storing the record values (this applies when the namespace is configured with data-in-memory true or storage-engine memory). This depends on the size of the records in the set.
• Secondary Index - refer to the Secondary index Capacity planning documentation as well as the memory_used_sindex_bytes statistic.
• Note: this statistic covers the entire namespace, so a further calculation is needed to work out how many records of the particular set contribute to the secondary index memory footprint. The following example assumes no secondary index.

Perform the calculation for approximating the amount of memory per set on a cluster

The following example provides the steps for calculating the memory used per set on a cluster, when specifying `data-in-memory` true. For `data-in-memory` false, calculating the primary index memory footprint and secondary index (if applicable) is sufficient.

Note: This calculation accounts for both master and replica records. Therefore, there is no need to account for the replication-factor.

Step 1: Generate the object count and memory_data_bytes

On a single node, issue the following asinfo command to list the set statistics for that node:

``````asinfo -v 'sets' -l
ns=test:set=demo:objects=4478299:tombstones=0:memory_data_bytes=0:truncate_lut=274395528189:stop-writes-count=0:set-enable-xdr=use-default:disable-eviction=false
ns=bar:set=testset:objects=99823:tombstones=0:memory_data_bytes=1397522:truncate_lut=0:stop-writes-count=0:set-enable-xdr=use-default:disable-eviction=false
``````

To get the set statistics from all nodes in the cluster, issue the following asadm command:

``````asadm -e "asinfo -v 'sets' -l"
``````

Note: For the `demo` set of the `test` namespace, the value `memory_data_bytes=0` indicates that the namespace is configured with data-in-memory false.

The number of objects can be found in the `objects` value.

In the following sample output for the set `testset` of the namespace `bar` on node1, the number of objects is 99823 and the memory used for the data is 1397522 bytes.

``````ns=bar:set=testset:objects=99823:tombstones=0:memory_data_bytes=1397522:truncate_lut=0:stop-writes-count=0:set-enable-xdr=use-default:disable-eviction=false
``````

The number of bytes of memory used to store the data can be found in the `memory_data_bytes` value.
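
The `objects` and `memory_data_bytes` values can be pulled out of such a line programmatically. The sketch below is an illustration under assumptions: the helper name `parse_sets_line` is hypothetical, and it assumes that fields are separated by `:` and that neither the namespace nor the set name contains `:` or `=`.

```python
def parse_sets_line(line):
    """Parse one line of the 'sets' info output into a dict.

    Purely numeric values are converted to int; others stay as strings.
    """
    fields = {}
    for item in line.strip().rstrip(";").split(":"):
        key, value = item.split("=", 1)
        fields[key] = int(value) if value.isdigit() else value
    return fields


# Line taken from the sample output above.
line = ("ns=bar:set=testset:objects=99823:tombstones=0:"
        "memory_data_bytes=1397522:truncate_lut=0:stop-writes-count=0:"
        "set-enable-xdr=use-default:disable-eviction=false")
stats = parse_sets_line(line)
print(stats["objects"], stats["memory_data_bytes"])
```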

Step 2: Compute the set size for the cluster

Adding the Primary Index usage (number of objects multiplied by 64 bytes) to the `memory_data_bytes` usage gives the amount of memory used by the set on that node.

``````Total: ((Primary Index) + memory_data_bytes) = Memory Used
Total: ((99823*64     ) + 1397522          ) = 7,786,194 Bytes
``````
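
The same arithmetic expressed as a small Python function. The name `set_memory_used` is chosen for this example only; the 64-byte index entry size is the figure quoted earlier in this article, and secondary indexes are assumed to be absent.

```python
PRIMARY_INDEX_ENTRY_BYTES = 64  # per-record index entry size quoted in this article


def set_memory_used(objects, memory_data_bytes):
    """Memory used by a set on one node, assuming no secondary index."""
    primary_index_bytes = objects * PRIMARY_INDEX_ENTRY_BYTES
    return primary_index_bytes + memory_data_bytes


# Values from the sample output for set 'testset' in namespace 'bar'.
print(set_memory_used(objects=99823, memory_data_bytes=1397522))  # prints 7786194
```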

Repeat step 1 and step 2 for each node in the cluster, then add the individual node totals to obtain the memory used per set for the whole cluster. On a 2-node cluster with replication factor 2, the numbers on both nodes will match, since each node holds all the data (master and replica).

Step 3: Percentage used by a set

For calculating the percentages used by a set against the current total used for a namespace and the total allocated memory for a namespace, the following values are required:

• Total namespace allocated memory:

To retrieve the configured `memory-size`, issue the following asadm command:

``````asadm -e "asinfo -v 'namespace/<namespaceName>' like memory-size"
``````

Sample Output (for a 10G configured namespace):

``````$ asadm -e "asinfo -v 'namespace/bar' like memory-size"
node1.aerospike.com:3000 (172.17.0.1) returned:
memory-size=10737418240

node2.aerospike.com:3000 (172.17.0.2) returned:
memory-size=10737418240
``````

In this sample output, for the namespace bar, the allocated memory-size is 10737418240 bytes (10 GiB).

• Total namespace used memory:

To retrieve the total used memory `memory_used_bytes`, issue the following asadm command:

``````asadm -e "show statistics for bar like memory_used_bytes"
``````

Sample Output:

``````~~~~~~~bar Namespace Statistics (2019-06-11 23:25:27 UTC)~~~~~~~
NODE             :   172.17.0.1:3000   172.17.0.2:3000
memory_used_bytes:   1572543873        1572543873
``````

In this sample output, for the namespace bar, the used memory is 1572543873 bytes (~1.46 GiB).
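
With these values in hand, the two percentages are simple ratios. The sketch below uses the per-node figures from the sample outputs in this article (set memory from step 2, `memory_used_bytes`, and `memory-size`); the helper name `percent_of` is an assumption made for the example.

```python
def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


set_memory_bytes = 7_786_194        # set 'testset' on node1, from step 2
namespace_used = 1_572_543_873      # memory_used_bytes for namespace 'bar'
namespace_allocated = 10_737_418_240  # memory-size for namespace 'bar'

pct_of_used = percent_of(set_memory_bytes, namespace_used)
pct_of_allocated = percent_of(set_memory_bytes, namespace_allocated)
print(f"{pct_of_used:.3f}% of used memory")
print(f"{pct_of_allocated:.3f}% of allocated memory")
```

Here the set accounts for roughly half a percent of the memory currently used by the namespace on that node, and well under a tenth of a percent of the memory allocated to it.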

Keywords

STORAGE SIZE SET HISTOGRAM CLUSTER MEMORY PERCENT

June 5 2019