I'm using Aerospike to store events from 2 Kafka topics. In production, each topic would run at roughly 100-300k TPS; however, in the dev environment I'm only sampling around 40 million events per topic, so in total I'd insert 80 million events into Aerospike.
In Aerospike, each topic's events go to a separate set, mirroring the original topics. I know that for 1 record, the size is roughly 64 B for the primary index and 850 B for data.
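For reference, here's my own back-of-envelope arithmetic for the full dev dataset, using the per-record figures above (these are estimates, not measured values):

```python
# Back-of-envelope sizing for the full dev dataset, using the
# per-record figures from the question (64 B index entry, 850 B data).
RECORDS_PER_TOPIC = 40_000_000  # ~40 million sampled events per topic
TOPICS = 2
INDEX_BYTES = 64
DATA_BYTES = 850

total_records = TOPICS * RECORDS_PER_TOPIC
index_gib = total_records * INDEX_BYTES / 2**30
data_gib = total_records * DATA_BYTES / 2**30
print(f"index: {index_gib:.1f} GiB, data: {data_gib:.1f} GiB")
# prints "index: 4.8 GiB, data: 63.3 GiB"
```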
I then set my config as follows:
service {
    proto-fd-max 40000
    cluster-name cakery
}

namespace bar {
    replication-factor 1
    # Enable nsup default expiration for the namespace.
    # The default is 0, in which case no records with ttl > 0 can be written.
    nsup-period 900

    storage-engine device {
        file /opt/aerospike/data/bar.dat
        filesize 20G
    }
}
I also commented out the default namespace 'test', so in practice I only use 'bar'.
I'm using Aerospike 8 Community Edition.
My VM has around 16 cores and 16 GB RAM, with an SSD. I believe this isn't a big deal, since this is not production and I'm not doing performance tests.
During insertion, when each set holds around 3-4 million records, I notice in 'top' that memory usage is 13 GB.
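To make the observation more precise than top's summary line, here is a minimal Linux-only sketch I used to see how much of the "used" memory is actually page cache (it reads the standard /proc/meminfo; field names should exist on any modern kernel):

```python
# Quick Linux check: how much memory is page cache vs. truly unavailable.
# Reads the standard /proc/meminfo fields (values are reported in KiB).
def meminfo_kib(field):
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

total = meminfo_kib("MemTotal")
available = meminfo_kib("MemAvailable")
cached = meminfo_kib("Cached")
print(f"cached: {cached / 2**20:.1f} GiB of {total / 2**20:.1f} GiB total, "
      f"available: {available / 2**20:.1f} GiB")
```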
My confusion and questions are:
- By specifying storage-engine device, Aerospike should automatically store the index in memory and the data on disk, right?
- Therefore, memory occupancy should be around 2 sets * (64 B * 4 million records) ≈ 512 MB, far under 13 GB.
- Stopping the Aerospike server does not release the memory; only after I DELETED the .dat file does the memory return to buff/cache.
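Spelling out the arithmetic behind that expected index footprint (my own numbers from above, not measured):

```python
# Expected primary-index footprint at the point where 'top' showed 13 GB:
# 2 sets, ~4 million records each, 64 B per primary-index entry.
sets = 2
records_per_set = 4_000_000
index_entry_bytes = 64

index_mib = sets * records_per_set * index_entry_bytes / 2**20
print(f"{index_mib:.0f} MiB")  # prints "488 MiB" -- nowhere near 13 GB
```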
Why does this happen? How did I end up using far more memory than theoretically needed? And what configuration should I use if I want the index in memory and the data on disk?