I have a system that stores user balances in Aerospike. It runs on a 5-server cluster with version 3.15.0.2. It has been running since 2020 without any problems.
Sometime between 30 January 15:00 and 31 January 07:00 (the exact time is unknown), the value of one record suddenly changed from 9775265 to 9280. I made sure this is not an application problem, because there is no request in the logs for that change at all. My suspicion is the server shutdown at around 00:10 on 31 January, which was caused by a power change at the datacenter.
My question is: what should I do to prevent this issue from happening again in case there is another shutdown in the future?
Here is the server configuration:
# Aerospike database configuration file for use with systemd.

service {
    paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
    proto-fd-max 15000
    # transaction-pending-limit 50
    transaction-pending-limit 0
}

logging {
    file /var/log/aerospike/aerospike.log {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        # mode multicast
        # multicast-group 239.1.99.222
        # port 9918

        # To use unicast-mesh heartbeats, remove the 3 lines above, and see
        # aerospike_mesh.conf for alternative.
        mode mesh
        address 192.168.0.221
        port 3002 # Heartbeat port for this node.

        # List one or more other nodes, one ip-address & port per line:
        mesh-seed-address-port 192.168.0.221 3002
        mesh-seed-address-port 192.168.0.222 3002
        mesh-seed-address-port 192.168.0.223 3002
        mesh-seed-address-port 192.168.0.224 3002
        mesh-seed-address-port 192.168.0.232 3002
        # mesh-seed-address-port 10.10.10.13 3002
        # mesh-seed-address-port 10.10.10.14 3002

        interval 150
        timeout 40
    }

    fabric {
        port 3001
    }

    info {
        port 3003
    }
}

namespace test {
    replication-factor 2
    memory-size 4G
    default-ttl 30d # 30 days, use 0 to never expire/evict.

    storage-engine memory
}

namespace bar {
    replication-factor 2
    memory-size 4G
    default-ttl 30d # 30 days, use 0 to never expire/evict.

    storage-engine memory

    # To use file storage backing, comment out the line above and use the
    # following lines instead.
    # storage-engine device {
    #     file /opt/aerospike/data/bar.dat
    #     filesize 16G
    #     data-in-memory true # Store data in memory in addition to file.
    # }
}

namespace billing_gateway_ns {
    replication-factor 2
    memory-size 28G
    default-ttl 0d # 30 days, use 0 to never expire/evict.

    # storage-engine memory
    storage-engine device {
        device /dev/disk/by-id/ata-Samsung_SSD_860_EVO_1TB_S4FMNE0M801945V
        write-block-size 128K
    }
}
Thank you in advance. *My English is not that good, sorry.