Hi,
I have a strange problem writing data to an Aerospike cluster:
aql> insert into storebig.Chunks (PK,Data) values ('5cb138284d431abd6a053a56625ec088bfb88912', '1234567890')
OK, 1 record affected.
aql> select * from storebig.Chunks where PK = '5cb138284d431abd6a053a56625ec088bfb88912'
Error: (2) AEROSPIKE_ERR_RECORD_NOT_FOUND
aql> insert into storebig.Chunks (PK,Data) values ('5cb138284d431abd6a053a56625ec088bfb88912', '1234567890')
Error: (1) AEROSPIKE_ERR_SERVER
The same happens with the Go client library (of course).
It is quite possible that the cluster is not healthy, since some strange messages appear in the server logs:
May 06 2015 12:17:49 GMT: WARNING (drv_ssd): (drv_ssd.c::1236) read: read wrong key: expecting de6f0bc93bfdf560 got 8ad3dd7fce1ac7ec
May 06 2015 12:17:49 GMT: WARNING (drv_ssd): (drv_ssd.c::1236) read: read wrong key: expecting de6f0bc93bfdf560 got 8ad3dd7fce1ac7ec
May 06 2015 12:17:50 GMT: WARNING (drv_ssd): (drv_ssd.c::1230) read: bad block magic offset 29843600384
May 06 2015 12:17:50 GMT: WARNING (drv_ssd): (drv_ssd.c::1230) read: bad block magic offset 29843600384
My question is: what can I do to investigate the situation, debug it, and recover? Where should I look and what should I try?
Thank you.
With best regards, Daniel Podolsky
UPDATE
Config template (the actual config is generated from this template when the Docker container starts):
service {
    user root
    group root
    paxos-single-replica-limit 1
    pidfile /var/run/aerospike/asd.pid
    service-threads 4
    transaction-queues 4
    transaction-threads-per-queue 4
    proto-fd-max 15000
}
logging {
    file /storage/logs/aerospike.log {
        context any info
    }
    console {
        context any info
    }
}
network {
    service {
        address <%=os.getenv("NODE_EXT_ADDR")%>
        port 3000
    }
    fabric {
        address <%=os.getenv("NODE_INT_ADDR")%>
        port 3001
    }
    heartbeat {
        mode multicast
        address 239.1.99.2
        port 9918
        interface-address <%=os.getenv("NODE_INT_ADDR")%>
        interval 150
        timeout 10
    }
    info {
        address <%=os.getenv("NODE_INT_ADDR")%>
        port 3003
    }
}
namespace storebig {
    replication-factor 3
    memory-size <%=os.getenv("MEM_USE_BIG")%>K
    default-ttl 0
    high-water-disk-pct 98
    high-water-memory-pct 98
    stop-writes-pct 95
    storage-engine device {
        file /storage/data/big.dat
        filesize 3T
        data-in-memory false
    }
}
namespace storefast {
    replication-factor 3
    memory-size <%=os.getenv("MEM_USE_FAST")%>K
    default-ttl 0
    high-water-disk-pct 98
    high-water-memory-pct 98
    stop-writes-pct 95
    storage-engine device {
        file /storage/data/fast.dat
        filesize <%=os.getenv("MEM_USE_FAST")%>K
        data-in-memory true
    }
}
namespace storetest {
    replication-factor 3
    memory-size <%=os.getenv("MEM_USE_FAST")%>K
    default-ttl 0
    high-water-disk-pct 98
    high-water-memory-pct 98
    stop-writes-pct 95
    storage-engine device {
        file /storage/data/test.dat
        filesize 3T
        data-in-memory false
    }
}
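
Since the actual config is rendered from environment variables, one thing worth ruling out is an unset variable producing a broken value such as `memory-size K`. Below is a minimal sketch of such a check in Go. The helper name `missingVars` is hypothetical, and the variable list is simply the set of names referenced in the template above.

```go
package main

import (
	"fmt"
	"os"
)

// templateVars lists the environment variables referenced by the config template.
var templateVars = []string{"NODE_EXT_ADDR", "NODE_INT_ADDR", "MEM_USE_BIG", "MEM_USE_FAST"}

// missingVars returns, in input order, the names for which lookup reports
// no value or an empty value. lookup has the same signature as os.LookupEnv.
func missingVars(lookup func(string) (string, bool), names []string) []string {
	var missing []string
	for _, name := range names {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingVars(os.LookupEnv, templateVars); len(missing) > 0 {
		fmt.Fprintln(os.Stderr, "unset or empty:", missing)
		os.Exit(1)
	}
	fmt.Println("all template variables are set")
}
```

Running this inside the container before the template is rendered would fail fast instead of starting the server with a malformed config.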