Index in state 'D'


#1

I have had an index in state 'D' for a long time, and I can't delete or recreate it. What does the 'D' state mean, and how can I change it to the RW state?

aql> show indexes

+------------+--------+------------------+----------+-------+----------------------------+------------+--------------+
| ns         | bins   | set              | num_bins | state | indexname                  | sync_state | type         |
+------------+--------+------------------+----------+-------+----------------------------+------------+--------------+
| "blindpew" | "mask" | "users"          | 1        | "D"   | "index_users_mask"         | "synced"   | "INT SIGNED" |
+------------+--------+------------------+----------+-------+----------------------------+------------+--------------+

2 rows in set (0.001 secs)

OK

aql> DROP INDEX blindpew.index_users_mask;

Error: (201) AEROSPIKE_ERR_INDEX_NOT_FOUND

aql> CREATE INDEX index_users_mask ON blindpew.users (mask) NUMERIC;

Error: (200) AEROSPIKE_ERR_INDEX_FOUND
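
For anyone who wants to check this on a regular basis, here is a minimal sketch that parses the table printed by `show indexes` and lists every index whose state is not RW. It only assumes the ASCII table layout shown above; the function names `parse_show_indexes` and `indexes_not_rw` are made up for illustration, and the sample text is the row from this post retyped with plain quotes.

```python
def parse_show_indexes(text):
    """Parse the ASCII table printed by `show indexes` into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts, and status lines
        # Drop the outer pipes, split on the inner ones, strip padding and quotes.
        cells = [c.strip().strip('"') for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def indexes_not_rw(rows):
    """Return (ns, indexname, state) for every index not in the RW state."""
    return [(r["ns"], r["indexname"], r["state"])
            for r in rows if r.get("state") != "RW"]


# Sample input: the row from this thread, retyped with plain quotes.
SAMPLE = '''
+------------+--------+------------------+----------+-------+----------------------------+------------+--------------+
| ns         | bins   | set              | num_bins | state | indexname                  | sync_state | type         |
+------------+--------+------------------+----------+-------+----------------------------+------------+--------------+
| "blindpew" | "mask" | "users"          | 1        | "D"   | "index_users_mask"         | "synced"   | "INT SIGNED" |
+------------+--------+------------------+----------+-------+----------------------------+------------+--------------+
'''

if __name__ == "__main__":
    for ns, name, state in indexes_not_rw(parse_show_indexes(SAMPLE)):
        print(f"{ns}.{name} is in state {state}")
```

The parser keys each row by the header names, so it should keep working if the column order changes, but it has not been checked against other server versions.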


#2

Hi visergey,

There could be a few reasons for this behavior:

  1. You had a lot of data in the secondary index, and it is actually taking time to delete.

  2. The index is stuck in a zombie state. We have seen this scenario due to a bug.

To confirm:

  1. Can you share your Aerospike server version?
  2. What type of operations are you running on the index, and how frequently?
  3. Can you share the last 10000 lines of the Aerospike log file with us?

Thanks


#3

I think it's stuck in a zombie state, because another index on a table of the same size was already recreated weeks ago.

Build 3.3.21 (Community Edition; we have an Enterprise license, but it is not installed)

No operations have been performed for the last two months. Before that, the scenario was something like this:

  1. Issue a create index command to make sure the index exists.
  2. Aggregate records on the server using the index, which covers all records in the table (i.e. iterate over the table using the index).

The last lines of the log look like this:

Apr 27 2015 08:50:55 GMT: INFO (drv_ssd): (drv_ssd.c::2359) device /opt/aerospike/data/blindpew.2: used 47990612096, contig-free 48733M (48733 wblocks), swb-free 16, n-w 0, w-q 0 w-tot 3491410 (1.1/s), defrag-q 0 defrag-tot 3496438 (1.1/
Apr 27 2015 08:50:56 GMT: INFO (drv_ssd): (drv_ssd.c::2359) device /opt/aerospike/data/blindpew: used 47990617216, contig-free 48591M (48591 wblocks), swb-free 16, n-w 0, w-q 0 w-tot 3493139 (1.0/s), defrag-q 0 defrag-tot 3498019 (1.0/s)
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4488)  system memory: free 20566844kb ( 20 percent free )
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4495)  migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 311226947
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4503)  rec refs 311257037 ::: rec locks 1 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4509)  replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4519)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (103, 144659, 144556) : hb (0, 0, 0) : fab (16, 16, 0)
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4521)    heartbeat_received: self 0 : foreign 0
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4522)    heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 0 rc 0
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4535)    tree_counts: nsup 1 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4551) namespace blindpew: disk inuse: 95981229312 memory inuse: 57153340313 (bytes) sindex memory inuse: 4547802678 (bytes) avail pct 39
Apr 27 2015 08:50:57 GMT: INFO (info): (thr_info.c::4576)    partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::137) histogram dump: reads (3923779808 total) msec
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (00: 3921439974) (01: 0000039148) (02: 0000140757) (03: 0000421412)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (04: 0000722708) (05: 0000479885) (06: 0000247357) (07: 0000158558)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (08: 0000075537) (09: 0000035340) (10: 0000013367) (11: 0000005421)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::163)  (12: 0000000342) (13: 0000000002)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (6590996323 total) msec
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (00: 6588973665) (01: 0000141169) (02: 0000269510) (03: 0000414929)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (04: 0000649460) (05: 0000325796) (06: 0000118307) (07: 0000063988)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (08: 0000023781) (09: 0000010286) (10: 0000003777) (11: 0000001533)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::163)  (12: 0000000116) (13: 0000000006)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (6590996323 total) msec
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (00: 6588973894) (01: 0000141041) (02: 0000269516) (03: 0000414922)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (04: 0000649411) (05: 0000325750) (06: 0000118304) (07: 0000063986)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (08: 0000023781) (09: 0000010286) (10: 0000003777) (11: 0000001533)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::163)  (12: 0000000116) (13: 0000000006)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::137) histogram dump: udf (3449567272 total) msec
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (00: 3442094849) (01: 0003279377) (02: 0002105575) (03: 0002066646)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (04: 0000010682) (05: 0000004657) (06: 0000002832) (07: 0000001846)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (08: 0000000565) (09: 0000000186) (10: 0000000041) (11: 0000000015)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::163)  (12: 0000000001)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::137) histogram dump: query (6 total) msec
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::154)  (00: 0000000001) (04: 0000000001) (05: 0000000002) (07: 0000000001)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::163)  (19: 0000000001)
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (6 total) count
Apr 27 2015 08:50:57 GMT: INFO (info): (hist.c::163)  (00: 0000000005) (18: 0000000001)

The preceding lines are much the same, with no warnings or errors.


#4

Hi visergey,

Thanks for the information. This version of the server had a bug that we have fixed in later versions. We are sorry for the trouble.

To get out of this situation, you can try one of the following:

  1. Upgrade the server to the latest version. (Recommended)
  2. Restart the server. (All the zombie indices will be recreated.)

Please let us know if you need any other help with this.

Thanks


#5

I am having the same problem. I was originally on 3.5.12 (Enterprise Edition). I have since upgraded to 3.6.0 (Enterprise Edition) on completely new servers, and the problem has not gone away.

One of the boxes shows the index state as "D", while the rest still mark the index as "RW".

I am also getting "Namespace not found" errors when trying to delete the index again.
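
When nodes disagree like this, it can help to pin down exactly which node is the odd one out. Below is a rough sketch, assuming you have already collected the index state from each node (for example by running `show indexes` against each one). The node names and the `find_outlier_nodes` helper are hypothetical and not part of any Aerospike tool.

```python
from collections import Counter


def find_outlier_nodes(states_by_node):
    """Return (majority_state, {node: state}) for nodes that differ from the majority."""
    if not states_by_node:
        return None, {}
    # The most common state across nodes is treated as the expected one.
    majority, _ = Counter(states_by_node.values()).most_common(1)[0]
    outliers = {node: state for node, state in states_by_node.items()
                if state != majority}
    return majority, outliers


if __name__ == "__main__":
    # Hypothetical node names; the states mirror the situation described above.
    states = {"node-a": "RW", "node-b": "RW", "node-c": "D"}
    majority, outliers = find_outlier_nodes(states)
    print("majority state:", majority)
    for node, state in outliers.items():
        print(f"{node} reports {state}")
```

Taking the majority as the reference is only a heuristic. If most nodes are stuck, the healthy node would be the one reported as the outlier.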


#6

Hi Abramovic,

Did you run any of the following commands?

  1. sindex-repair

  2. sindex-describe

Can you also check for the following warnings in the log file?

  1. as_record_unpickle_replace: bad format

  2. unpickle record ran beyond input

  3. udf_aerospike_setbin: Internal Error

Thanks
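
A quick way to check for those three warnings is a small script like the one below. This is only a sketch: it does plain substring matching on each log line, and the log path in the usage note is an assumption that you should replace with wherever your server actually writes its log.

```python
WARNING_PATTERNS = (
    "as_record_unpickle_replace: bad format",
    "unpickle record ran beyond input",
    "udf_aerospike_setbin: Internal Error",
)


def count_warnings(lines, patterns=WARNING_PATTERNS):
    """Count how many lines contain each warning string."""
    counts = {p: 0 for p in patterns}
    for line in lines:
        for p in patterns:
            if p in line:
                counts[p] += 1
    return counts


def scan_log(path):
    """Scan a log file line by line, so large logs are not loaded into memory."""
    with open(path, errors="replace") as fh:
        return count_warnings(fh)
```

Usage would look something like `print(scan_log("/var/log/aerospike/aerospike.log"))`, where the path is an assumed location. A count of zero for all three strings means none of these warnings appear in that file.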