Missing indexes


#1

Hi,

Sometimes, after we add an index, it disappears after some time (a few days). The server is not rebooted during this period, and we have to add the index again.

What could it be?

The indexes are numeric. We use server v3.5.2 on CentOS 6.6. We do not see any SMD, WARNING, or other errors in the logs.


#4

Evictions should not cause an index to go missing. Can you specify the sequence of events?

e.g.

  1. Start a cluster
  2. Create index
  3. Run workload
  4. After a few days the index goes missing

Is anything missing from the above sequence? Are you running on AWS?

– R


#5

The sequence of events is right: start the cluster, create the indexes, run the workload (~30 qps of writes, plus hourly scheduled reads from the indexes); after a few days the indexes are missing.

We use our own servers (not AWS).


#6

How big is the cluster, and were there any nodes going up or down (e.g. a rolling upgrade), network events (say, the cluster splitting up and re-forming), or any similar event?

Also, how are you verifying that the index has gone missing?

– R


#7

Currently there are 4 nodes. Initially there were 3 nodes. They were upgraded from 3.4.1 to 3.5.2, and a fourth node was added immediately afterwards. A few days after these actions the problem arose. No node outages or other events occurred.

In parallel, we upgraded our second cluster in the same way. It had no problems, although the second cluster has many more namespaces and indexes and a heavier workload.

“Also, how are you verifying that the index has gone missing?”

Sorry, I do not quite understand the question. When we queried by the index from the client, no data was returned (error “Error Code 11: Query failed because cluster is empty.”). In AQL, “show indexes” showed that the indexes were absent.


#8

Regarding Error Code 11: a query returns 11 only if the client cannot talk to any of the nodes in the cluster …

Code Snippet

if (node == null) {
    nodes = cluster.getNodes();
    if (nodes.length == 0) {
        throw new AerospikeException(ResultCode.SERVER_NOT_AVAILABLE, "Query failed because cluster is empty.");
    }
}

Check whether you can see the cluster nodes from the machine where the application is running.

As for the indexes suddenly going missing, there is no such known issue. Can you check that the index has not been accidentally deleted? You would see the following kind of message in the log if that had happened:

 Mar 12 2015 09:00:34 GMT: INFO (info): (thr_info.c::6363)  Secondary index deletion called for ns:test si:ind1
 Mar 12 2015 09:00:34 GMT: INFO (info): (thr_info.c::6396) Index deletion request received for test:ind1 via SMD

Can you please check?
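
A minimal sketch for scanning a server log for these two messages, assuming the log lines have the same shape as the samples above. The class and method names (`SindexDeletionScan`, `findDeletions`) are hypothetical and not part of any Aerospike tool; the log path is whatever your deployment uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SindexDeletionScan {
    // Patterns derived only from the two sample messages quoted in this thread.
    private static final Pattern CALLED =
        Pattern.compile("Secondary index deletion called for ns:(\\S+) si:(\\S+)");
    private static final Pattern VIA_SMD =
        Pattern.compile("Index deletion request received for ([^:\\s]+):(\\S+) via SMD");

    /** Returns "namespace:index" for every line that reports an index deletion. */
    static List<String> findDeletions(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = CALLED.matcher(line);
            if (!m.find()) {
                m = VIA_SMD.matcher(line);
                if (!m.find()) {
                    continue; // not a deletion message
                }
            }
            hits.add(m.group(1) + ":" + m.group(2));
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // With no argument, fall back to the sample line from this thread.
        List<String> lines = args.length > 0
            ? Files.readAllLines(Paths.get(args[0]))
            : Arrays.asList("Mar 12 2015 09:00:34 GMT: INFO (info): (thr_info.c::6363)  "
                + "Secondary index deletion called for ns:test si:ind1");
        for (String hit : findDeletions(lines)) {
            System.out.println("index deletion logged for " + hit);
        }
    }
}
```

Run it against the log of every node, since a deletion request may have been issued on any of them.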

– R


#9

Sorry, I got confused: error code 11 does not apply to this issue. The client had no errors.

As for the indexes, we did not find any such entries in the logs. The indexes were not accidentally deleted.

Could this problem be related to sindex_module.smd being overwritten on one of the servers, for example due to errors in data migration or something else?


#10

Migration has no correlation with the index metadata. Whenever a cluster forms, the metadata state is picked based on the majority state. If the two splits are of equal size, one with the index and the other without, the index is kept intact. So this should not cause an index to go missing.

Can you publish the .smd files from all the nodes …
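
The merge rule described above can be sketched as a tiny function. This is only an illustration of the rule as stated in this post (majority wins, a tie keeps the index), not the server's actual implementation; `MetadataMerge` and `keepIndex` are hypothetical names.

```java
final class MetadataMerge {
    /**
     * Decides whether the re-formed cluster keeps the index.
     * nodesWithIndex:    size of the split whose metadata contains the index
     * nodesWithoutIndex: size of the split whose metadata lacks it
     */
    static boolean keepIndex(int nodesWithIndex, int nodesWithoutIndex) {
        if (nodesWithIndex == 0) {
            return false; // no node knows about the index at all
        }
        // Majority state wins; an equal-size split is resolved in favour of the index.
        return nodesWithIndex >= nodesWithoutIndex;
    }

    public static void main(String[] args) {
        System.out.println("3 with, 1 without -> " + keepIndex(3, 1));
        System.out.println("2 with, 2 without -> " + keepIndex(2, 2));
        System.out.println("1 with, 3 without -> " + keepIndex(1, 3));
    }
}
```

Under this rule an index is only dropped when the larger split has no record of it.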

– R


#11

Smd files: http://dropmefiles.com/MoFw5

Absolutely identical on all nodes.
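
One way to double-check that the files really are byte-identical is to compare their digests. A minimal sketch, assuming the .smd file from each node has been copied to a local path passed on the command line; `SmdCompare` and its methods are hypothetical helper names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class SmdCompare {
    /** Hex-encoded SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(content)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    /** True when every node's file content hashes to the same digest. */
    static boolean allIdentical(Map<String, byte[]> contentByNode) {
        Set<String> digests = new HashSet<>();
        for (byte[] content : contentByNode.values()) {
            digests.add(sha256Hex(content));
        }
        return digests.size() <= 1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: SmdCompare <file-from-node-1> <file-from-node-2> ...");
            return;
        }
        Map<String, byte[]> files = new LinkedHashMap<>();
        for (String path : args) {
            files.put(path, Files.readAllBytes(Paths.get(path)));
            System.out.println(sha256Hex(files.get(path)) + "  " + path);
        }
        System.out.println(allIdentical(files) ? "all files identical" : "files DIFFER");
    }
}
```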


#12

🙂 This is weird. The .smd files are good, there was no network or cluster activity, and yet the index cannot be found. Which index are you querying? E.g.:

aql > select * from dspBidsStatistic.DspBidsStatisticWithGeoFeatures where dts = 1

– R


#13

Yes. To clarify, “dts” is of type long.


#14

After a full reinstall, the error no longer occurs. This topic is no longer relevant. Thanks.