Missing records when query with secondary index

Shumin_Guo · January 9, 2017, 3:13am

I built secondary index for my data but when i try to test it using some sample data, i can’t get the correct result when querying from AQL. The result can be returned when querying with primary key. The issue might be similar to the one discussed here.

pgupta · January 9, 2017, 6:23am

Can you provide steps to reproduce your problem? Sample record, number of records, number of nodes, secondary index on what bin, query… etc. Hard to figure out otherwise.

mariaefi29 · August 10, 2020, 8:47am

Hello everyone and @pgupta

Unfortunately, we have exactly the same problem.

Some records (quite a lot) can’t be found on secondary index, however, we see them while querying on pk.

We have 4339837 in a set, 2 nodes, secondary index on one bin (bin_of_secondary_index in an example). All indexes (on each node) have RW state Example (I changed the names of bins and set not to show our info):

aql> select * from cache.out_set where pk='test_pk';

[
    [
        {
          "id": "id",
          "bin_of_secondary_index": "test_secondary",
          "extra_bin": []
        }
    ],
    [
        {
          "Status": 0
        }
    ]
]

aql> select * from cache.out_set where bin_of_secondary_index='test_secondary';

[
    [
    ],
    [
        {
          "Status": 0
        }
    ]
]

Could you please figure out what can be the problem? Do you need more information?

asinfo -v 'statistics'

cluster_size=2;cluster_key=CBD50A8E5288;cluster_integrity=true;cluster_is_member=true;uptime=4746692;system_free_mem_pct=93;system_swapping=false;heap_allocated_kbytes=1584282;heap_active_kbytes=1613020;heap_mapped_kbytes=1950208;heap_efficiency_pct=81;objects=4340030;sub_objects=0;tombstones=0;tsvc_queue=2;info_queue=0;delete_queue=0;rw_in_progress=0;proxy_in_progress=0;tree_gc_queue=0;client_connections=227;heartbeat_connections=1;fabric_connections=24;heartbeat_received_self=0;heartbeat_received_foreign=31788534;reaped_fds=11380;info_complete=20016613;proxy_retry=0;demarshal_error=0;early_tsvc_client_error=1;early_tsvc_batch_sub_error=0;early_tsvc_udf_sub_error=0;batch_index_initiate=2857351881;batch_index_queue=0:0,0:0,0:0,0:0,0:0,0:0,0:0,0:0,1:1,1:0,0:0,0:0;batch_index_complete=2857351879;batch_index_error=0;batch_index_timeout=0;batch_index_unused_buffers=179;batch_index_huge_buffers=0;batch_index_created_buffers=180;batch_index_destroyed_buffers=0;batch_initiate=0;batch_queue=0;batch_error=0;batch_timeout=0;scans_active=0;query_short_running=0;query_long_running=0;sindex_ucgarbage_found=0;sindex_gc_locktimedout=356;sindex_gc_inactivity_dur=0;sindex_gc_activity_dur=0;sindex_gc_list_creation_time=817239784;sindex_gc_list_deletion_time=0;sindex_gc_objects_validated=1531562010887;sindex_gc_garbage_found=0;sindex_gc_garbage_cleaned=0;paxos_principal=BB9B733F66B1FAC;migrate_allowed=true;migrate_partitions_remaining=0;fabric_bulk_send_rate=0;fabric_bulk_recv_rate=0;fabric_ctrl_send_rate=0;fabric_ctrl_recv_rate=0;fabric_meta_send_rate=0;fabric_meta_recv_rate=0;fabric_rw_send_rate=0;fabric_rw_recv_rate=0

pgupta · August 10, 2020, 4:35pm

aql is not the way to run long running SI queries that will return a lot of records back. its an application written in C to help develop your data model - its runs an infinite loop, grabbing your next command and making a C api call. on long running queries, it cuts off the output after 1 second. (set timeout 10000 for e.g. - 10 seconds- to increase the output) because you don’t want to tie up the terminal if you happen to run a select * on a trillion records. you can increase the aql timeout setting and see more output and will find your record. you should really write your own application in one of the client languages that meets your requirement and consume records in a call back as they come.

mariaefi29 · August 11, 2020, 5:16am

Dear @pgupta,

Thank you for your reply!

Yes, of course we run everything in an application. We have an app written in Go. I needed to do some simple migrations and stumbled upon this case. I knew that I have 353 records on the particular value of the secondary index, however, after migration I got 0! That’s why I started digging deeper and used aql to get a hold of it. Through aql I realised that I have data, but I can’t access a lot of records through our secondary index. And I attached the aql script to just demonstrate the problem. I had to use different methods to migrate avoiding the use of secondary index and it went OK.

However, obviously there is a problem here. Could you please investigate why the behaviour of secondary index might be broken?

pgupta · August 11, 2020, 9:52pm

What do you mean by “migrate”? Migrate from where to where?

mariaefi29 · August 12, 2020, 2:42pm

By migrations I mean to change some data in a set. In my case I needed to add one more field in a map in each record which indicates how much related records are in another set.

So I have written my script in Go and it used querying upon the secondary index and the result was zero. However, I know for sure that the data is there, cause I was able to access it through PK in aql and in my script as well.

What strikes is that not all results were zeros. Some were ok! That zero helped me to notice that there is a real problem here with the index.

pgupta · August 12, 2020, 6:07pm

The aql test you listed earlier … can you repeat that with one more command at the end:

aql> select * from cache.out_set where pk=‘test_pk’;

aql> select * from cache.out_set where bin_of_secondary_index=‘test_secondary’;

aql> show indexes

mariaefi29 · August 12, 2020, 6:36pm

oh, we dropped the index already, because it turned out we don’t need it and cause it couldn’t be used in the migrations. I apologise! However, when I was debugging, show indexes command showed me the normal indexes on each node! With RW status

Topic		Replies	Views
Secondary index missing records after migration (add/remove node) Query & Indexing secondary , index	3	1415	March 24, 2017
Not getting PK when making a query using Secondary Index Operations	4	1148	August 2, 2020
Secondary index missing records, can it be repaired? Configuration secondary , index	9	4086	February 26, 2023
AQL: Query using secondary index AQL secondary , index	1	2053	December 22, 2014
Corrupted Index Query & Indexing	0	1927	February 16, 2016

Missing records when query with secondary index

Related topics