Missing records when query with secondary index

I built secondary index for my data but when i try to test it using some sample data, i can’t get the correct result when querying from AQL. The result can be returned when querying with primary key. The issue might be similar to the one discussed here.

Can you provide steps to reproduce your problem? Sample record, number of records, number of nodes, secondary index on what bin, query… etc. Hard to figure out otherwise.

Hello everyone and @pgupta

Unfortunately, we have exactly the same problem.

Some records (quite a lot) can’t be found on secondary index, however, we see them while querying on pk.

We have 4339837 in a set, 2 nodes, secondary index on one bin (bin_of_secondary_index in an example). All indexes (on each node) have RW state Example (I changed the names of bins and set not to show our info):

aql> select * from cache.out_set where pk='test_pk';

[
    [
        {
          "id": "id",
          "bin_of_secondary_index": "test_secondary",
          "extra_bin": []
        }
    ],
    [
        {
          "Status": 0
        }
    ]
]

aql> select * from cache.out_set where bin_of_secondary_index='test_secondary';

[
    [
    ],
    [
        {
          "Status": 0
        }
    ]
]

Could you please figure out what can be the problem? Do you need more information?

asinfo -v 'statistics'

cluster_size=2;cluster_key=CBD50A8E5288;cluster_integrity=true;cluster_is_member=true;uptime=4746692;system_free_mem_pct=93;system_swapping=false;heap_allocated_kbytes=1584282;heap_active_kbytes=1613020;heap_mapped_kbytes=1950208;heap_efficiency_pct=81;objects=4340030;sub_objects=0;tombstones=0;tsvc_queue=2;info_queue=0;delete_queue=0;rw_in_progress=0;proxy_in_progress=0;tree_gc_queue=0;client_connections=227;heartbeat_connections=1;fabric_connections=24;heartbeat_received_self=0;heartbeat_received_foreign=31788534;reaped_fds=11380;info_complete=20016613;proxy_retry=0;demarshal_error=0;early_tsvc_client_error=1;early_tsvc_batch_sub_error=0;early_tsvc_udf_sub_error=0;batch_index_initiate=2857351881;batch_index_queue=0:0,0:0,0:0,0:0,0:0,0:0,0:0,0:0,1:1,1:0,0:0,0:0;batch_index_complete=2857351879;batch_index_error=0;batch_index_timeout=0;batch_index_unused_buffers=179;batch_index_huge_buffers=0;batch_index_created_buffers=180;batch_index_destroyed_buffers=0;batch_initiate=0;batch_queue=0;batch_error=0;batch_timeout=0;scans_active=0;query_short_running=0;query_long_running=0;sindex_ucgarbage_found=0;sindex_gc_locktimedout=356;sindex_gc_inactivity_dur=0;sindex_gc_activity_dur=0;sindex_gc_list_creation_time=817239784;sindex_gc_list_deletion_time=0;sindex_gc_objects_validated=1531562010887;sindex_gc_garbage_found=0;sindex_gc_garbage_cleaned=0;paxos_principal=BB9B733F66B1FAC;migrate_allowed=true;migrate_partitions_remaining=0;fabric_bulk_send_rate=0;fabric_bulk_recv_rate=0;fabric_ctrl_send_rate=0;fabric_ctrl_recv_rate=0;fabric_meta_send_rate=0;fabric_meta_recv_rate=0;fabric_rw_send_rate=0;fabric_rw_recv_rate=0

aql is not the way to run long running SI queries that will return a lot of records back. its an application written in C to help develop your data model - its runs an infinite loop, grabbing your next command and making a C api call. on long running queries, it cuts off the output after 1 second. (set timeout 10000 for e.g. - 10 seconds- to increase the output) because you don’t want to tie up the terminal if you happen to run a select * on a trillion records. you can increase the aql timeout setting and see more output and will find your record. you should really write your own application in one of the client languages that meets your requirement and consume records in a call back as they come.

Dear @pgupta,

Thank you for your reply!

Yes, of course we run everything in an application. We have an app written in Go. I needed to do some simple migrations and stumbled upon this case. I knew that I have 353 records on the particular value of the secondary index, however, after migration I got 0! :frowning: That’s why I started digging deeper and used aql to get a hold of it. Through aql I realised that I have data, but I can’t access a lot of records through our secondary index. And I attached the aql script to just demonstrate the problem. I had to use different methods to migrate avoiding the use of secondary index and it went OK.

However, obviously there is a problem here. Could you please investigate why the behaviour of secondary index might be broken?

What do you mean by “migrate”? Migrate from where to where?

By migrations I mean to change some data in a set. In my case I needed to add one more field in a map in each record which indicates how much related records are in another set.

So I have written my script in Go and it used querying upon the secondary index and the result was zero. However, I know for sure that the data is there, cause I was able to access it through PK in aql and in my script as well.

What strikes is that not all results were zeros. Some were ok! That zero helped me to notice that there is a real problem here with the index.

The aql test you listed earlier … can you repeat that with one more command at the end:

aql> select * from cache.out_set where pk=‘test_pk’;

aql> select * from cache.out_set where bin_of_secondary_index=‘test_secondary’;

aql> show indexes

oh, we dropped the index already, because it turned out we don’t need it and cause it couldn’t be used in the migrations. I apologise! However, when I was debugging, show indexes command showed me the normal indexes on each node! With RW status

© 2015 Copyright Aerospike, Inc. | All rights reserved. Creators of the Aerospike Database.