Cache miss in benchmarking against Redis

Please take a look at the question here…

I had posted it on GitHub, but it was suggested that I post it here.

Thanks,

Hello,

Can anyone here help with this?

Can you elaborate on what each of your messages means?

“Aerospike Connection failed”

Is it that your client is not able to connect to Aerospike at all?

“No bin with such key in LDT”

Is it that you have successfully inserted a key:value pair into the LLIST, but when you try to get it back you do not find it?

– R

Hi,

Thanks for replying.

“Aerospike Connection failed” - I am triggering this when the connection is created and the isLDT() function call fails.

“No bin with such key in LDT” means there was a cache miss.

This cache is already created.

These errors don’t occur all the time. I ran the script in JMeter with 50 threads and 600 loops (30K requests). It threw an error in 2% of cases. Redis always got a cache hit.

Items in cache - 6M

You can check my gist at Benchmark script · GitHub

It gives the main part of my benchmark script.

Hi Pushpesh,

I’m looking at your gist, and I believe it’s referring to an Aerospike class you’ve written.

$obj = new Base\Aerospike( 'test', 'mytest', 1 );
$ldt = $obj->createLDT( 'List', 'sbin' );
if( $ldt->isLDT( 'List' ) ) {

I’m not sure what Base\Aerospike is, so I assume that’s your class. We don’t have a createLDT() method, and our isLDT() does not take an argument. Can you show me the source code to your class?

In the GitHub repo, examples/ldt/llist.php has examples of how LList works. The first element you add to an LList will cause it to be created, at which point LList::isLDT() should return true unless some temporary communication error happened. If isLDT() returns false you can call LList::error() and LList::errorno() to find out what happened. Perhaps you should log those so we have more debugging information.
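A minimal sketch of that logging, relying only on the isLDT(), error() and errorno() methods of LList. The helper name checkLdtOrLog() and the default use of error_log() are illustrative assumptions, not part of the client:

```php
<?php
/**
 * Hypothetical helper: returns true if the bin is an LDT; otherwise logs
 * the last error code and message and returns false.
 * $llist is expected to be an \Aerospike\LDT\LList, or any wrapper that
 * exposes the same isLDT(), error() and errorno() methods.
 */
function checkLdtOrLog($llist, ?callable $log = null): bool
{
    if ($llist->isLDT()) {
        return true;
    }
    // Fall back to PHP's built-in error_log() when no logger is passed in.
    $log = $log ?? 'error_log';
    $log(sprintf('isLDT() failed: [%s] %s', $llist->errorno(), $llist->error()));
    return false;
}
```

Passing your own callable as the second argument lets the message go to whatever logger the benchmark script already uses.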

I hope you’re using a client release >= 3.3.10 as that has up-to-date LDT operations. We’re close to releasing 3.3.11, as well.

Hi Ronen,

$obj = new Base\Aerospike( 'test', 'mytest', 1 );

We are using namespaces, so the wrapper that I have written is in another namespace; that’s why it is Base\Aerospike. I did not want my class name (Aerospike) to clash with the default one provided by Aerospike.

createLDT is a method in my wrapper class which takes 2 params.

  1. Name of the LDT type to instantiate (List is passed here to create an LList object).
  2. The bin name.

This is all working fine. It just fails sometimes, either to detect the LDT or to find the key, even though both exist. I am not sure why.

I understand, I was just curious what the code in your class is, but if you think it’s trivial then it’s not necessary.

Please handle the unexpected false return value from isLDT() by logging the error message and code, as I suggested. Perhaps that will show what is going on. Thanks for helping debug the problem.

Hi @rbotzer

Almost all the errors have to do with “timeouts”.

Case 1

Trying to check if bin is LDT.

Error code - 9.

Error message - AEROSPIKE_ERR_TIMEOUT.

Case 2

Trying to find key in llist bin.

Error code - 9.

Error message - AEROSPIKE_ERR_TIMEOUT.

How do I handle the timeouts? What is the default value, and is there any way to increase it?

Thanks, Pushpesh

Hi Pushpesh,

You can control the timeout configuration for the PHP client through global and constructor options.

For example:

$config = ["hosts"=>[["addr"=>"12.34.56.78", "port"=>3000]]];
$opts = [Aerospike::OPT_READ_TIMEOUT => 1500, 
              Aerospike::OPT_WRITE_TIMEOUT => 1500];
$db = new Aerospike($config, true, $opts);
$key = $db->initKey('test', 'demo', 1234);
$llist = new \Aerospike\LDT\LList($db, $key, 'my_ldt_bin');

Or in your config file (let’s assume it’s /etc/php/conf.d/aerospike.ini):

aerospike.read_timeout=1500
aerospike.write_timeout=1500

The LDT classes use the apply() method which uses the OPT_WRITE_TIMEOUT policy. The classes will be enhanced to allow for easier configuration from their constructor.
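Separately from raising the timeout, an occasional timeout can be retried on the client side. The sketch below is plain PHP and not part of the Aerospike client: retryOnTimeout() and its parameters are hypothetical names, and it assumes the operation is wrapped in a callable that returns the status code, with 9 being the timeout code reported earlier in this thread.

```php
<?php
// Status code reported in this thread for AEROSPIKE_ERR_TIMEOUT.
const STATUS_TIMEOUT = 9;

/**
 * Hypothetical helper: calls $op until it returns something other than a
 * timeout, or until $maxAttempts calls have been made. The wait between
 * attempts starts at $baseDelayMs and doubles each time.
 * Returns the status code of the last attempt.
 */
function retryOnTimeout(callable $op, int $maxAttempts = 3, int $baseDelayMs = 50): int
{
    $status = STATUS_TIMEOUT;
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $status = $op();
        if ($status !== STATUS_TIMEOUT) {
            break;
        }
        // Sleep only if another attempt will follow.
        if ($attempt < $maxAttempts - 1) {
            usleep(($baseDelayMs << $attempt) * 1000);
        }
    }
    return $status;
}
```

Retrying like this is safest for reads; a write that timed out may still have been applied on the server, so retry writes only when repeating them is harmless.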

Hi Ronen,

That is helpful, thanks.

But I think the default read/write timeout should suffice. Is there any specific reason why the cache read time would exceed the default timeout (1 sec)?

Let me know.

Thanks

I’m not sure yet. Most often it is a configuration issue, or hardware related. If it is in the PHP client I would really like to get to the bottom of it. It’s not something I’ve run into on our end, yet.

Is the machine you’re running these tests on the same as the one you’ve described in the ‘input/output error for a configuration’ forum post?

Pushpesh,

What do the system vitals in top, iostat, etc. look like?

– R

Yes, they are on the same machine.

Please find attached the screenshot.

Pushpesh,

If the client and server are running on the same box, there seems to be no processing power left under the load you are running. Look at the 0% idle. This could be the reason Aerospike is not able to keep up and requests are timing out.

– R

@pushpesh4u does the problem repeat if you move JMeter off the server’s box to a separate machine?

Hi Ronen,

JMeter is running on a separate machine.

I am getting “AEROSPIKE_ERR_RECORD_BUSY” error when trying to search/read one particular value in the bin.

@pushpesh4u can you distribute your workload so that you don’t create a hot key situation, as the error indicates?

@dhaval How do I do that?

Right now, in test, I have 1 Aerospike instance and 3 clients.

So all requests are going to that 1 instance only. The data is saved on ssd.

@pushpesh4u

How close is this scenario to the production scenario?

You can have load running for multiple keys to avoid key busy error.
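One way to do that, sketched in plain PHP under stated assumptions: instead of keeping every value in a single record, pick one of several records by hashing the value, so requests for different values land on different keys. The function name shardKeyFor(), the key format and the shard count of 16 are illustrative only.

```php
<?php
/**
 * Hypothetical helper: maps a value to one of $shards record keys. The
 * same value always resolves to the same key, and different values are
 * spread across the keys.
 */
function shardKeyFor(string $baseKey, string $value, int $shards = 16): string
{
    // The first 6 hex digits of the MD5 digest give a non-negative integer
    // below 2^24; the modulo keeps the index in the range [0, $shards).
    $index = hexdec(substr(md5($value), 0, 6)) % $shards;
    return $baseKey . ':' . $index;
}
```

Writes and reads would both need to go through the same function, with the result passed to initKey() as the record key. Note that this only spreads load across different values; requests that all target one particular value still map to a single record.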

Is your storage on EBS? That can also slow down IOPS.