Context about three somewhat commonly seen Aerospike errors

We are using the Node.js implementation of Aerospike.

Here are the versions we are using in production:

asd --version

Aerospike Enterprise Edition, build 3.5.8

npm ls | grep aerospike

Aerospike Node Client, version 1.0.50

===

Here are the errors we are seeing returned from our production Aerospike clusters:

Error 1

{
    code: 14,
    message: 'AEROSPIKE_ERR_RECORD_BUSY',
    func: 'as_command_parse_result',
    file: 'src/main/aerospike/as_command.c',
    line: 916,
    query: {}
}

Error 2

{
    code: 9,
    message: 'Client timeout: timeout=1000 iterations=1 failedNodes=0 failedConns=0',
    func: 'as_command_execute',
    file: 'src/main/aerospike/as_command.c',
    line: 446
}

Error 3

{
    code: -1,
    message: 'Bad file descriptor',
    func: 'as_socket_read_limit',
    file: 'src/main/aerospike/as_socket.c',
    line: 413,
    input: {
        ns: 'Graph',
        set: 'SLT',
        key: '%14426123659267653m8s1804416150636',
        digest: <Buffer f6 ea a2 18 ab 51 70 cb 83 9a 3f 02 d2 69 50 d4 9e 05 83 a3>
    }
}

Is there any way to get more context as to why these errors are happening and how we can address/fix/prevent them?

Here is our aerospike.conf:

service {
        user root
        group root
        paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
        pidfile /var/run/aerospike/asd.pid
        service-threads 4
        transaction-queues 4
        transaction-threads-per-queue 4
        proto-fd-max 15000
}
logging {
        # Log file must be an absolute path.
        file /var/log/aerospike/aerospike.log {
                context any info
        }
}
network {
        service {
                address any
                port 3000
        }
        heartbeat {
                mode mesh
                port 3002
                mesh-seed-address-port XX.XX.XXX.102 3002 # X-ed out 
                mesh-seed-address-port XX.XX.XXX.87  3002 # X-ed out
                interval 150
                timeout 10
        }
        fabric {
                port 3001
        }
        info {
                port 3003
        }
}
namespace Graph {
        replication-factor 2
        memory-size 100G
        storage-engine device {
                device /dev/sdb
                write-block-size 128k
        }
}

Here is a dump of df -k on one of our nodes:

admin_augur_io@aerospike-w9wx:~$ df -k
Filesystem                                                        1K-blocks                 Used Available Use% Mounted on
rootfs                                                             10319160              1484920   8310056  16% /
udev                                                                  10240                    0     10240   0% /dev
tmpfs                                                              10748608                  140  10748468   1% /run
/dev/disk/by-uuid/ac75b061-bf43-4298-9234-8a555ab0f9ac             10319160              1484920   8310056  16% /
tmpfs                                                                  5120                    0      5120   0% /run/lock
tmpfs                                                              21497200                    0  21497200   0% /run/shm
/dev/sdb                                               12540912287890076416 12540912286858158908         0 100% /aerospike

admin_augur_io@aerospike-w9wx:~$ cat /var/log/messages | grep sdb
Sep 20 20:50:08 aerospike-w9wx kernel: [13293360.096019] EXT4-fs (sdb): last error at time 705507539: :16843009: inode 729977170: block 282578800148737
Sep 21 20:51:55 aerospike-w9wx kernel: [13379867.616034] EXT4-fs (sdb): last error at time 705507539: :16843009: inode 729977170: block 282578800148737
Sep 22 20:53:43 aerospike-w9wx kernel: [13466375.136020] EXT4-fs (sdb): last error at time 705507539: :16843009: inode 729977170: block 282578800148737

And our other node:

admin_augur_io@aerospike-behq:~$ df -k
Filesystem                                                        1K-blocks                 Used Available Use% Mounted on
rootfs                                                             10319160              1428364   8366612  15% /
udev                                                                  10240                    0     10240   0% /dev
tmpfs                                                              10748608                  140  10748468   1% /run
/dev/disk/by-uuid/ac75b061-bf43-4298-9234-8a555ab0f9ac             10319160              1428364   8366612  15% /
tmpfs                                                                  5120                    0      5120   0% /run/lock
tmpfs                                                              21497200                    0  21497200   0% /run/shm
/dev/sdb                                               12540912287890076416 12540912286858159544         0 100% /aerospike

There was no mention of sdb in the second node's /var/log/messages.

Our Aero VMs use SSDs.

Really hoping to get some help on this.

What we have discovered so far is that the client timeouts appear to be region-specific behind our load balancer. We are trying different policy settings within the Node module to see if we can reduce those errors, and since doing so we are seeing them less and less in certain regions. They still occur in Asia, however. Should we increase the timeout value in our policies a little further?
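
For reference, here is roughly what we are experimenting with on the client side. This is only a sketch against the 1.x Node API as we understand it; the host, key, and timeout values are placeholders, not our production settings:

var aerospike = require('aerospike');

// Sketch only: per-operation read policy with a larger timeout (1.x client).
// The host and key below are placeholders.
var client = aerospike.client({
    hosts: [{ addr: '127.0.0.1', port: 3000 }]
});

client.connect(function (err) {
    if (err.code !== aerospike.status.AEROSPIKE_OK) {
        console.error('connect failed:', err);
        return;
    }

    // Same key shape as in the error dump above (ns / set / key).
    var key = { ns: 'Graph', set: 'SLT', key: 'placeholder-key' };

    // Raised from the 1000 ms default; the Asia regions may need more
    // headroom given the extra round-trip latency through the load balancer.
    var readPolicy = { timeout: 3000 };

    client.get(key, readPolicy, function (err, record, metadata) {
        if (err.code !== aerospike.status.AEROSPIKE_OK) {
            console.error('get failed:', err);
        }
    });
});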

The bad file descriptor errors may be coming from inode 729977170: block 282578800148737 on our primary node. Is there any way to confirm this?

Record busy seems to be a possible hot-key issue, so we have been investigating that as much as we can. What are the rate limits for accessing/writing to keys?
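
For reference, the kind of thing we have been doing to look for hot keys is simply counting accesses per key in the application over short windows, roughly like this (purely a sketch; the names, threshold, and window are made up for illustration):

// Sketch only: a crude in-process counter to spot hot keys on the app side.
var counts = {};
var HOT_THRESHOLD = 100; // accesses per 10-second window (arbitrary)

function trackAccess(key) {
    // Key objects look like { ns: 'Graph', set: 'SLT', key: '...' }.
    var id = key.ns + '/' + key.set + '/' + key.key;
    counts[id] = (counts[id] || 0) + 1;
}

// Every 10 seconds, report anything that looks suspiciously hot, then reset.
setInterval(function () {
    Object.keys(counts).forEach(function (id) {
        if (counts[id] > HOT_THRESHOLD) {
            console.warn('possible hot key:', id, counts[id], 'accesses in 10s');
        }
    });
    counts = {};
}, 10000);

trackAccess() would be called right before each get/put in the data layer.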

Thanks in advance

So, we were able to find some documentation and tweak some things that helped:

  • We changed our Node module policy settings, i.e., increased the timeouts for operations, reads, and writes (see the sketch after this list).

  • We changed some parameters in our config

    service {
            user root
            group root
            paxos-single-replica-limit 1 # Number of nodes where the replica count is automatically reduced to 1.
            pidfile /var/run/aerospike/asd.pid
            service-threads 16               # increased to 16
            transaction-queues 16            # increased to 16
            transaction-threads-per-queue 64 # increased to 64
            transaction-pending-limit 100    # added this
            proto-fd-max 15000
    }
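
On the Node side, the policy change amounts to something like the following. Again, only a sketch against the 1.x client as we read its docs; the host, bin, and timeout values, and the put() argument order, are our assumptions rather than a verified recipe:

var aerospike = require('aerospike');

// Sketch only: larger default timeout in the client config, plus an even
// larger per-call write policy. All values below are placeholders.
var client = aerospike.client({
    hosts: [{ addr: '127.0.0.1', port: 3000 }],
    policies: {
        timeout: 3000 // default for all operations, up from 1000 ms
    }
});

client.connect(function (err) {
    if (err.code !== aerospike.status.AEROSPIKE_OK) {
        console.error('connect failed:', err);
        return;
    }

    var key  = { ns: 'Graph', set: 'SLT', key: 'placeholder-key' };
    var bins = { example: 'value' };

    // Assumed 1.x signature: put(key, bins, metadata, policy, callback).
    client.put(key, bins, { ttl: 0 }, { timeout: 5000 }, function (err) {
        if (err.code !== aerospike.status.AEROSPIKE_OK) {
            console.error('put failed:', err);
        }
    });
});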

And so far it seems we no longer see AEROSPIKE_ERR_RECORD_BUSY or “Client timeout” errors.

We still have a ton of “Bad file descriptor” errors. Is there ANY direction we can get on this? Has ANYBODY seen and successfully addressed/solved issues with “Bad file descriptor” errors?

It is seriously the only thing we need help with at this point.

Thanks in advance.

@brandante,

If you are using the Enterprise Edition, this means that you have access to our 24x7 help desk for support. Please call the help desk; I will send you a private email in a moment with the number.

Of course, you are free to continue to use the forum for your questions regardless - just be aware that it will be slower, and that you have other options.

The issue may be with your storage.

Sep 20 20:50:08 aerospike-w9wx kernel: [13293360.096019] EXT4-fs (sdb): last error at time 705507539: :16843009: inode 729977170: block 282578800148737
Sep 21 20:51:55 aerospike-w9wx kernel: [13379867.616034] EXT4-fs (sdb): last error at time 705507539: :16843009: inode 729977170: block 282578800148737
Sep 22 20:53:43 aerospike-w9wx kernel: [13466375.136020] EXT4-fs (sdb): last error at time 705507539: :16843009: inode 729977170: block 282578800148737

It seems like /dev/sdb is an HDD and is configured with a filesystem (ext4).

We usually recommend using raw partitions and SSD disks.

Can you confirm the size of the /dev/sdb partition?

Is it greater than 2T?

If sdb is > 2T, you can create smaller raw partitions like sdb1 and sdb2, and/or try with data-in-memory true:

namespace Graph {
        replication-factor 2
        memory-size 100G
        storage-engine device {
                device /dev/sdb1
                write-block-size 128k
                data-in-memory true
        }
}

Just for context, we did use Google Cloud’s “Click to Deploy” option, which apparently formats /dev/sdb with ext4 by default; maybe when we are ready to launch future nodes/clusters, we should modify the deploy/setup scripts to make sure /dev/sdb is left as a raw partition?

That being said, here are the answers to your questions:

  • Can you confirm the size of the /dev/sdb partition? Each of our nodes is using a 1TB sdb partition.
  • Is it greater than 2T? Nope, 1TB.