Error -1: Failed to seed cluster when using aql CLI tool


#1

I can’t seem to get the aql command line tool working after I customize aerospike.conf on a fresh Ubuntu instance with Aerospike Community 3.5.

Problem: When running the aql CLI tool on Ubuntu, I see the following error:

root@test:~/aerospike-server-community-3.5.12-ubuntu12.04# aql
WARN AEROSPIKE_ERR_CLIENT Socket write error: 111 
Error -1: Failed to seed cluster

Environment:

Distributor ID:    Ubuntu
Description:    Ubuntu 14.04.2 LTS
Release:    14.04

Box is a fresh Rackspace VPS with 1GB ram, 1 GHz CPU.

Aerospike version is aerospike-server-community-3.5.12-ubuntu12.04

To Replicate:

Step 1: On a fresh Ubuntu instance, install Aerospike using the following directions, then run ‘aql’ to test that the tool works: http://www.aerospike.com/docs/operations/install/linux/ubuntu/

Output:

wget -O aerospike.tgz 'http://aerospike.com/download/server/latest/artifact/ubuntu12'
tar -xvf aerospike.tgz
cd aerospike-server-community-3.5.12-ubuntu12.04/
./asinstall
service aerospike start

service aerospike start
 * Start aerospike:  asd                                                        kernel.shmall too low, setting to 4G pages = 16TB
kernel.shmall = 4294967296
kernel.shmmax too low, setting to 1GB
kernel.shmmax = 1073741824

root@test:~/aerospike-server-community-3.5.12-ubuntu12.04# aql
Aerospike Query
Copyright 2013 Aerospike. All rights reserved.

aql>

Step 2: Edit the default /etc/aerospike/aerospike.conf and insert the following namespace configuration:

namespace default {
        replication-factor 2
        memory-size 1G
        default-ttl 0 

        storage-engine device {
                file /opt/aerospike/data/default.dat
                filesize 2T
                data-in-memory true
        }
}

Step 3: Restart aerospike and try the aql tool again (everything seems to work fine at this point)

Output:

root@test:~/aerospike-server-community-3.5.12-ubuntu12.04# service aerospike restart
 * Halt aerospike:  asd                                                  [ OK ] 
 * Start aerospike:  asd                                                 [ OK ] 
 
root@test:~/aerospike-server-community-3.5.12-ubuntu12.04# aql
Aerospike Query
Copyright 2013 Aerospike. All rights reserved.

aql> 

Step 4: Restart aerospike again and run aql right away. This time aql exits with an error.

Output:

root@test:~/aerospike-server-community-3.5.12-ubuntu12.04# service aerospike restart
 * Halt aerospike:  asd                                                  [ OK ] 
 * Start aerospike:  asd                                                 [ OK ] 
root@test:~/aerospike-server-community-3.5.12-ubuntu12.04# aql
2015-06-05 08:59:32 WARN AEROSPIKE_ERR_CLIENT Socket write error: 111
Error -1: Failed to seed cluster

Note that I’ve been able to reproduce this many times on new server instances. Any ideas?


#2

Thanks for the detailed steps. After the restart at Step 4, can you check that the server is fully up and not still in the middle of initializing? To check, look in the server log for a line like the following:

Jun 22 2014 03:35:33 GMT: INFO (as): (as.c::376) service ready: soon there will be cake!
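As a rough illustration of that log check, here is a minimal Python sketch. It assumes the timestamp layout of the sample line above; the names `last_ready_time` and `ready_since` are made up for this example, and where the log file lives on your box is something you would need to confirm yourself.

```python
from datetime import datetime

# Text taken from the sample log line above.
READY_MARKER = "service ready"


def last_ready_time(log_lines):
    """Return the timestamp of the most recent 'service ready' line, or None."""
    latest = None
    for line in log_lines:
        if READY_MARKER not in line:
            continue
        # Assumed prefix layout: "Jun 22 2014 03:35:33 GMT: INFO ..."
        stamp = line.split(" GMT:", 1)[0].strip()
        try:
            latest = datetime.strptime(stamp, "%b %d %Y %H:%M:%S")
        except ValueError:
            # Marker found, but the timestamp is not in the assumed layout.
            continue
    return latest


def ready_since(log_lines, restart_time):
    """True if a 'service ready' line was logged at or after restart_time."""
    ready = last_ready_time(log_lines)
    return ready is not None and ready >= restart_time
```

Comparing against the restart time matters because the log will usually still contain the "service ready" line from the previous start.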

The service port (3000) is opened only once the server is fully initialized.
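Since the port only opens once initialization is done, a client or deploy script can simply poll it before connecting. The sketch below is one way to do that in Python, not an official Aerospike tool; `wait_for_port` and its default timeout and interval are assumptions for illustration. Error 111 on Linux is ECONNREFUSED, which is what a connection attempt gets while nothing is listening on the port yet.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=3000, timeout=30.0, interval=0.5):
    """Poll until host:port accepts a TCP connection or the timeout expires.

    Returns True once a connection succeeds, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers connection refused (errno 111 on Linux) and timeouts.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, a script could call `wait_for_port()` right after `service aerospike restart` and only launch aql or the application once it returns True.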


#3

Ah! That was it. Thanks for the fast reply. When restarting aerospike, it says ‘stop’ ok then ‘start’ ok so I figured it was all good to go. Then I had to wait 6 or so seconds to use aql.

Now here’s the key question: if I had, say, a webapp reliant on this and I restarted Aerospike, I guess I’d have to wait for that ‘cake’ message before I could safely start my webapp, correct? If that’s the case, then devs should be warned about connection issues if they attempt to connect too soon, IMO.

Thanks,


#4

Nick, if you had a web app in production, you would be using at least a 2-node cluster (for replication and HA) and doing rolling restarts. Take one node down and do whatever you need (upgrade software, hardware, etc.) while the rest of the cluster keeps serving your application. Then bring the node back up and do the same for the next node in the cluster.


#5

Yeah this was for a small staging server but that’s a good rec for prod. Thank you for the great support.


#6

Hi Nicholas,

Thanks for the compliments! By the way, we appreciate your recent Slideshare presentation on “Analytics with Go & Aerospike”. Do continue to share!

Cheers,

Maud

Twitter: @Mnemaudsyne