Adding new namespace fails


#1

Earlier I had the following namespace configuration:

namespace Namespace1 {
    replication-factor 2
    memory-size 32G
    default-ttl 30d # 30 days, use 0 to never expire/evict.

    storage-engine device {
        # shouldn't mount these devices
        device /dev/sdb
        device /dev/sdc

        # ssd optimizations
        scheduler-mode noop
        write-block-size 128K
    }
}

Then I added 2 more namespaces and also edited my previous namespace:

namespace Namespace1 {
    replication-factor 2
    memory-size 20G
    default-ttl 30d
    high-water-memory-pct 80
    high-water-disk-pct 50

    storage-engine device {
        # shouldn't mount these devices
        device /dev/sdb
        device /dev/sdc

        # ssd optimizations
        scheduler-mode noop
        write-block-size 128K
    }
}

namespace Namespace2 {
    replication-factor 2
    memory-size 28G
    default-ttl 90d # use 0 to never expire/evict.
    high-water-memory-pct 60
    high-water-disk-pct 50
    stop-writes-pct 90

    storage-engine device {
        # shouldn't mount these devices
        device /dev/sdb
        device /dev/sdc

        # ssd optimizations
        scheduler-mode noop
        write-block-size 128K
    }
}

namespace Namespace3 {
    replication-factor 2
    memory-size 3G
    default-ttl 0
    high-water-memory-pct 60
    high-water-disk-pct 50
    stop-writes-pct 90

    storage-engine device {
        # shouldn't mount these devices
        device /dev/sdb
        device /dev/sdc

        # ssd optimizations
        scheduler-mode noop
        write-block-size 128K
    }
}

When I start the node, I get the following error:

Sep 16 2014 16:52:42 GMT: WARNING (drv_ssd): (drv_ssd.c::2518) read header: device /dev/sdb previous namespace Namespace1 now Namespace2, check config or erase device
Sep 16 2014 16:52:42 GMT: CRITICAL (drv_ssd): (drv_ssd.c:ssd_load_devices:3447) unable to read disk header /dev/sdb: No such file or directory

When I reverted to the old config (i.e., only Namespace1), it started working again.

My cluster has 3 nodes, all i2.2xlarge AWS instances. I tried the new config on only one node, and when I saw the failure I stopped the rollout.

Is there something wrong in the way I am adding the namespaces? AFAIK, adding the new namespace sections to the conf and restarting should do the trick.

More logs:

Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3287) load device start: device /dev/sdb
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3684) usable device size must be header size 1048576 + multiple of 1048576, rounding down
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3287) load device start: device /dev/sdc
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3292) In TID 21138: Using arena #150 for loading data for namespace "Namespace1"
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3764) Opened device /dev/sdb bytes 800164151296
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3292) In TID 21139: Using arena #150 for loading data for namespace "Namespace1"
Sep 16 2014 16:52:42 GMT: WARNING (drv_ssd): (drv_ssd.c::3631) storage: couldn't open /sys/block/sdb/queue/scheduler, did not set scheduler mode: No such file or directory
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3684) usable device size must be header size 1048576 + multiple of 1048576, rounding down
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::3764) Opened device /dev/sdc bytes 800164151296
Sep 16 2014 16:52:42 GMT: WARNING (drv_ssd): (drv_ssd.c::3631) storage: couldn't open /sys/block/sdc/queue/scheduler, did not set scheduler mode: No such file or directory
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::1013) number of wblocks in allocator: 6104768 wblock 131072
Sep 16 2014 16:52:42 GMT: INFO (drv_ssd): (drv_ssd.c::1013) number of wblocks in allocator: 6104768 wblock 131072
Sep 16 2014 16:52:42 GMT: WARNING (drv_ssd): (drv_ssd.c::2518) read header: device /dev/sdb previous namespace Namespace1 now Namespace2, check config or erase device
Sep 16 2014 16:52:42 GMT: CRITICAL (drv_ssd): (drv_ssd.c:ssd_load_devices:3447) unable to read disk header /dev/sdb: No such file or directory


#2

Looks like you are reusing the exact same SSDs and partitions for all 3 namespaces.

This will not work, as a partition should belong to only one namespace.

You could divide each disk into 3 partitions (i.e., sdb1, sdb2, sdb3 and sdc1, sdc2, sdc3) or go with separate drives per namespace.
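A sketch of what the per-partition layout might look like in aerospike.conf, assuming each disk was split three ways as suggested (partition names are illustrative; the other namespace parameters stay as in the original post):

```
namespace Namespace1 {
    # replication-factor, memory-size, etc. as before
    storage-engine device {
        device /dev/sdb1
        device /dev/sdc1
        write-block-size 128K
    }
}

namespace Namespace2 {
    # replication-factor, memory-size, etc. as before
    storage-engine device {
        device /dev/sdb2
        device /dev/sdc2
        write-block-size 128K
    }
}

namespace Namespace3 {
    # replication-factor, memory-size, etc. as before
    storage-engine device {
        device /dev/sdb3
        device /dev/sdc3
        write-block-size 128K
    }
}
```

The key point is simply that no device or partition appears under more than one namespace.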

More info on SSDs can be found at:

http://www.aerospike.com/docs/operations/plan/ssd/

–Lucien


#3

1. Devices should be unique across namespaces. You can use partitions instead if you don't have enough unique devices.
2. Also, when you are repurposing a disk from one namespace to another, you need to wipe the disk (dd if=/dev/zero of=/dev/destination/disk) before using it with the new namespace.
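Since dd is destructive, here is a sketch of that wipe exercised on a throwaway file rather than a real device (the path and size are made up for illustration; on an actual node the target is the raw block device, e.g. of=/dev/sdb, so double-check it before running):

```shell
# Stand-in "device": a 4 MiB scratch file (hypothetical path).
DEV=/tmp/fake_ssd
dd if=/dev/urandom of="$DEV" bs=1M count=4 2>/dev/null  # simulates leftover namespace data

# The wipe itself; on a real disk this would be: dd if=/dev/zero of=/dev/sdb bs=1M
dd if=/dev/zero of="$DEV" bs=1M count=4 conv=notrunc 2>/dev/null

# Confirm every byte is zero before handing the device to the new namespace.
cmp -s "$DEV" <(head -c $((4*1024*1024)) /dev/zero) && echo "wiped clean"
```

A larger bs (e.g. bs=1M instead of the default 512 bytes) makes the full-disk pass considerably faster.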

#4

Also, please note that adding or removing a namespace requires restarting the entire cluster. For production sanity, the entire cluster has to be updated and restarted together; otherwise, machines with different namespace configurations will not join the same cluster until the namespace definitions are identical across all machines of the desired cluster.


#5

What will happen if I don't wipe the disk completely using dd? Will the new dataset get corrupted? (I know startup would fail until I clear the headers, so suppose I wiped only the head of the disk.) Or will the new dataset simply overwrite the old one?

The reason I ask is that wiping the disk takes a lot of time, and when new EC2 nodes are added I am curious whether I can skip the wipe.


#6

If you don't wipe the complete disk, older remaining data can come back upon the next cold start.

On EC2, you should be pre-warming the disks (other than ephemeral disks on R, I, and H instances):

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#disk-performance

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-prewarm.html


#7

We only use ephemeral storage. So I need to wipe only if I am reusing a disk from an older namespace. A fresh install on an EC2 i2.2xlarge with only ephemeral disks should be fine without wiping, I suppose. Am I right?


#8

Hi,

Yes, if you are doing a fresh install on EC2 using i2 instances, then you don't need to wipe the disks.

For instances other than I, R, and H, AWS recommends pre-warming the ephemeral disks for performance:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#disk-performance


#9

The ability to add/remove namespaces with a rolling restart of the cluster was added starting with server version 3.13.0.1, for both Community and Enterprise editions, when running paxos-protocol version v5.

[AER-3485] - (KVS) Support adding/removing namespaces with rolling restart. http://www.aerospike.com/download/server/notes.html#3.13.0.1