Node using RAW and file storage-devices does not start


Problem Description

A node configured with some namespaces on file-based storage devices and others on RAW storage devices fails on initial startup with the following messages in aerospike.log:

Nov 08 2018 07:49:00 GMT: WARNING (drv_ssd): (drv_ssd.c:3653) unable to open file /opt/namespacedata/ns_target.data: No space left on device  
Nov 08 2018 07:49:00 GMT: WARNING (drv_ssd): (drv_ssd.c:3727) {ns_target} can't initialize files  
Nov 08 2018 07:49:00 GMT: FAILED ASSERTION (storage): (storage.c:72) could not initialize storage for namespace ns_target  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:209) SIGUSR1 received, aborting Aerospike Enterprise Edition build 4.2.0.4 os el6  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: registers: rax 0000000000000000 rbx 0000000000000000 rcx 000014f1aaa8d55b rdx 000000000000000a rsi 0000000000003112 rdi 0000000000003112 rbp 00007ffca859a850 rsp 00007ffca859a328 r8 0000000000000000 r9 000014f1a983f16d r10 00007ffca8599da0 r11 0000000000000206 r12 00007ffca859a350 r13 0000000000000001 r14 0000000002629b3c r15 0000000002629a68 rip 000014f1aaa8d55b  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: found 8 frames: 0x4a8534 0x14f1aaa8d690 0x14f1aaa8d55b 0x594892 0x5210c8 0x469411 0x14f1a9814445 0x453b99 offset 0x400000  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 0: /usr/bin/asd(as_sig_handle_usr1+0x10c) [0x4a8534]  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 1: /lib64/libpthread.so.0(+0xf690) [0x14f1aaa8d690]  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 2: /lib64/libpthread.so.0(raise+0x2b) [0x14f1aaa8d55b]  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 3: /usr/bin/asd(cf_fault_event+0x216) [0x594892]  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 4: /usr/bin/asd(as_storage_init+0x8e) [0x5210c8]  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 5: /usr/bin/asd(main+0x2ef) [0x469411]  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 6: /lib64/libc.so.6(__libc_start_main+0xf5) [0x14f1a9814445]  
Nov 08 2018 07:49:00 GMT: WARNING (as): (signal.c:211) stacktrace: frame 7: /usr/bin/asd() [0x453b99]

The node does not start.

Explanation

Although the final error in the aerospike.log excerpt above looks like an out-of-space error, the sequence of events leading up to it appears earlier in the same log file. The storage-engine stanzas for each namespace (redacted for clarity) show:

Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) namespace ns_sag {
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) storage-engine device {
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) device /dev/nvme0n1p1
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) }
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) namespace ns_flap {
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) storage-engine device {
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) device /dev/nvme0n1p3
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) }
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) namespace ns_target {
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) storage-engine device {
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) file /opt/namespacedata/ns_target.data
Nov 08 2018 07:48:59 GMT: INFO (config): (cfg.c:3860) }

In this instance, the filesystem for ns_target had been created as follows:

[root@aerospike52 ~]# mkfs.ext4 /dev/nvme0n1p3  
mke2fs 1.43.5 (04-Aug-2017)

[root@aerospike52 ~]# df -h  
Filesystem Size Used Avail Use% Mounted on  
/dev/nvme0n1p3 757G 73M 718G 1% /opt/namespacedata

The filesystem that stores the data file for ns_target is on /dev/nvme0n1p3, which is also configured as a RAW device for ns_flap. This will not work.

Solution

A device can be used either as a RAW device or, with a filesystem on it, as a file device for storage-engine. It cannot be both concurrently. In this situation an out-of-space error is reported, but the device is not actually out of space: aerospike.conf is misconfigured.

The storage-engine configuration for the namespaces must be changed so that no device is referenced both as a RAW device and as the backing device of a filesystem that holds a namespace data file.
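
The check described above can be sketched as a small pre-flight script. This is only an illustration, not an Aerospike tool: the function names (parse_storage, parse_mounts, backing_device, find_conflicts) are hypothetical, the configuration parser is deliberately simplified, and it assumes the mount table is supplied in /proc/mounts format. It does not resolve symlinks or device aliases.

```python
import re


def parse_storage(conf_text):
    """Return (raw, files): dicts mapping a device or file path to its namespace.

    Simplified parser (an assumption, not the real aerospike.conf grammar):
    it only looks at 'device ...' and 'file ...' lines inside a namespace.
    """
    raw, files = {}, {}
    namespace = None
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        match = re.match(r"namespace\s+(\S+)\s*\{", line)
        if match:
            namespace = match.group(1)
            continue
        parts = line.split()
        # Skip block openers such as 'storage-engine device {'.
        if namespace is None or len(parts) < 2 or line.endswith("{"):
            continue
        if parts[0] == "device":
            for dev in parts[1:]:
                raw[dev] = namespace
        elif parts[0] == "file":
            files[parts[1]] = namespace
    return raw, files


def parse_mounts(mounts_text):
    """Map mount point -> device from text in /proc/mounts format."""
    mounts = {}
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            mounts[parts[1]] = parts[0]
    return mounts


def backing_device(path, mounts):
    """Return the device of the longest mount point that contains path."""
    best = None
    for mount_point in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    return mounts[best] if best is not None else None


def find_conflicts(conf_text, mounts_text):
    """List (device, raw_namespace, file_namespace, file_path) conflicts."""
    raw, files = parse_storage(conf_text)
    mounts = parse_mounts(mounts_text)
    conflicts = []
    for path, file_ns in files.items():
        dev = backing_device(path, mounts)
        if dev in raw:
            conflicts.append((dev, raw[dev], file_ns, path))
    return conflicts


if __name__ == "__main__":
    # Paths are assumptions; adjust them to the actual installation.
    with open("/etc/aerospike/aerospike.conf") as conf, open("/proc/mounts") as mnt:
        for dev, raw_ns, file_ns, path in find_conflicts(conf.read(), mnt.read()):
            print(f"{dev}: RAW device for {raw_ns}, but also backs {path} ({file_ns})")
```

With the configuration from this article, such a check would report /dev/nvme0n1p3 as shared between ns_flap (RAW) and ns_target (file), which is the misconfiguration that prevents startup.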

Notes

The fdisk command can be used to partition a device so that one partition holds a filesystem and another remains accessible as a RAW device.

Keywords

RAW SPACE STARTUP ERROR NODE FAIL

Timestamp

11/9/2018