I have a cluster of 3 VMs. Each VM has 500 GB of disk space for the related namespace. But in AMC I see only around 20% of the disk used, and Avail% is under 20% most of the time. What is the problem? Please help me. Best regards.
This is a symptom of defrag not running. I cannot see from your config why that would happen; in fact, your defrag is set on the more aggressive side (defrag-lwm-pct 70), where we typically recommend 50. It is a dynamic parameter, so check whether it was set to a low value (lower than 50) at some point. You can use one of the admin tools to get the running configuration of the server nodes. I am not sure which server version you are using; in older versions, asinfo -v "get-config:context=namespace;id=your_namespace_name" should get that info.
The other case where defrag will not run is if the block is still in the post-write-queue. Here too, going by your config, you have not set it to a higher number, so it is at 256 blocks, each block 1M in size, i.e. at most 256 MB out of the approximately 15 GB of data on each device. (You have 6 devices, and each device gets its own post-write-queue.) But that too is a dynamic parameter, so check it as well.
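If it helps, something along these lines should show what the nodes are actually running with, and put a value back if it turns out to have been changed at runtime (replace your_namespace_name with your actual namespace; exact parameter naming can vary slightly between server versions):
$ asinfo -v "get-config:context=namespace;id=your_namespace_name" | tr ';' '\n' | grep -E "defrag-lwm-pct|post-write-queue"
$ asinfo -v "set-config:context=namespace;id=your_namespace_name;defrag-lwm-pct=70"
The second command is only needed if the running value no longer matches what your static config specifies (70 in your case).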
Thank you for the reply. Here is the full config.
# Aerospike database configuration file for use with systemd.

service {
    proto-fd-max 15000
}

logging {
    console {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        #mode multicast
        #multicast-group 239.1.99.222
        #port 9918

        # To use unicast-mesh heartbeats, remove the 3 lines above, and see
        # aerospike_mesh.conf for alternative.
        #
        mode mesh
        port 3002 # Heartbeat port for this node.
        mesh-seed-address-port 10.90.235.220 3002
        mesh-seed-address-port 10.90.235.222 3002

        interval 150
        timeout 10
    }

    fabric {
        port 3001
    }

    info {
        port 3003
    }
}

namespace ubflight_eng {
    replication-factor 2
    memory-size 16G
    default-ttl 3h
    nsup-period 600

    storage-engine device {
        device /dev/sdb1
        device /dev/sdb2
        device /dev/sdb3
        device /dev/sdb4
        device /dev/sdc1
        device /dev/sdd1
        compression lz4
        scheduler-mode noop
        write-block-size 1M
        max-write-cache 128M
        defrag-lwm-pct 70
        data-in-memory false
    }
}

namespace ubflight_datasrv {
    replication-factor 2
    memory-size 1G
    default-ttl 1d
    nsup-period 600

    storage-engine device {
        #device /dev/sdb3
        file /data/ubflight.datasrv.dat
        filesize 5G
        compression lz4
        scheduler-mode noop
        write-block-size 512K
        max-write-cache 128M
        defrag-lwm-pct 65
        data-in-memory true
    }
}

namespace flightChn {
    replication-factor 2
    memory-size 4G
    default-ttl 1d
    nsup-period 600

    storage-engine device {
        #file /data/flight.chn.dat
        #filesize 5G
        device /dev/sde1
        compression lz4
        scheduler-mode noop
        write-block-size 2M
        max-write-cache 128M
        defrag-lwm-pct 65
        data-in-memory false # Store data in memory in addition to file.
    }
}
Aerospike version 6.1.0.1
Thanks
- Is this Enterprise Edition?
- The config file is the configuration you started with. I would like to validate that nothing was changed dynamically. Can you share the output of asadm, similar to the below:
$ asadm -e "show config like defrag"
Seed: [('127.0.0.1', 3000, None)]
Config_file: /home/training/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~~~~test Namespace Configuration (2024-04-02 17:31:37 UTC)~~~~~~~~~~
Node |ip-172-31-69-162.ec2.internal:3000
storage-engine.defrag-lwm-pct |50
storage-engine.defrag-queue-min |0
storage-engine.defrag-sleep |1000
storage-engine.defrag-startup-minimum|0
and
$ asadm -e "show config like post"
Seed: [('127.0.0.1', 3000, None)]
Config_file: /home/training/.aerospike/astools.conf, /etc/aerospike/astools.conf
~~~~~~~test Namespace Configuration (2024-04-02 17:31:48 UTC)~~~~~~~
Node |ip-172-31-69-162.ec2.internal:3000
storage-engine.post-write-queue|256
Number of rows: 2
Version Aerospike Community Edition build 6.0.0.3
Thank you so much
I'm a bit confused; can you share the output of: $ asadm -e info
You are using Community Edition. It only allows you to configure a maximum of two namespaces. That may be your issue. See Known Limitations | Aerospike Documentation.
Thank you very much. I will try removing the datasrv namespace; it looks unused, with 0/0 master/replica objects.
Hi again, which block size is best practice for flash datastore disks?
Allowed block sizes are powers of 2 (… 128KB … 256KB / 512KB / 1M / 2M / 4M / 8M max). The right choice depends on your maximum record size and the generation of your SSDs.
- Aerospike will not split your records across multiple write blocks, so if you expect records a few hundred KB in size, 1M is your choice.
- SSDs until a few years ago showed their best I/O performance at 128KB block-size transfers, so that was recommended for users whose average record was about 1.5KB in size. (This is how the default parameters of the ACT test for characterizing SSDs are set.)
- From a disk-utilization point of view, if you have 50KB records and use a 128KB block size, each block will have 28KB unutilized: 2 x 50 = 100, and the third record cannot fit.
- The larger the block size (8MB max allowed), the more intense the block I/O currently is when Aerospike moves blocks of data for defrag and other internal operations. So keeping the block size smaller helps in that respect.
Hope that helps you decide on the right block size.
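For illustration only, a minimal storage-engine sketch of how the choice is expressed (the device name is taken from your earlier config, and the values are placeholders, not a recommendation for your specific workload):
storage-engine device {
    device /dev/sdb1
    write-block-size 1M    # records are never split, so the largest record must fit in one block
    # write-block-size 128K would suit small (~1.5KB) records on older-generation SSDs
}
The only line that changes for this decision is write-block-size; the rest of the stanza stays as in your existing config.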
Thank you, sir. That was very instructive.
By the way, we solved our problem. I reduced the number of partitions on the sdb disk to 1, so now we have one partition per disk.
Nice. BTW, I noted you have configured compression. If you are using Community Edition (CE), it will be silently ignored. See this link for what is supported by CE vs SE or EE (licensed versions).
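For reference, with one partition per disk (and compression dropped, since CE ignores it), your ubflight_eng storage-engine stanza would presumably end up looking something like the sketch below; the exact device names depend on how you re-partitioned:
storage-engine device {
    device /dev/sdb1    # single partition covering the whole disk
    device /dev/sdc1
    device /dev/sdd1
    scheduler-mode noop
    write-block-size 1M
    max-write-cache 128M
    defrag-lwm-pct 50    # shown at the commonly recommended 50 rather than your original 70
    data-in-memory false
}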