Defrag_q and writes per second

Our test cluster has 3 nodes with SSDs. So far the performance seems good, but it probably needs a bit of tuning. On average we do 40k w/s (prepending to lists and running removeRange operations), which the cluster handles easily after some configuration. During peaks this number can grow much larger, so I am limiting it to ~80k w/s. Once the rate goes above ~70k w/s, defrag_q grows continuously until the peak is over. I have already tried playing around with defrag-sleep, but it didn't help.

[Graph: defrag_q vs. writes per second]

So the question is: am I reaching the limits of the current SSDs? Is there any configuration I can play around with?

Keep in mind that configuring defrag to be less aggressive can result in it not keeping up with the write load and eventually hitting stop-writes due to a lack of available ‘clean’ write blocks. Making it overly aggressive can also hurt your peak performance.
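For a sense of what “more or less aggressive” means in throughput terms, here is a rough sketch (my own approximation, not an official formula) of how defrag-sleep, the per-block sleep of a defrag thread in microseconds, caps how many write blocks defrag can process per second. It ignores the actual read/rewrite time, so it is only an upper bound.

```python
def max_defrag_blocks_per_sec(defrag_sleep_us: int) -> float:
    """Rough upper bound on write blocks a defrag thread can process per second.

    Only accounts for the configured sleep between blocks; real throughput is
    lower because each block also has to be read and its live records rewritten.
    """
    return 1_000_000 / defrag_sleep_us

print(max_defrag_blocks_per_sec(1000))  # 1000 us -> at most ~1000 blocks/s
print(max_defrag_blocks_per_sec(100))   # smaller sleep -> more aggressive defrag
```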

The primary parameter for tuning defrag is defrag-lwm-pct. By default this is 50%, which causes a 2x write amplification. The write amplification caused by this parameter grows non-linearly as you raise it.
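To illustrate the non-linear growth, here is a small sketch of the approximation WA ≈ 1 / (1 − defrag-lwm-pct/100), which is consistent with the 2x figure quoted above (and the 4x figure for 75% discussed below); treat it as an estimate, not an exact server formula.

```python
def defrag_write_amplification(defrag_lwm_pct: float) -> float:
    """Approximate total write amplification for a given defrag-lwm-pct.

    Approximation: each block of client writes causes roughly
    lwm / (100 - lwm) extra block writes from defrag, so the total
    is 1 / (1 - lwm/100).
    """
    return 1.0 / (1.0 - defrag_lwm_pct / 100.0)

for pct in (50, 60, 75, 90):
    print(f"defrag-lwm-pct {pct}: ~{defrag_write_amplification(pct):.1f}x writes")
# defrag-lwm-pct 50: ~2.0x writes
# defrag-lwm-pct 60: ~2.5x writes
# defrag-lwm-pct 75: ~4.0x writes
# defrag-lwm-pct 90: ~10.0x writes
```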

More information about defrag can be found here: Defragmentation

What do you mean by write amplification?

By write-amplification, I am referring to the additional writes required by the defrag process.


Got it, the 2x part confuses me a bit. You don’t mean I get twice the amount of actual writes, right? Is it even possible to estimate the amount of fragmentation just by looking at defrag-lwm-pct?

defrag-lwm-pct 50 means that when a block drops to only 50% utilization (because the other 50% of its records were updated and rewritten to a different block), it will be combined with another such block and rewritten as one new block, freeing up the two 50%-used blocks. That is the defrag process.

So write amplification means that a write block’s worth of incoming updates causes one more block’s worth of extra writes due to defrag (say 50% of the updated records come from one block and 50% from another; those two blocks then have to be rewritten by defrag). Each block’s worth of client writes therefore turns into two blocks’ worth of writes on disk. If you set defrag-lwm-pct to 75, each block’s worth of updates will cause 3 additional blocks’ worth of writes due to defrag, i.e. 4x write amplification. So a defrag-lwm-pct above 50 lets you run the disk at higher utilization, but with higher disk wear (and therefore shorter disk life) due to write amplification. 50 is therefore the recommended value for defrag-lwm-pct.
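A minimal sketch of that block accounting, under the same steady-state assumption (updates spread evenly across existing blocks): per block of client writes, defrag rewrites the live lwm% remaining in the blocks it drains, which works out to lwm/(100 − lwm) extra blocks.

```python
def extra_defrag_blocks(defrag_lwm_pct: float) -> float:
    """Extra blocks written by defrag per one block of client writes.

    Reasoning (steady state): one block of client updates invalidates one
    block's worth of old copies spread across existing blocks. Defrag drains
    those blocks once they fall to lwm% utilization, rewriting the remaining
    live lwm% for every (100 - lwm)% of space it reclaims.
    """
    lwm = defrag_lwm_pct
    return lwm / (100.0 - lwm)

print(extra_defrag_blocks(50))  # 1.0 -> one extra block, 2x total writes
print(extra_defrag_blocks(75))  # 3.0 -> three extra blocks, 4x total writes
```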

The amount of fragmentation depends purely on your expiration and update patterns. If I only created records with the exact same expiration time (say 5 days), at a rate of one block’s worth of records per second, and never updated them, I would fill blocks with records that all expire within a second of each other, and the entire block would become free with essentially zero need for defrag. If instead 50% of the records were updated, half of the original block would become a candidate for defrag.

Regarding expiration: if I filled a block with records where half were live-forever and the other half had a TTL of, say, 1 day, I would end up with 50% of the block eligible for defrag after a day.

So how much of a block ends up needing defrag depends entirely on your read/write/update and TTL patterns.
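As a toy illustration of that point (hypothetical numbers, not an Aerospike API), the fraction of a write block still holding live data is what determines whether it becomes a defrag candidate:

```python
def live_fraction(updated_pct: float, expired_pct: float) -> float:
    """Fraction of the original block still holding the latest live copy."""
    return max(0.0, 1.0 - (updated_pct + expired_pct) / 100.0)

# All records share one TTL and are never updated: block empties all at once.
print(live_fraction(updated_pct=0, expired_pct=100))  # 0.0 -> freed, no defrag needed
# Half the records were updated (rewritten into newer blocks):
print(live_fraction(updated_pct=50, expired_pct=0))   # 0.5 -> eligible at defrag-lwm-pct 50
# Half live-forever, half with a 1-day TTL, checked after a day:
print(live_fraction(updated_pct=0, expired_pct=50))   # 0.5 -> eligible after a day
```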


The write amplification refers to the system at equilibrium, where the amount of data being added equals the amount being removed. At equilibrium with the default defrag configuration, for every large-block write from the write path there is expected to be one large-block write from defrag.
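To tie this back to the original question about SSD limits, a rough back-of-the-envelope check: device write bandwidth at equilibrium is roughly client write rate × average record size × write amplification. The record size below is a placeholder assumption, not a number from this thread.

```python
# Back-of-the-envelope device write bandwidth at equilibrium.
writes_per_sec = 80_000        # peak client writes/s mentioned in the thread
avg_record_bytes = 1_024       # ASSUMPTION: average stored record size
write_amplification = 2.0      # defrag-lwm-pct 50 at equilibrium

device_mb_per_sec = writes_per_sec * avg_record_bytes * write_amplification / 1e6
print(f"~{device_mb_per_sec:.0f} MB/s of device writes across the cluster")
# Replication factor multiplies this further. Compare the per-node share
# against what the SSDs can sustain; if it is close, a growing defrag_q
# during peaks is expected.
```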


@pgupta @kporter That clears a lot up, thanks. I will try playing with defrag-lwm-pct. I will also try compression to take some load off the SSDs (I suspect it should help).
