Setting defrag-lwm-pct to 50 means that when a block is left only 50% utilized (because the other 50% of its records were updated and rewritten to a different block), it is combined with another such block and rewritten into one new block, freeing up the two 50%-used blocks. That is the defrag process.
Write amplification then works like this: one write-block's worth of incoming updates causes one more block's worth of extra writes due to defrag (50% of the updated records come from one block and 50% from another, and those two blocks then have to be rewritten by defrag). So each block's worth of client writes turns into two blocks' worth of writes on disk. If you set defrag-lwm-pct to 75, each block's worth of updates causes 3 additional blocks' worth of defrag writes, i.e. 4x write amplification. So a defrag-lwm-pct above 50 lets you use more of the disk, but at the cost of higher disk wear (and therefore shorter disk life) due to write amplification. 50 is therefore the recommended value for defrag-lwm-pct.
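The 2x and 4x figures above fit a simple steady-state model: if blocks are defragged when the fraction p of their data is still live, then total disk writes W per block of client writes satisfy W = 1 + p*W, so W = 1 / (1 - p). Here is a rough sketch of that arithmetic in Python. The function name is made up for illustration, and the model assumes every defragged block sits exactly at the threshold, which real workloads will only approximate.

```python
def write_amplification(defrag_lwm_pct: float) -> float:
    """Estimate steady-state write amplification for a given defrag-lwm-pct.

    Simplified model (an assumption, not Aerospike's own accounting):
    every block is defragged when exactly defrag_lwm_pct percent of it
    is still live, so W = 1 + p * W, which gives W = 1 / (1 - p).
    """
    if not 0 <= defrag_lwm_pct < 100:
        raise ValueError("defrag_lwm_pct must be in the range [0, 100)")
    live_fraction = defrag_lwm_pct / 100.0
    return 1.0 / (1.0 - live_fraction)


if __name__ == "__main__":
    for pct in (50, 75):
        print(f"defrag-lwm-pct {pct}: ~{write_amplification(pct):.0f}x writes")
```

Under this model the cost grows quickly as the threshold approaches 100, which is the reason for not going much above 50.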
The amount of fragmentation depends purely on your expiration and update pattern. If I only created records with the exact same expiration time, say 5d, at a rate of one block's worth of records per second, and never updated them, I would fill blocks with records that expire within a second of each other, and each entire block would become free with basically zero need for defrag. If instead 50% of the records were updated, half of the original block would be freed, making it a candidate for defrag.
Regarding expiration: if I filled a block with records where half were live-for-ever and the other half had a TTL of, say, 1d, I would end up with 50% of the block eligible for defrag after a day.
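To make the two TTL scenarios concrete, here is a small Python sketch that computes what percentage of a block is still live at a given age. Everything in it is hypothetical: the record sizes, the 1024-byte block, and the helper name are made up for illustration and are not Aerospike internals.

```python
from typing import List, Optional, Tuple

# (size_bytes, ttl_seconds); a ttl of None stands for a live-for-ever record.
Record = Tuple[int, Optional[float]]

DAY = 86400  # seconds


def live_pct(records: List[Record], block_size: int, age_seconds: float) -> float:
    """Percentage of the block still occupied by unexpired records."""
    live_bytes = sum(
        size for size, ttl in records if ttl is None or ttl > age_seconds
    )
    return 100.0 * live_bytes / block_size


BLOCK_SIZE = 1024  # illustrative block size, not a real write-block-size

# Half live-for-ever, half with a 1d TTL.
mixed = [(128, None)] * 4 + [(128, DAY)] * 4
# All records share the same 5d TTL.
uniform = [(128, 5 * DAY)] * 8

if __name__ == "__main__":
    print(live_pct(mixed, BLOCK_SIZE, DAY + 1))        # half the block remains
    print(live_pct(uniform, BLOCK_SIZE, 5 * DAY + 1))  # whole block is free
```

Comparing the resulting percentage with defrag-lwm-pct shows which blocks would need rewriting: the mixed block is left half full after a day, while the uniform block empties all at once and has nothing to copy.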
So how much of a block ends up needing defrag depends entirely on your read-write-update and TTL pattern.