Bucketing / splitting data

Regarding the calculation, the maximum record size would actually be 8 MiB (determined by the configured write-block-size). Also, remember to account for the extra overhead described in the Capacity Planning doc. That shouldn’t matter much, though, since you already know you cannot fit all entries in a single record.
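
To make the bucketing idea concrete, here is a rough sketch of how you could estimate the number of buckets and map each entry to a bucket record key. Everything in it is an assumption for illustration: the names `bucket_count` and `bucket_key`, the 10% headroom, and the hash-based assignment are mine, not something from the docs. Take the real overhead figures from the Capacity Planning doc.

```python
import hashlib
import math

# 8 MiB, the maximum record size mentioned above (set by write-block-size).
MAX_RECORD_BYTES = 8 * 1024 * 1024

# Assumption: placeholder headroom for per-record overhead. Replace it with
# figures worked out from the Capacity Planning doc.
OVERHEAD_PERCENT = 10


def bucket_count(total_entries, avg_entry_bytes):
    """Estimate how many bucket records are needed to hold all entries."""
    usable_bytes = MAX_RECORD_BYTES * (100 - OVERHEAD_PERCENT) // 100
    entries_per_bucket = max(1, usable_bytes // avg_entry_bytes)
    return math.ceil(total_entries / entries_per_bucket)


def bucket_key(base_key, entry_id, n_buckets):
    """Map an entry to one of n_buckets record keys, deterministically."""
    digest = hashlib.sha1(str(entry_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % n_buckets
    return f"{base_key}:{index}"


if __name__ == "__main__":
    n = bucket_count(total_entries=1_000_000, avg_entry_bytes=100)
    print(n, bucket_key("mydata", 12345, n))
```

Hashing only spreads entries roughly evenly, so some buckets will end up fuller than the average. It is probably worth using more buckets than the estimate suggests rather than running close to the limit.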

Regarding insertion speed, I am not sure what the baseline for the Python client is, but I would hope it is not horrible. There seems to be an example of using multiple threads in the file below, which may help: