Slow map operations performance

From the docs https://www.aerospike.com/docs/guide/cdt-map-performance.html

Why are map operations so slow? For example, get_by_key is O(N)/O(logN), and the same holds for put.

Note these are worst case numbers, and per the legend N is the number of elements in the map. A scan search for a key in an unordered list is an O(N) operation. An indexed search in an ordered list is O(logN). Because map keys are unique, a put needs to check whether the key is already in the map, hence O(N) or O(logN) for put as well.
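To make those worst-case numbers concrete, here is a minimal Java sketch (plain stdlib, not Aerospike code) contrasting the two representations: a linear scan over an unordered list of key-value pairs versus a binary search over a key-ordered list.

```java
import java.util.*;

public class MapSearchDemo {
    // Unordered representation: a list of key-value pairs.
    // Finding a key means scanning the entries - O(N) in the worst case.
    static Integer scanGet(List<Map.Entry<String, Integer>> entries, String key) {
        for (Map.Entry<String, Integer> e : entries) {
            if (e.getKey().equals(key)) return e.getValue();
        }
        return null;
    }

    // Key-ordered representation: binary search over sorted keys - O(logN).
    static int findIndex(List<String> sortedKeys, String key) {
        return Collections.binarySearch(sortedKeys, key);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> unordered = List.of(
                Map.entry("charlie", 3), Map.entry("alice", 1), Map.entry("bob", 2));
        System.out.println(scanGet(unordered, "bob"));                            // 2
        System.out.println(findIndex(List.of("alice", "bob", "charlie"), "bob")); // 1
    }
}
```

A put has to do the same lookup first (to see whether the key already exists), which is why it carries the same big-O as the search.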

We probably should do an average case analysis for the Collection Data Type operations; posting just the worst case is too conservative. Other NoSQL databases tend to post the best case big-O analysis, which isn’t that helpful either, unless you live in ‘sky is always blue’ land.

But when we work with a Java HashMap we usually have O(1) for get/put operations, and in Redis we also have O(1) for these operations. So in your explanation of ‘map’, under the hood we have a list of tuples, and therefore we have O(logN) for an ordered ‘map’. So why is it called a map?

Aerospike calls it a Map, because that’s what it is. A Map is an abstract data type, while HashMap is one possible implementation of it. What distinguishes a Map isn’t its implementation, but that it is a collection of key-value pairs, where the key appears at most once, and that provides a minimal kind of functionality, such as getting a Map value by its key.
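In Java terms, HashMap and TreeMap are two implementations of the same abstract Map: the contract is identical even though the underlying algorithm (and therefore the big-O) differs. A small illustration:

```java
import java.util.*;

public class MapAbstraction {
    // The same function works against any Map implementation -
    // the abstract type defines the contract, not the algorithm.
    static int getOrZero(Map<String, Integer> m, String key) {
        return m.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> hashed = new HashMap<>();  // hash table: O(1) average get/put
        Map<String, Integer> ordered = new TreeMap<>(); // red-black tree: O(logN) get/put
        hashed.put("a", 1);
        ordered.put("a", 1);
        // Identical behavior through the Map interface.
        System.out.println(getOrZero(hashed, "a") == getOrZero(ordered, "a")); // true
    }
}
```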

You’re talking about big-O as if it’s a concrete thing, like a size (KiB) or a length of time (µs), but that’s not what it is. It’s a way to classify an algorithm by its runtime or space requirements: as you add elements to the Map, how does the runtime of certain functions change? The Java HashMap’s O(1) is not the same as the Redis Hash’s O(1), because O(1) is not a concrete thing. If we really want to be particular, the Redis Hash has a best case runtime of O(1), a worst case runtime of O(n), and an average case that amortizes to O(1) - yet none of this has anything to do with ‘slow’ or ‘fast’. O(1) !== 10µs.

Since you care about slow versus fast, and brought up big-O, let’s talk about response time and storage in something real, like a database. When you issue a command from a database client to a database server over a network you have

  • The network time from the client to the server
  • The time it takes to pick the command off the network card
  • The time it takes to fetch the object from storage or locate it in memory
  • The time it takes to perform the operation on the object - only here does your big-O point have a bearing
  • The time it takes to send the result back to the client over the network

In the non-abstract world, the amount of time spent on the operation is orders of magnitude smaller than the rest of the steps. This is typically in the nanoseconds to very-low-microseconds scale, while the rest adds up to a length of time in the hundreds or thousands of microseconds. This is why the impact of that algorithm analysis on the speed of a single call is negligible. In itself it has little to do with ‘slow’.
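A back-of-the-envelope sketch of that latency budget, with made-up but realistic-order numbers (all values here are illustrative assumptions, not measurements of any particular deployment):

```java
public class LatencyBudget {
    // Sum the end-to-end steps of one client command, in microseconds.
    static double totalMicros(double networkRtt, double stackOverhead,
                              double storageFetch, double operation) {
        return networkRtt + stackOverhead + storageFetch + operation;
    }

    public static void main(String[] args) {
        double networkRtt = 500.0; // client <-> server round trip
        double stack = 20.0;       // picking the command off the NIC, dispatch
        double storage = 300.0;    // SSD read of the record
        double op = 0.5;           // the Map operation itself - the only big-O part
        double total = totalMicros(networkRtt, stack, storage, op);
        // The operation is a tiny fraction of the end-to-end time.
        System.out.printf("op share of total: %.3f%%%n", 100.0 * op / total);
    }
}
```

With these numbers the operation is well under 0.1% of the total, which is why a constant-factor or even logarithmic difference in the operation step barely moves a single call’s response time.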

So why did Aerospike choose Map and not a more specific implementation like HashMap? One reason has to do with the developer API, and one has to do with storage.

Developer API

The Aerospike Map is very flexible and combines functionality that in Redis is split across Hash and SortedSet. A Hash doesn’t have functions like get-by-index or get-by-index-range, which the Aerospike Map does have.

An Aerospike Map is abstract and has three concrete representations - key-ordered, key-value-ordered, and unordered. A developer can switch between these with a single set-order operation, which isn’t something you can conveniently do in other databases.

Since most of the API works regardless of the Map order, this means that a developer can select the optimal representation for their workload (read heavy versus write heavy), while retaining almost the full range of functionality. The main thing that is different is the relative runtime cost of using a Map function when the data is ordered one way or another.

This is why the big-O performance analysis exists in the doc. It’s not there to compare an Aerospike Map to a Redis Hash; it’s there to compare the runtime of an Aerospike Map function between its order types.

If you want to compare Aerospike and Redis you should do that by running a proof-of-concept comparison, and see which one makes more sense for your use case at scale. You can’t compare their big-O and get anything meaningful with regards to speed.

Storage

The Aerospike Map and List data types are stored in a MessagePack format. This was chosen because of several features, which include its efficient use of space. Though an Aerospike namespace can store data in memory, the database is mainly used to store data on SSD. Choosing MessagePack saves on storage, and because the object is physically smaller, the latency of fetching it from storage is lower.

Another advantage is the way a Map or List are formatted in MessagePack - in a largely predictable and easy-to-seek way. This allows Aerospike’s Map code to go directly to the correct element in the Map when it’s performing an operation. When an Aerospike Map is stored on disk the database builds a key index just-in-time after the record is read from disk. Again, this operation is orders of magnitude faster than the time spent getting the record from storage.

When an Aerospike Map is in a namespace that is stored in memory, the key index (for a key-ordered or key-value-ordered Map) is kept in memory. A key-value-ordered Map will also have a value index kept in memory. This is reflected in the different big-O runtimes shown in the worst-case ‘Performance Analysis’ pages for Map and List.

Conclusion

The main thing an Aerospike developer gets from the performance analysis is a capacity planning hint about the CPU load the operations will carry, because the same function taking place over different Map orders, in-memory or on SSD, means a different number of instructions being executed. The actual time difference is negligible, dwarfed by storage retrieval and networking.

Using the performance analysis and considering your workload, you can determine whether you want an order that is more efficient for writes versus reads, or choose between two functions that give you the same results at a different runtime (CPU) cost.

I hope this ‘slow’ answer helps.


Thank you for your extended and great answer. I’ll try to test different cases and choose the better solution. By the way, in your opinion, what are the advantages of using a map instead of separate keys?

I assume you’re asking about using a Map versus separate bins. If so, you have a cap of 64K unique bin names in the namespace. For most applications that’s just fine, but for some it wouldn’t be - for example, if you generate time buckets for YYYYMMDDHHmm, you run out of names quickly if each bucket were a bin. And right now it’s kind of a painful process to reclaim unused bin names for the namespace. If you use a Map it’s just a single bin name, and the server isn’t counting distinct Map keys.
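A quick sketch of how fast minute-granularity buckets burn through that budget (the 64K figure is from the cap mentioned above; the arithmetic is just an illustration):

```java
public class BinNameBudget {
    static final int BIN_NAME_CAP = 65_535; // ~64K unique bin names per namespace

    // Minute-granularity buckets (YYYYMMDDHHmm) generated per day.
    static int bucketsPerDay() {
        return 24 * 60; // 1440
    }

    static double daysUntilCap() {
        return (double) BIN_NAME_CAP / bucketsPerDay();
    }

    public static void main(String[] args) {
        // Roughly a month and a half of minute buckets exhausts the
        // bin-name budget; as Map keys they are not counted against it.
        System.out.printf("days until cap: %.1f%n", daysUntilCap());
    }
}
```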

There’s also less bin overhead being spent. Aerospike has no schema, so it’s flexible and you can modify the objects as you want. The objects in a set don’t have to be consistent; for example, a ‘user’ set can have one record with an email bin holding a string, another record with an email bin holding a list of strings, and a third record with no email bin at all, as there’s no need for a NULL placeholder (by the way, this isn’t necessarily a great idea for your Java app :slight_smile: ). Because there’s no schema, the bin type information is stored along with its data, for each bin (see the Capacity Planning Guide). So a Map with lots of keys is cheaper in overhead than lots of bins, though MessagePack has its own overhead.

The other advantage of Map and List is being able to deeply nest elements. You can still break up a Map into bins that are also Maps or Lists or scalars… it’s really up to you. Aerospike is flexible, sometimes too flexible, but you have different ways to model the data.

I and others have posts about modeling on the Aerospike Developer Blog. If you go to the Developer Hub there is a link to the Aerospike Summit 2020 training, which includes an 80 minute workshop I did about document-oriented modeling. I think you need to sign up for the Aerospike Academy, but the developer track content in the Academy is free.

Just came to me that you might be asking from the perspective of a Redis user, and in Redis there are no bins to hold different data types in one record, so you need to use multiple keys if you have multiple data types associated with one logical object, such as different Hashes.

The advantage of using a Map or multiple Maps with one record key is that the record access is one record lock and you can do multiple operations atomically against this one key, including on completely different data types.

If you have a logical user object, and that object is complex and has integers (counters), strings, Maps, Lists, and HLLs associated with it, these can all go into one Aerospike object, which uses one record access, and can do a multi-operation (single record) transaction that is fully atomic, isolated and durable - no other reads or writes of the record happen while this transaction is happening. Unlike with a Redis MULTI, you won’t get dirty reads during the transaction, and a rollback rolls back all the operations.

Aerospike also has a strong consistency (CP) mode, so if you want you can avoid the chance of a stale read against the replica (what would be a slave in Redis-land) by using a linearized read, but that’s another topic completely.


Can you explain about Map overhead? For example, if we have a Map with 1000 elements, that means that on each map-key update we create a copy of the whole object on SSD. Therefore, for 1000 updates we will have 1000 copies of the whole object. Can we end up with a bigger defragmentation problem than if we used separate keys?

This explainer video for CRUD operations gives a good overview of the process, but defragmentation happens continuously as-is to reclaim storage, so it’s not the problem. What you should focus on is the disk IO associated with writes, and how you can optimize it.

When using a Map data type, you have the putItems method, which allows you to do multiple key-value updates in a single write operation. Similarly, you can combine multiple operations on the same record into a single transaction (using the operate command). A record transaction can perform multiple operations, update multiple bins, or elements in a Map or List with a single record access.
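As a rough stdlib analogy (plain Java, not the Aerospike client API), the batching idea behind putItems looks like this: accumulate pending key-value updates in a buffer, then apply them all to the target map in one call instead of issuing one write per key.

```java
import java.util.*;

public class BatchedUpdates {
    // Analogy for putItems: apply a whole batch of key-value updates
    // in a single call, instead of one write per key.
    static int applyBatch(Map<String, Integer> target, Map<String, Integer> batch) {
        target.putAll(batch); // one "write" covering every pending update
        return batch.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> record = new HashMap<>();
        Map<String, Integer> pending = new LinkedHashMap<>();
        pending.put("clicks", 10);
        pending.put("views", 42);
        pending.put("carts", 3);
        // Three logical updates, one physical write.
        System.out.println(applyBatch(record, pending)); // 3
    }
}
```

In the database case, each avoided write also avoids one full rewrite of the record on SSD, which is what makes the grouping worthwhile for write-heavy workloads.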

If you are updating constantly in a very write-heavy workload, you can optimize the updates to a Map by grouping multiple key-value updates into a single putItems operation. However, if you have a high read to write ratio, I wouldn’t worry about the disk IO of the updates, and definitely not defragmentation.