Multi-record groups - an alternative to (now deprecated) LDTs



Currently, an Aerospike record stands alone:

  1. Non-stream UDFs can modify only one record at a time, and record operations work on only one record.
  2. To store related data across multiple records (e.g. buckets), (i) multiple client calls are usually needed to identify the buckets and then fetch each one, and (ii) the calls taken together are not atomic.

Suggestion: The Multi-Record Group

  1. Composed of (i) one parent and (ii) zero or more children, all living on the same node
  2. During a write or UDF call, the parent and all children are accessible together
  3. If a parent is identified by a PK, then children could be identified by a PK plus an INDEX.
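To make the shape of the proposal concrete, here is a minimal in-memory sketch in Python. It is not Aerospike code and does not use any Aerospike API: `RecordGroup`, `child`, `apply` and `add_event` are hypothetical names made up for illustration, and `apply` only stands in for "one write or UDF call that sees the parent and all children together".

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class RecordGroup:
    """Hypothetical model of one multi-record group (parent plus children)."""
    pk: str
    parent: Dict[str, Any] = field(default_factory=dict)
    # A child is addressed by (pk, index); the group already knows the pk,
    # so only the index is used as the dictionary key here.
    children: Dict[int, Dict[str, Any]] = field(default_factory=dict)

    def child(self, index: int) -> Dict[str, Any]:
        """Return the child at `index`, creating it on first access."""
        return self.children.setdefault(index, {})

    def apply(self, fn: Callable[["RecordGroup"], Any]) -> Any:
        """Stand-in for a single call that can touch parent and children."""
        return fn(self)


def add_event(group: RecordGroup) -> int:
    # Update the parent and exactly one child within the same call.
    n = group.parent.get("count", 0)
    group.child(n % 4)["last_event"] = n
    group.parent["count"] = n + 1
    return n + 1


group = RecordGroup(pk="user:42")
group.apply(add_event)
group.apply(add_event)
```

In a real implementation the server would have to guarantee that everything done inside the call is applied atomically on the node holding the group; the sketch only shows the addressing and access pattern.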

Benefits

  1. Allows efficient record updates: only the affected child is rewritten, rather than all of the data.
  2. Is a good alternative to LDTs for exceeding a record’s max size.
  3. Allows bucketing and data-reorganization optimizations to be implemented so that updates and gets are done in a single, atomic call.

Use Cases

  1. Ever-growing, mostly-append-or-get lists. Inserts go into a child until it reaches a max size threshold; subsequent inserts go to the next child.
  2. Circular, read/write balanced lists. Use child records as nodes in a circular fashion.
  3. Balanced read/write maps. Shard entries by key and insert into appropriate child record(s).
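Use cases 1 and 3 could look roughly like the following sketch, again as plain in-memory Python rather than anything Aerospike provides today. All names are hypothetical (`MAX_CHILD_ITEMS`, `append_item`, `get_item`, `shard_index`), the threshold is counted in items instead of bytes to keep the example small, and CRC32 is just one possible stable hash for picking a child.

```python
import zlib
from typing import Any, Dict, List

MAX_CHILD_ITEMS = 3  # hypothetical per-child threshold, kept tiny for the demo


def append_item(parent: Dict[str, Any],
                children: Dict[int, List[Any]],
                item: Any) -> int:
    """Append to the tail child, rolling over to the next child when full.

    Returns the index of the child that received the item.
    """
    tail = parent.get("tail", 0)
    bucket = children.setdefault(tail, [])
    if len(bucket) >= MAX_CHILD_ITEMS:
        tail += 1
        bucket = children.setdefault(tail, [])
    bucket.append(item)
    # The parent only tracks bookkeeping, so it stays small.
    parent["tail"] = tail
    parent["count"] = parent.get("count", 0) + 1
    return tail


def get_item(children: Dict[int, List[Any]], position: int) -> Any:
    """Read by list position; every child before the tail is full."""
    index, offset = divmod(position, MAX_CHILD_ITEMS)
    return children[index][offset]


def shard_index(key: str, num_children: int) -> int:
    """Pick the child for a map entry using a stable hash of its key."""
    return zlib.crc32(key.encode("utf-8")) % num_children


parent: Dict[str, Any] = {}
children: Dict[int, List[Any]] = {}
for n in range(7):
    append_item(parent, children, n * 10)
```

With the group proposal, each `append_item` would be one atomic call that rewrites only the tail child and the small parent, instead of rewriting one large record or issuing several client calls.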