Feedback wanted: i added native streams (kafka-style logs + pub/sub) to the aerospike wire protocol. is my approach sane?

Hello all,

I have been running aerospike CE for a while and kept hitting the same wall. the moment i needed event-driven stuff on top of aerospike data i had to bolt kafka next to it. two systems, two failure domains, replication lag i could never get rid of. so i tried an experiment. what if the log and the broker just lived inside the aerospike server itself?

the result is aerostream. it’s a self-contained server module that adds kafka-style stream primitives straight to the wire protocol. durable partitioned logs, consumer groups with committed offsets, replay/seek, and ephemeral pub/sub, all over the existing port 3000.

it’s based on CE 8.1.2.1 and it works end to end today. i’ve got a node client doing produce, consume, replay and pub/sub against it. but before i take it further i’d really like a sanity check from people who know the internals, because a few parts feel like i’m reaching into the engine in ways the architecture might not intend.

how it’s wired (i kept the core diff tiny on purpose):

  • 8 new as_proto message types (10-17), dispatched from service.c. everything else lives in a new directory, as/src/modules/aerostream/. the patch to existing files is about 40 lines.

  • produce: hash a partition key, assign an offset with as_faa_uint64 on an in-memory counter, then write a record (key {stream}:{partition}:{offset}) via an IOPS internal transaction (as_transaction_init_iops plus as_service_enqueue_internal).

  • consume: a persistent server-push session. i spawn a dedicated thread per partition that holds the client fd open and pushes STREAM_RECORD messages. records get read with as_partition_reserve, as_record_get, as_storage_rd_load_bins from those threads.

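  • offsets: committed offsets and per-stream config are stored as normal records. offset counters get reconstructed on restart by an exponential search followed by a binary search over the log keys (keep doubling the probed offset until a key is missing, then bisect to find the last one written).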

  • retention: per-record TTL plus nsup, no compaction process.
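
to make the produce key layout and the restart recovery concrete, here is a rough python model of the idea. it is not the module’s C code. the CRC32 hash, the function names and the exists() callback (a stand-in for “is there a record at this offset?”) are all assumptions for illustration, and the search assumes offsets from base up to the newest record are contiguous, which TTL expiry at the head of the log can break.

```python
import zlib
from typing import Callable


def partition_for(partition_key: bytes, num_partitions: int) -> int:
    # Stand-in hash (CRC32); the hash the module actually uses is not shown here.
    return zlib.crc32(partition_key) % num_partitions


def log_key(stream: str, partition: int, offset: int) -> str:
    # Key layout from the post: {stream}:{partition}:{offset}
    return f"{stream}:{partition}:{offset}"


def recover_next_offset(exists: Callable[[int], bool], base: int = 0) -> int:
    """Return the first offset >= base that has no record.

    Assumes every offset from base up to the newest record exists.
    """
    if not exists(base):
        return base

    # Exponential phase: double the step until we overshoot the end of the log.
    lo = base  # invariant: exists(lo) is True
    step = 1
    while exists(base + step):
        lo = base + step
        step *= 2
    hi = base + step  # invariant: exists(hi) is False

    # Binary phase: shrink (lo, hi) until the two are adjacent.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if exists(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

the point of the two phases is that recovery costs O(log n) existence checks per partition instead of a scan. with 37 records in a partition, the sketch probes offsets 1, 2, 4, 8, 16, 32, 64 and then bisects between 32 and 64.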

where i’d most love feedback:

  1. reading records from non-service threads. the push loop does as_partition_reserve, as_record_get, as_storage_rd_load_bins outside the normal transaction path. is that safe around partition migrations, rebalance, and record locking? this is the part i trust the least.

  2. IOPS for internal writes. is as_transaction_init_iops plus as_service_enqueue_internal the right way to drive internal writes from a module, or is there a lighter or more correct path?

  3. holding a client fd in a dedicated push thread, outside the service/epoll pool. does that fight the connection lifecycle, or is it workable?

  4. claiming proto type values 10-17. any conflicts, reserved ranges, or guidance i should know about?

  5. multi-node reality. single node works, but i haven’t validated behavior under a real cluster with migrations, or how this should play with XDR.
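
for context on question 4, this is the framing as i understand it from the client libraries: an 8-byte big-endian header with a 1-byte version (2), a 1-byte type, and a 48-bit body length. a rough python model is below. the constant name and the mapping of 10 to “produce” are hypothetical, the post only fixes the range 10-17.

```python
import struct

PROTO_VERSION = 2                 # version byte used by current clients
HYPOTHETICAL_STREAM_PRODUCE = 10  # illustrative only; one value from the 10-17 range
MAX_BODY_LEN = (1 << 48) - 1      # the length field is 48 bits wide


def pack_header(msg_type: int, body_len: int) -> bytes:
    """Build the 8-byte header: version | type | 48-bit body length."""
    if not 0 <= msg_type <= 0xFF:
        raise ValueError("type must fit in one byte")
    if not 0 <= body_len <= MAX_BODY_LEN:
        raise ValueError("body length must fit in 48 bits")
    word = (PROTO_VERSION << 56) | (msg_type << 48) | body_len
    return struct.pack(">Q", word)


def unpack_header(header: bytes) -> tuple[int, int, int]:
    """Split an 8-byte header into (version, type, body_len)."""
    (word,) = struct.unpack(">Q", header)
    return word >> 56, (word >> 48) & 0xFF, word & MAX_BODY_LEN
```

since the type is a single byte and the existing types i know of are small numbers (info = 1, as_msg = 3, compressed as_msg = 4), 10-17 looked free to me, but that is exactly the assumption i’d like checked.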

repos:

i’m not trying to upstream anything. this is more a “does this approach make sense, or am i about to get bitten by something the engine guarantees that i’m quietly violating?” kind of post. any pointers, war stories, or “don’t do that, here’s why” are very welcome.

Cheers :)