Feedback wanted: i added native streams (kafka-style logs + pub/sub) to the aerospike wire protocol. is my approach sane?

madmax007 · June 1, 2026, 7:07pm

Hello all,

I have been running aerospike CE for a while and kept hitting the same wall. the moment i needed event-driven stuff on top of aerospike data i had to bolt kafka next to it. two systems, two failure domains, replication lag i could never get rid of. so i tried an experiment. what if the log and the broker just lived inside the aerospike server itself?

the result is aerostream. it’s a self-contained server module that adds kafka-style stream primitives straight to the wire protocol. durable partitioned logs, consumer groups with committed offsets, replay/seek, and ephemeral pub/sub, all over the existing port 3000.

it’s based on CE 8.1.2.1 and it works end to end today. i’ve got a node client doing produce, consume, replay and pub/sub against it. but before i take it further i’d really like a sanity check from people who know the internals, because a few parts feel like i’m reaching into the engine in ways the architecture might not intend.

how it’s wired (i kept the core diff tiny on purpose):

- 8 new as_proto message types (10-17), dispatched from service.c. everything else lives in a new as/src/modules/aerostream/. the patch to existing files is about 40 lines.
- produce: hash a partition key, assign an offset with as_faa_uint64 on an in-memory counter, then write a record (key {stream}:{partition}:{offset}) via an IOPS internal transaction (as_transaction_init_iops plus as_service_enqueue_internal).
- consume: a persistent server-push session. i spawn a dedicated thread per partition that holds the client fd open and pushes STREAM_RECORD messages. those threads read records with as_partition_reserve, as_record_get and as_storage_rd_load_bins.
- offsets: committed offsets and per-stream config are stored as normal records. on restart the offset counters get rebuilt by probing the log keys: an exponential search to bracket the tail, then a binary search for the first missing offset.
- retention: per-record TTL plus nsup, no compaction process.

where i’d most love feedback:

- reading records from non-service threads. the push loop does as_partition_reserve, as_record_get, as_storage_rd_load_bins outside the normal transaction path. is that safe around partition migrations, rebalance, and record locking? this is the part i trust the least.
- IOPS for internal writes. is as_transaction_init_iops plus as_service_enqueue_internal the right way to drive internal writes from a module, or is there a lighter or more correct path?
- holding a client fd in a dedicated push thread, outside the service/epoll pool. does that fight the connection lifecycle, or is it workable?
- claiming proto type values 10-17. any conflicts, reserved ranges, or guidance i should know about?
- multi-node reality. single node works, but i haven’t validated behavior under a real cluster with migrations, or how this should play with XDR.
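
for context on the type-value question: as i understand the standard framing, every message starts with an 8-byte as_proto header, which is 1 byte of version (2), 1 byte of type, and a 6-byte big-endian body size. the sketch below packs and parses that header in python and checks whether a type falls in the 10-17 range i claimed. the constant and function names are hypothetical, not taken from the server source.

```python
import struct

PROTO_VERSION = 2        # as_proto version byte, as i understand it
STREAM_TYPE_FIRST = 10   # first type value claimed by aerostream
STREAM_TYPE_LAST = 17    # last type value claimed by aerostream
MAX_BODY = (1 << 48) - 1  # the size field is 48 bits wide


def pack_proto_header(msg_type, body_len):
    """Build the 8-byte header: version, type, 48-bit big-endian size."""
    if not 0 <= msg_type <= 0xFF:
        raise ValueError("type must fit in one byte")
    if not 0 <= body_len <= MAX_BODY:
        raise ValueError("body length must fit in 48 bits")
    word = (PROTO_VERSION << 56) | (msg_type << 48) | body_len
    return struct.pack(">Q", word)


def unpack_proto_header(header):
    """Split an 8-byte header into (version, type, body_len)."""
    (word,) = struct.unpack(">Q", header)
    return word >> 56, (word >> 48) & 0xFF, word & MAX_BODY


def is_stream_type(msg_type):
    # True for the eight type values the module dispatches on.
    return STREAM_TYPE_FIRST <= msg_type <= STREAM_TYPE_LAST
```

a dispatcher only needs the type byte to route a message, so any collision in that single byte with an existing or future core type would be silent, which is why i'm asking about reserved ranges.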

repos:

- main project (design doc, full wire protocol spec, node client, examples): https://github.com/aerostream-aerospike/aerostream
- server fork with the module: https://github.com/aerostream-aerospike/aerospike-server

i’m not trying to upstream anything. this is more a “does this approach make sense, or am i about to get bitten by something the engine guarantees that i’m quietly violating?” kind of post. any pointers, war stories, or “don’t do that, here’s why” are very welcome.

Cheers
