Aerospike vs HANA or other RDBMS systems


#1

Hi,

From the limited knowledge I have gathered about Aerospike, I know that it provides secondary indexes and also LDTs, which more or less serve the purpose of denormalizing the table structures of a traditional RDBMS.

Both of these features, together with the tabular, bin/record-based model, make me feel that Aerospike could actually replace an RDBMS like HANA, which markets itself as an in-memory relational database.

Am I right in assuming that Aerospike could potentially replace RDBMS systems entirely? Yes, Aerospike does not provide features like joins across many tables, which are sometimes useful for reporting purposes. But could there be a way around that too, by using UDFs?

Sorry for my long-winded question, but at my company we are trying to choose between HANA and Aerospike. HANA has another advantage: it provides a built-in Predictive Analytics Library with tight integration with R. You can write stored procedures intermingled with R code to run predictive analytics and statistical computations.

I like Aerospike a lot because it provides automatic failover, clustering, and redundancy, and it can handle a very high number of requests per second. Can HANA handle that kind of load and provide automatic clustering, automatic addition of nodes as needed, and so on?

Thanks a lot for your answers.

Regards, Samar


#2

SAP HANA is not a distributed system. High availability in HANA is based on a master/slave replication system.

No, Aerospike is not meant to replace an RDBMS. There are a lot of features in an RDBMS that are not available in Aerospike. You also need to understand the consistency guarantees Aerospike provides, given that it is an AP system. Also note that Aerospike is not purely in-memory; it is optimized for flash-based storage as well.

Can you please share in layman’s terms (with a mock object representation) what you are trying to do? Given that HANA is on your shortlist, you probably have very high performance needs. Can you elaborate on these?

– R


#3

Thanks for your reply Raj. I have some further questions/doubts.

  1. You said HANA is a master/slave system. Does this mean that the sharding logic has to be written manually?
  2. From what I understand about Aerospike, it doesn’t matter which node the data goes to. Aerospike will automatically fetch the data for you from the right node. Is that correct?
  3. Also, Aerospike automatically adds more nodes as the data size reaches a preset limit, so there is no need to add nodes manually. Is that correct?
  4. Aerospike automatically provides a replica node for all nodes, so failover is automatic. Is that correct?

Can you list the features available in HANA that are not reproducible in Aerospike? There are stored procedures in HANA and there are UDFs in Aerospike. The aggregations and secondary indexes available in Aerospike would cover most querying needs. Yes, RDBMS systems are powerful when there is a need to do complex joins across many tables, and for that the SQL language is quite succinct and powerful. So in this particular use case I do see RDBMS systems having an advantage over Aerospike, because in Aerospike we would have to write client-side processing or UDFs to do complex data analysis and querying.
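
To make that trade-off concrete, here is a minimal sketch of the kind of client-side processing that would stand in for a SQL join with a GROUP BY. It makes no Aerospike client calls: plain Python lists of dicts stand in for records already fetched from two sets, and the names `sensors`, `readings`, and `average_by_site` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical records, standing in for data already read from two sets.
sensors = [
    {"sensor_id": 1, "site": "plant-a"},
    {"sensor_id": 2, "site": "plant-b"},
]
readings = [
    {"sensor_id": 1, "value": 20.0},
    {"sensor_id": 1, "value": 22.0},
    {"sensor_id": 2, "value": 30.0},
]


def average_by_site(sensors, readings):
    """Join readings to sensors on sensor_id, then average values per site."""
    # Build a lookup table for the smaller side of the join.
    site_of = {s["sensor_id"]: s["site"] for s in sensors}

    # Accumulate [sum, count] per site while streaming over the readings.
    totals = defaultdict(lambda: [0.0, 0])
    for r in readings:
        site = site_of.get(r["sensor_id"])
        if site is None:
            continue  # reading with no matching sensor: dropped, like an inner join
        totals[site][0] += r["value"]
        totals[site][1] += 1

    return {site: total / count for site, (total, count) in totals.items()}


result = average_by_site(sensors, readings)
```

In SQL the same thing would be roughly `SELECT site, AVG(value) FROM readings JOIN sensors USING (sensor_id) GROUP BY site`, which is the succinctness argument in favor of an RDBMS for this kind of reporting.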

Now to answer your question as to what we are trying to do:

Our app continuously gathers large streams of sensor data, which means we have high TPS needs in our data store. Apart from this, we need to do processing on the sensor stream data using the Aerospike client. I am confident that this part will be handled very well by Aerospike, as it markets itself as handling high TPS loads and being very fast.

The second part is our need to analyze and dissect the processed data and run predictive analytics on it as well. HANA touts its tight integration of predictive analytics, so that's a big plus for HANA. Aerospike does not currently provide such predictive analytics tools or integrations.

Last point, you said that we probably have very high performance needs because we are thinking of using HANA. How is HANA more performant than Aerospike? Are there any benchmarks for that?

Sorry if my query is a bit long-winded. I would really appreciate your responses on these doubts.

Thanks a lot.


#4

  1. There is no sharding. One node is the master and another is a hot standby (you will need to configure it accordingly). You may want to read the HANA documentation on this.
  2. Yes.
  3. Incorrect! Aerospike is not a service. If you need more nodes in the cluster, you need to add them yourself.
  4. By default, Aerospike maintains two copies of the data. If one copy goes offline, the other serves the request.
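
As a rough illustration of the answers to questions 2 and 4 above, here is a toy sketch of key-to-partition routing with two copies per partition. It is not Aerospike's actual placement algorithm. The 4096 partitions and the replication factor of 2 match Aerospike's defaults, but SHA-1 is only a stand-in for the RIPEMD-160 digest Aerospike uses, and `partition_id`, `replicas_for`, and `node_to_read` are hypothetical names.

```python
import hashlib

N_PARTITIONS = 4096      # Aerospike splits each namespace into 4096 partitions
REPLICATION_FACTOR = 2   # default: two copies of every record


def partition_id(key: str) -> int:
    """Map a key to a partition. SHA-1 is a stand-in hash (assumption)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


def replicas_for(pid: int, nodes: list) -> list:
    """Toy placement: rank nodes by a per-partition hash, keep the first two.

    The first node acts as master for the partition, the second as replica.
    """
    ranked = sorted(
        nodes,
        key=lambda n: hashlib.sha1(f"{n}:{pid}".encode("utf-8")).digest(),
    )
    return ranked[:REPLICATION_FACTOR]


def node_to_read(key: str, nodes: list, alive: set) -> str:
    """Return the first live copy holder, so the caller never picks a node by hand."""
    for node in replicas_for(partition_id(key), nodes):
        if node in alive:
            return node
    raise RuntimeError("no live replica for this key")


nodes = ["node-a", "node-b", "node-c"]
master, replica = replicas_for(partition_id("sensor-1"), nodes)
# With every node up, reads go to the master copy.
served_by = node_to_read("sensor-1", nodes, alive=set(nodes))
# With the master down, the replica serves the request.
failover = node_to_read("sensor-1", nodes, alive=set(nodes) - {master})
```

The application only supplies the key; which node holds the record falls out of the hash, and losing one copy holder leaves the other available to serve the request.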

HANA is an RDBMS; Aerospike is not. Aerospike does not support multi-record transactions …

No, there are no comparative benchmarks for HANA and Aerospike. HANA is seen as a high-performance store in the RDBMS world, hence the question.

– R


#5

Thanks again Raj for your helpful response.