What happens when adding an existing key to a set?

We are on version 4.5.3.10 and use a set to keep keys unique, that is, to avoid duplicate keys. Our question is: what happens when we add an existing key to a set? Does Aerospike rewrite the same value to the set and do some I/O work, or does it do nothing and return a failure?

A key in Aerospike consists of a tuple of (namespace name, set name, value), which is then hashed using RIPEMD-160. When writing to an existing key, the application can choose to update the current data (the default) or replace the current data with the data sent by the client. The application may also choose to fail the transaction based on the existence or non-existence of the key.
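
The existence policies described above can be modeled in a few lines. This is a minimal in-memory sketch, not Aerospike client code: `ToyStore`, `ExistsAction`, and `put` are hypothetical names, and only the five constant names mirror the Java client's `RecordExistsAction` enum. A record is modeled as a map of bin names to values.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of the record-exists policies; NOT the Aerospike client API.
// The constant names mirror those of the Java client's RecordExistsAction enum.
enum ExistsAction { UPDATE, UPDATE_ONLY, REPLACE, REPLACE_ONLY, CREATE_ONLY }

final class ToyStore {
    private final Map<String, Map<String, Object>> records = new HashMap<>();

    // Returns true if the write was applied, false if the policy rejected it.
    boolean put(ExistsAction action, String key, Map<String, ?> bins) {
        Map<String, Object> existing = records.get(key);
        switch (action) {
            case CREATE_ONLY:
                if (existing != null) return false;  // key exists: fail, nothing is written
                break;
            case UPDATE_ONLY:
            case REPLACE_ONLY:
                if (existing == null) return false;  // key missing: fail
                break;
            default:
                break;                               // UPDATE and REPLACE never fail on existence
        }
        // UPDATE variants keep existing bins and overlay the new ones; REPLACE variants start fresh.
        boolean merge = action == ExistsAction.UPDATE || action == ExistsAction.UPDATE_ONLY;
        Map<String, Object> result =
            (merge && existing != null) ? new HashMap<>(existing) : new HashMap<>();
        result.putAll(bins);
        records.put(key, result);
        return true;
    }

    // Returns the stored bins, or null if the key does not exist.
    Map<String, Object> get(String key) {
        return records.get(key);
    }
}
```

In this model a second `put` with `CREATE_ONLY` on the same key returns false and leaves the stored bins untouched, which is the behavior the question is asking about.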

Here is the relevant policy (using Java for this example): RecordExistsAction (aerospike-client 6.0.0 API)

Thank you very much, the linked page is very helpful. What does CREATE_ONLY mean? Does it mean that if the key exists, nothing is done?

That is correct. When using CREATE_ONLY, if the key exists the put will fail.

Thank you very much.

To check whether a key exists, we have two strategies:

  • 1 Get the key first; if the get fails, add the key, otherwise do nothing
  • 2 Just add the key with CREATE_ONLY

If 90% of the keys already exist, which strategy costs less?

I don’t understand the use case.

We use an Aerospike set to store keys. By using the CREATE_ONLY policy, we can keep the keys in the set unique. We can also achieve that another way: first get the key from the set; if the get fails, the key does not exist, so we add it to the set; if the get succeeds, the key already exists and we do not add it.

We want to use whichever way costs less CPU and I/O.

Way 1:

if [ get key succeeds ] { key exists: do nothing } else { key does not exist: add key to set }

Way 2:

add key to set with the CREATE_ONLY policy

I understand your question. I don’t know how the server is implemented in detail, but I would assume (I’ll let you know if I find out this is not the case) that the check for CREATE_ONLY is as expensive as trying to read a record. In both cases, it is a check against the primary index only.

So, if you expect 90% of the keys to already exist, reading first to check (and you can do a simple ‘head’ to get only the metadata and not return any bin values) may be ‘lighter’, since it saves network bandwidth: you don’t have to send a data payload unnecessarily, which you do on a write with the CREATE_ONLY policy. But I don’t know where the ‘threshold’ would be for always doing a write and saving the extra transaction: 95%? Or 50%? I guess it would also depend on the network latency between client and server.
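
A back-of-envelope model can make the threshold question concrete. The sketch below rests on assumptions that are not from this thread: every request carries a fixed overhead of `headerBytes`, a write additionally carries `payloadBytes`, response sizes are ignored, and the server-side cost is the same for both ways. `ExistenceCheckCost` and its methods are hypothetical names, and any sizes plugged in are illustrative inputs, not measurements.

```java
// Back-of-envelope cost model; all sizes are hypothetical inputs, not measured values.
final class ExistenceCheckCost {
    // Way 1: a metadata-only existence check for every key,
    // followed by a full write only when the key is missing.
    static double bytesCheckThenWrite(double existFraction, double headerBytes, double payloadBytes) {
        return headerBytes + (1.0 - existFraction) * (headerBytes + payloadBytes);
    }

    // Way 2: always send the full write with a create-only policy.
    static double bytesCreateOnly(double headerBytes, double payloadBytes) {
        return headerBytes + payloadBytes;
    }

    // Average round trips per key for way 1 (way 2 always needs exactly one).
    static double roundTripsCheckThenWrite(double existFraction) {
        return 1.0 + (1.0 - existFraction);
    }

    // Payload size above which way 1 sends fewer request bytes than way 2.
    // Derived from header + (1 - p) * (header + payload) < header + payload.
    static double breakEvenPayload(double existFraction, double headerBytes) {
        return headerBytes * (1.0 - existFraction) / existFraction;
    }
}
```

Under these assumptions, with 90% of keys existing and an assumed 100-byte request overhead, way 1 sends fewer request bytes once the payload exceeds roughly 11 bytes, but it pays about 1.1 round trips per key instead of 1. The bandwidth saving therefore only matters when the payload is large relative to the cost of the extra round trip.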

Finally, things like TLS or encryption may have other effects that shift this a bit.

I actually got confirmation that the server side would be as expensive for both, as expected. If the records to write are small, then go with the write with CREATE_ONLY policy (simpler code and more readable too). But if the records to write are large, then it may make sense to do a read first as those would always be as lightweight as can be.

Just to add: CREATE_ONLY is also needed for correctness if there are multiple clients. With a separate test request followed by a put request, one client can overwrite another client's earlier put unless the put is CREATE_ONLY. You may still want to perform an existence check first if the write size is large.
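
The multi-client hazard can be illustrated with a small in-memory stand-in. This is not Aerospike client code: `RaceSketch` is a hypothetical name, a `ConcurrentHashMap` plays the role of the set, and `putIfAbsent` stands in for a CREATE_ONLY write because it checks and inserts in one atomic step. The interleaving of the two clients is written out by hand so the outcome is deterministic.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for a set of records; NOT Aerospike client code.
final class RaceSketch {
    // Two clients each test for the key, then both write.
    // The interleaving is fixed by hand to keep the demo deterministic.
    static String checkThenPut() {
        ConcurrentHashMap<String, String> set = new ConcurrentHashMap<>();
        boolean aSawKey = set.containsKey("k");   // client A: key is absent
        boolean bSawKey = set.containsKey("k");   // client B: key is absent too
        if (!aSawKey) set.put("k", "from-A");     // A writes
        if (!bSawKey) set.put("k", "from-B");     // B silently overwrites A's write
        return set.get("k");
    }

    // putIfAbsent plays the role of a create-only write:
    // the existence check and the insert happen as one atomic step.
    static String createOnly() {
        ConcurrentHashMap<String, String> set = new ConcurrentHashMap<>();
        set.putIfAbsent("k", "from-A");           // returns null: A created the key
        set.putIfAbsent("k", "from-B");           // returns "from-A": B is rejected, nothing overwritten
        return set.get("k");
    }
}
```

In the first method the surviving value is B's, even though both clients checked first; in the second, A's value survives and B learns that the key already existed.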
