Append bytes

Is there any way to append bytes without a UDF? I would expect something like this to work (using Python client 4.0.0):

import aerospike

ac = aerospike.client({'hosts': [('localhost', 3000)]})
ac.connect()
key = ('test', None, 0)
ac.put(key, bins={'bytes': b''})
ac.append(key, bin='bytes', val=b'a')

but it raises exception.ParamError: (-2, "Cannot concatenate 'str' and 'non-str' objects", 'src/main/client/operate.c', 1162, False)

Does using a one-line UDF that appends bytes to a single bin with https://www.aerospike.com/docs/udf/api/bytes.html#bytes-append_bytes- add significant overhead compared to a client operation?

Hi @avi, what Python version are you using?

You can pass a bytearray to append():

import aerospike

ac = aerospike.client({'hosts': [('localhost', 3000)]})
ac.connect()
key = ('test', None, 0)
ac.put(key, bins={'bytes': b''})
ac.append(key, bin='bytes', val=bytearray('a', 'utf-8'))
print(ac.get(key)[2])

Output:

{'bytes': bytearray(b'a')}

After looking at this some more: ac.append(key, bin='bytes', val=b'a') works in Python 2, but bytes support in append() was likely broken during the transition to Python 3. The same call is supposed to work on Python 3 as well, so we will log an internal issue ticket to track the problem.

Thank you for bringing it to our attention.
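
Until bytes support in append() is restored, the bytearray workaround above can be wrapped in a small helper. This is only a minimal sketch: `as_appendable` is a hypothetical name, not part of the aerospike package, and the call to `ac.append` is left as a comment because it needs a running server and a connected client like the one in the examples above.

```python
def as_appendable(val):
    """Return val as a bytearray if it is a bytes object, otherwise unchanged.

    Hypothetical helper, not part of the aerospike package. It only
    normalizes the value so that append() receives a bytearray.
    """
    if isinstance(val, bytes):
        return bytearray(val)
    return val


# Usage with a connected client (assumed, as in the examples above):
# ac.append(key, bin='bytes', val=as_appendable(b'a'))
```

Values that are not bytes (for example str) pass through untouched, so the helper does not change the behavior of string appends.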

hey @Dylan_W, thanks for helping me out!

I am using Python 3.8.

Simply wrapping the bytes in a bytearray works for me here, but it looks like a different issue is preventing the appends from running in background queries, which is what I actually want. Here’s a repro.

namespace config:

namespace dram_general {
    storage-engine memory
    memory-size 2G
    replication-factor 2
    nsup-period 60
}

code:

import time
import aerospike
from aerospike import exception, predicates
from aerospike_helpers.operations import operations

ac = aerospike.client({'hosts': [('localhost', 3000)]})
ac.connect()
key = ('dram_general', None, 0)

try:
    ac.index_string_create('dram_general', None, 'c', 'c_index')
except exception.IndexFoundError:
    pass

ac.put(key, bins={'bytes': b'', 'c': 'c0'})

b = bytearray(b'\n\x04\x12\x02\x10\x01')

ac.append(key, bin='bytes', val=b)
print(ac.get(key))
ac.append(key, bin='bytes', val=b)
print(ac.get(key))  # works fine

def append_query():
    query = ac.query('dram_general', None)
    query.where(predicates.equals('c', 'c0'))
    query.add_ops([operations.append('bytes', b)])
    job_id = query.execute_background()
    # print(ac.job_info(job_id, aerospike.JOB_QUERY))
    time.sleep(1)
    print(ac.job_info(job_id, aerospike.JOB_QUERY))
    print(ac.get(key))

for _ in range(4):
    append_query()

output:

(('dram_general', None, None, bytearray(b'\xb3F\x13{\xa9\x18\x95y\xdaR\x01b\xdf\xdc\xd6\x1a\x10\xfe7\xd6')), {'ttl': 4294967295, 'gen': 2}, {'bytes': bytearray(b'\n\x04\x12\x02\x10\x01'), 'c': 'c0'})
(('dram_general', None, None, bytearray(b'\xb3F\x13{\xa9\x18\x95y\xdaR\x01b\xdf\xdc\xd6\x1a\x10\xfe7\xd6')), {'ttl': 4294967295, 'gen': 3}, {'bytes': bytearray(b'\n\x04\x12\x02\x10\x01\n\x04\x12\x02\x10\x01'), 'c': 'c0'})
{'progress_pct': 0, 'records_read': 0, 'status': 2}
(('dram_general', None, None, bytearray(b'\xb3F\x13{\xa9\x18\x95y\xdaR\x01b\xdf\xdc\xd6\x1a\x10\xfe7\xd6')), {'ttl': 4294967295, 'gen': 4}, {'bytes': bytearray(b'\n\x04\x12\x02\x10\x01\n\x04\x12\x02\x10\x01\xff\xd8\xccO\t\xd9'), 'c': 'c0'})
{'progress_pct': 0, 'records_read': 0, 'status': 2}
(('dram_general', None, None, bytearray(b'\xb3F\x13{\xa9\x18\x95y\xdaR\x01b\xdf\xdc\xd6\x1a\x10\xfe7\xd6')), {'ttl': 4294967295, 'gen': 4}, {'bytes': bytearray(b'\n\x04\x12\x02\x10\x01\n\x04\x12\x02\x10\x01\xff\xd8\xccO\t\xd9'), 'c': 'c0'})
free(): double free detected in tcache 2
Aborted (core dumped)

It looks like, despite the second query job returning status 2 (success), the generation of the record is unchanged and the bytes were not appended, and then the client crashes out of Python. Is this a related issue? Thanks again!

Edit: I also just noticed that the bytes appended by the first query have been transformed somehow: the record gained \xff\xd8\xccO\t\xd9 instead of \n\x04\x12\x02\x10\x01.
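
The two symptoms in the output above (generation unchanged, and appended bytes that differ from what was sent) can be told apart mechanically. The following is a rough sketch, assuming no other writer touches the record between the two reads; `check_append` is a hypothetical helper, and the sample records are trimmed copies of the get() results printed above.

```python
def check_append(before, after, appended, bin_name='bytes'):
    """Classify the effect of one append from two get() results.

    before and after are (key, meta, bins) tuples as returned by get().
    Hypothetical helper; assumes no concurrent writers.
    """
    _, meta_before, bins_before = before
    _, meta_after, bins_after = after
    if meta_after['gen'] == meta_before['gen']:
        return 'not applied'
    expected = bytes(bins_before[bin_name]) + bytes(appended)
    if bytes(bins_after[bin_name]) == expected:
        return 'ok'
    return 'corrupted'


b = bytearray(b'\n\x04\x12\x02\x10\x01')

# Trimmed copies of the records printed in the output above.
rec_gen2 = (None, {'gen': 2}, {'bytes': b})
rec_gen3 = (None, {'gen': 3}, {'bytes': b + b})
rec_gen4 = (None, {'gen': 4},
            {'bytes': b + b + bytearray(b'\xff\xd8\xccO\t\xd9')})

print(check_append(rec_gen2, rec_gen3, b))  # direct append: ok
print(check_append(rec_gen3, rec_gen4, b))  # first query: corrupted
print(check_append(rec_gen4, rec_gen4, b))  # second query: not applied
```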

Hi @avi ,

I don’t have a lot of information to give yet, but this might be related to an issue with write operations on background queries that we are currently working on. I’m going to look into this and I’ll update you when I have more information.

After looking into the issue further, this is related to the bug with write operations on background queries that we are working on. A fix is in the works and will be in the next release. As a temporary workaround, you should be able to use a UDF. Sorry for the inconvenience. I will update this thread with further developments.

no worries, thanks for addressing it so quickly!

Hi @avi ,

Both issues should be fixed in Aerospike Python client 5.0.0. Please give it a try when you have a chance.

hi @Dylan_W

Everything is working now, thanks so much!

Just to end this off with a related question: what is the time/space complexity of appending bytes? Is it constant time, and does it avoid the overhead of memcpy since appending doesn’t require reading the existing value first (or does it)?

The time complexity of an append operation on the server side is O(m + n), where m is the number of bytes in the particle already present and n is the number of bytes being appended. During an append, the existing particle is read and a new particle is created by joining the old particle with the bytes to be appended.
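
To make that cost concrete, here is a small pure-Python model of the behavior described above. It is only a sketch of the cost model, not server code: `append_particle` and `total_copied` are hypothetical names, and the byte counts assume every append copies both the existing particle and the new bytes into a fresh buffer.

```python
def append_particle(existing, new):
    """Model one append: allocate a new buffer and copy both parts into it.

    Returns (joined, bytes_copied). Illustrative only, not server code.
    """
    m, n = len(existing), len(new)
    joined = bytearray(m + n)
    joined[:m] = existing   # copy of the old particle: m bytes
    joined[m:] = new        # copy of the appended bytes: n bytes
    return joined, m + n


def total_copied(chunk, count):
    """Append chunk to an empty particle count times and sum the copies."""
    particle = bytearray()
    total = 0
    for _ in range(count):
        particle, copied = append_particle(particle, chunk)
        total += copied
    return particle, total


particle, total = total_copied(b'\n\x04\x12\x02\x10\x01', 4)
print(len(particle), total)  # 24 60
```

Under this model, k appends of n bytes each copy n * k * (k + 1) / 2 bytes in total, so a bin that grows through many small appends pays a cost that rises roughly quadratically with the number of appends.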
