Slow UDF issue and how to do partial reads/writes on maps


#1

Hi all,

I am experimenting with maintaining a set as a list type in aerospike.

Firstly, I tried doing this purely using the Java client: read the entire record, check if element exists in list, append to list if not, then write back to aerospike. This method achieved ~3000 writes per second on my machine.

Secondly, I tried doing the same thing using this UDF (calling the UDF using the Java client) https://www.aerospike.com/docs/udf/examples/record_udf_unique_set.html This method achieved ~300 writes per second on my machine.

I expected the second way to at least perform as well as the first way, possibly better since the record doesn’t need to be sent to the client. But the UDF was miuch slower.

Does anyone know why this might be?

Not sure if it’s related but when I executed the UDF using ascli it works but always prints Segmentation fault to console.

Thanks for any help


#2

Joshfg,

Few quick questions

  1. Which exact UDF function are you invoking … unique_set_write
  2. How big is your record size ?
  3. How many concurrent requests are your running in parallel ? or is it single client. Which client Java ?
  4. Is data-in-memory or otherwise
  5. How does your CPU utilization on server side look like when you are runing the workload.

– Raj


#3

Hi Raj,

  1. Yes, unique_set_write is the only function I’m calling
  2. The records are quite small. I am using an integer key and integer values in the set. (Set size varies between 1-100 elements). I have also tried with string values in the set and saw similar performance.
  3. For both my tests I am using 20 threads in parallel each with an instance of the AerospikeClient (v3.0.33)
  4. I am using the default configuration file with test namespace with data stored in memory as well as on disk
  5. The CPU is around 20-30% when I’m running the test

Thanks for helping,

Josh


#4

How big is your keySpace is it only 1 key ?? If that is the case UDF in general are little heavier than doing just get and put operation and performing real work of checking and updating list at the client side. But on the other hand UDF simplify application logic where it need not worry about CAS using write with genCheck.

If keySpace is bigger, I would expect CPU utilization to be higher side. If this is the case you can paste the service section in your aerospike.conf file.

– Raj


#5

My keyspace is 10,000 keys (integers between 1 and 10,000)

I haven’t changed the default aerospike configurations:

service {
        user root
        group root
        paxos-single-replica-limit 1 # Number of nodes where the replica count i
s automatically reduced to 1.
        pidfile /var/run/aerospike/asd.pid
        service-threads 4
        transaction-queues 4
        transaction-threads-per-queue 4
        proto-fd-max 15000
}

Also, a bit of an unrelated question but do you know if the map type in aerospike is backed by hash table? Or does that only apply to the lmap type?

Thanks,

Josh


#6

Joshfg,

Everything looks fine. Generally expected hit is around 30%. You may want to try few things

  1. Bump up service-threads to equal to number of cores you have and see if it helps
  2. Try running empty UDF and see how it behaves.

Btw what version for server are you running

you know if the map type in aerospike is backed by hash table?

It is hashtable http://www.aerospike.com/apidocs/c/dc/d75/structas__hashmap.html

– Raj


#7

Hi Raj,

Thanks for the suggestions. I made a couple of changes: firstly I found that I was instantiating multiple instances of the Java client (1 per thread) and changed it so that each thread shares the same client instance (as recommended in the docs). secondly I increased the number of service threads as you said. It looks like performance is much better now, hitting nearly 2000 writes per second with the UDF. I guess the difference between UDF and clientside methods will be more visible when there’s network latency (I’m currently running client and aerospike on the same machine)

I’m still unsure about the map type though. I want to have a users set where the keys are user IDs, and for each user I want to store a map of counts. e.g. something like this:

ascli put test users 123 ‘{“a”:1, “b”:2, “c”:3}’

(is this using the map data type, or is it just using three bins of type integer?)

I would rather store the entire map in a single bin, however it looks like if I do this, it is not possible to read or write counts to the map without reading/writing the whole map. Any idea what the best way to do this is?

Thanks for helping :slight_smile:


#8
ascli put test users 123 '{"a":1, "b":2, "c":3}'  

is syntax for inserting bins and its value. Use aql instead it accepts JSON strings

aql > insert into test.user(PK, bin) values ('123', 'JSON{"a":1, "b":2}')
OK, 1 record affected
aql > select * from test.user where PK='123'
+------------------+
|        bin       |
+ -----------------+
|  {"a":1, "b":2 } |
+------------------+
1 row in set (0.382 sec)

Note that using aql you can only get and set map in entirety. You may have to use UDF to read and set specific map field. Example UDF would look like

function updateMap(record, binName, key, newValue)
    local binVal = rec[binName];
    if (binVal == nil) then
        -- error jumps out of execution
        error("Does not Exist");
    end
    binVal[key] = newValue;
    aerospike:update(rec);
    return 0;
end

HTH

– Raj


#9

I see, that’s very helpful. I’ll use a UDF to do map updates then (and I guess for reads too, if I don’t want to read the whole map). Thanks!


#10

Hi Raj, I have written a UDF (below) which increments a count in a map (or writes the count if it doesn’t exist already).

It seems to work completely fine when I run it with aql.

However, when I call it ~2000 times per second with the Java client, the aerospike UI shows that less than half of the writes are succeeding! (Total writes is more than twice successful writes)

I have tried adding trace logging to debug it, but everything looks fine the logs. (rc = 0 for every update).

I have also ran stat system in aql, and the stats show 100% success rate for writes and udf writes.

Any idea why Aerospike UI shows so many failures, and what might be going wrong?

function add_count( rec, binName, mapKey, valueToAdd )
	if( not aerospike:exists(rec) ) then
		rc = aerospike:create(rec);
		if( rc == nil ) then
			warn("[ERROR] Problem creating record");
			error("ERROR creating record");
		end
		local m = map();
		m[mapKey] = valueToAdd;
		rec[binName] = m;
	else
		local binVal = rec[binName];
		if( binVal == nil ) then
			local m = map();
			m[mapKey] = valueToAdd;
			rec[binName] = m;
		else
			if( binVal[mapKey] == nil ) then
				binVal[mapKey] = valueToAdd;
			else
				binVal[mapKey] = binVal[mapKey] + valueToAdd;
			end
			rec[binName] = binVal;
		end
	end
	rc = aerospike:update(rec);
	trace("<CHECK> Check Update Result(%s)", tostring(rc));
	if( rc ~= nil and rc ~= 0 ) then
		warn("[ERROR] Record update failed: rc(%s)", tostring(rc));
		error("ERROR updating the record");
	end
    return 0;
end

#11

Joshfg

asinfo -v 'statistics' -l |grep err 

Is server counting any errors ??

– R


#12

No - all of those counters are zero!

But the Aerospike UI is displaying this: http://oi62.tinypic.com/28akxgy.jpg


#13

Joshfg,

AMC only report pure get/set calls. What are you are seeing is writes and its success count.

You can trust the status code being returned in the client. Is client reporting any errors while executing UDF ?

– R


#14

No I’m not seeing any errors in the client or in the logs.

What causes the difference between total and success then?

I want to compare performance to the unique_set_write UDF which I was executing earlier - for that UDF success was always equal to total and it was consistently over 2000 successful writes per second.

For this UDF I’ve written for writing values to the map, AMC is showing around 1000 successful writes per second (although total is higher than this).


#15

Looking at “udf_write_success” counter and calculating writes per second myself, I can see that the ‘success’ reported by AMC is correct - it is running at around 1000 writes per second. (which is less than expected, since I get 2000 writes per second with unique_set_write UDF) - Do you think there might be something wrong?

Thanks,

Josh

(this is the UDF I am currently running to write to the maps)

function write_val( rec, binName, mapKey, newValue )
	if( not aerospike:exists(rec) ) then
		rc = aerospike:create(rec);
		if( rc == nil ) then
			warn("[ERROR] Problem creating record");
			error("ERROR creating record");
		end
		local m = map();
		m[mapKey] = newValue;
		rec[binName] = m;
	else
		local binVal = rec[binName];
		if( binVal == nil ) then
			local m = map();
			m[mapKey] = newValue;
			rec[binName] = m;
		else
			binVal[mapKey] = newValue;
			rec[binName] = binVal;
		end
	end
	rc = aerospike:update(rec);
	if( rc ~= nil and rc ~= 0 ) then
		warn("[ERROR] Record update failed: rc(%s)", tostring(rc));
		error("ERROR updating the record");
	end
    return 0;
end

#16

Well I’m not sure what’s going on with that. I’ve been changing the number of threads which are calling the UDF using the Java client and now it’s a lot better (success is around 90% of total). I’d still like to know why it’s not 100% though…


#17

Joshfg,

Sorry this fell through the cracks.

udf_write_success should be the number you should look at from server side to know what it thinks was successful server writes.

** unique_set_write ** Is this the number you are measuring on the client side. ??

There seems to be something not adding up … is this a >1 node cluster ?

Total Tps - Successful TPS is either Error or Just that tps ended up tailing in that window. Can you share

asinfo -v 'statistics’ asinfo -v 'namespace/' sudo asloglatency -h udf -f ‘-1000’ {run this when load is running}

– R


#18

Hi Raj,

Yes I’m looking at udf_write_success on the server and that looks good (there are no failures)

unique_set_write isn’t a number that was just a UDF I was executing where I didn’t see the problem in AMC. The problem only occurs with the UDF I wrote to add values to map types.

It is just a 1 node cluster running on my local machine. Here are the results from those commands:

asinfo -v ‘statistics’

    cluster_size=1;cluster_key=5481A4CEF9C3C8AD;cluster_integrity=true;objects=173205;sub-records=0;total-bytes-disk=5368709120;used-bytes-disk=44500480;free-pct-disk=99;total-bytes-memory=2147483648;used-bytes-memory=23071352;data-used-bytes-memory=11986232;index-used-bytes-memory=11085120;sindex-used-bytes-memory=0;free-pct-memory=98;stat_read_reqs=0;stat_read_reqs_xdr=0;stat_read_success=0;stat_read_errs_notfound=0;stat_read_errs_other=0;stat_write_reqs=868161;stat_write_reqs_xdr=0;stat_write_success=435902;stat_write_errs=0;stat_xdr_pipe_writes=0;stat_xdr_pipe_miss=0;stat_delete_success=0;stat_rw_timeout=0;udf_read_reqs=432259;udf_read_success=432251;udf_read_errs_other=8;udf_write_reqs=435902;udf_write_success=435902;udf_write_err_others=0;udf_delete_reqs=0;udf_delete_success=0;udf_delete_err_others=0;udf_lua_errs=0;udf_scan_rec_reqs=0;udf_query_rec_reqs=0;udf_replica_writes=0;stat_proxy_reqs=0;stat_proxy_reqs_xdr=0;stat_proxy_success=0;stat_proxy_errs=0;stat_ldt_proxy=0;stat_cluster_key_trans_to_proxy_retry=0;stat_cluster_key_transaction_reenqueue=0;stat_slow_trans_queue_push=0;stat_slow_trans_queue_pop=0;stat_slow_trans_queue_batch_pop=0;stat_cluster_key_regular_processed=0;stat_cluster_key_prole_retry=0;stat_cluster_key_err_ack_dup_trans_reenqueue=0;stat_cluster_key_partition_transaction_queue_count=0;stat_cluster_key_err_ack_rw_trans_reenqueue=0;stat_expired_objects=0;stat_evicted_objects=0;stat_deleted_set_objects=0;stat_evicted_set_objects=0;stat_evicted_objects_time=0;stat_zero_bin_records=0;stat_nsup_deletes_not_shipped=0;err_tsvc_requests=0;err_out_of_space=0;err_duplicate_proxy_request=0;err_rw_request_not_found=0;err_rw_pending_limit=0;err_rw_cant_put_unique=0;fabric_msgs_sent=0;fabric_msgs_rcvd=0;paxos_principal=BB94ECD4C270008;migrate_msgs_sent=0;migrate_msgs_recv=0;migrate_progress_send=0;migrate_progress_recv=0;migrate_num_incoming_accepted=0;migrate_num_incoming_refused=0;queue=0;transactions=869509;reaped_fds=164;tscan_initiate=0;tscan_pending=0;tscan_succeeded=0;tscan_aborted=0;batch_initiate=0;batch_queue=0;batch_tree_count=0;batch_timeout=0;batch_errors=0;info_queue=0;delete_queue=0;proxy_in_progress=0;proxy_initiate=0;proxy_action=0;proxy_retry=0;proxy_retry_q_full=0;proxy_unproxy=0;proxy_retry_same_dest=0;proxy_retry_new_dest=0;write_master=868161;write_prole=0;read_dup_prole=0;rw_err_dup_internal=0;rw_err_dup_cluster_key=0;rw_err_dup_send=0;rw_err_write_internal=0;rw_err_write_cluster_key=0;rw_err_write_send=0;rw_err_ack_internal=0;rw_err_ack_nomatch=0;rw_err_ack_badnode=0;client_connections=102;waiting_transactions=0;tree_count=0;record_refs=173205;record_locks=0;migrate_tx_objs=0;migrate_rx_objs=0;ongoing_write_reqs=0;err_storage_queue_full=0;partition_actual=4096;partition_replica=0;partition_desync=0;partition_absent=0;partition_object_count=173205;partition_ref_count=4096;system_free_mem_pct=86;sindex_ucgarbage_found=0;sindex_gc_locktimedout=0;sindex_gc_inactivity_dur=0;sindex_gc_activity_dur=0;sindex_gc_list_creation_time=0;sindex_gc_list_deletion_time=0;sindex_gc_objects_validated=0;sindex_gc_garbage_found=0;sindex_gc_garbage_cleaned=0;system_swapping=false;err_replica_null_node=0;err_replica_non_null_node=0;err_sync_copy_null_node=0;err_sync_copy_null_master=0;storage_defrag_corrupt_record=0;err_write_fail_prole_unknown=0;err_write_fail_prole_generation=0;err_write_fail_unknown=0;err_write_fail_key_exists=0;err_write_fail_generation=0;err_write_fail_generation_xdr=0;err_write_fail_bin_exists=0;err_write_fail_parameter=0;err_write_fail_incompatible_type=0;err_write_fail_noxdr=0;err_write_fail_prole_delete=0;err_write_fail_not_found=0;err_write_fail_key_mismatch=0;stat_duplicate_operation=0;uptime=800;stat_write_errs_notfound=0;stat_write_errs_other=0;heartbeat_received_self=5161;heartbeat_received_foreign=0;query_reqs=0;query_success=0;query_fail=0;query_abort=0;query_avg_rec_count=0;query_short_queue_full=0;query_long_queue_full=0;query_short_running=0;query_long_running=0;query_tracked=0;query_agg=0;query_agg_success=0;query_agg_err=0;query_agg_abort=0;query_agg_avg_rec_count=0;query_lookups=0;query_lookup_success=0;query_lookup_err=0;query_lookup_abort=0;query_lookup_avg_rec_count=0

asinfo -v ‘namespace/test’

type=device;objects=173205;sub-objects=0;master-objects=173205;master-sub-objects=0;prole-objects=0;prole-sub-objects=0;expired-objects=0;evicted-objects=0;set-deleted-objects=0;set-evicted-objects=0;used-bytes-memory=24144519;data-used-bytes-memory=13059399;index-used-bytes-memory=11085120;sindex-used-bytes-memory=0;free-pct-memory=98;max-void-time=161344270;non-expirable-objects=0;current-time=160912270;stop-writes=false;hwm-breached=false;available-bin-names=32765;ldt-reads=0;ldt-read-success=0;ldt-deletes=0;ldt-delete-success=0;ldt-writes=0;ldt-write-success=0;ldt-updates=0;ldt-gc-io=0;ldt-gc-cnt=0;ldt-randomizer-retry=0;ldt-errors=0;ldt-err-toprec-notfound=0;ldt-err-item-notfound=0;ldt-err-internal=0;ldt-err-unique-key-violation=0;ldt-err-insert-fail=0;ldt-err-delete-fail=0;ldt-err-search-fail=0;ldt-err-version-mismatch=0;ldt-err-capacity-exceeded=0;ldt-err-param=0;ldt-err-op-bintype-mismatch=0;ldt-err-too-many-open-subrec=0;ldt-err-subrec-not-found=0;ldt-err-bin-does-not-exist=0;ldt-err-bin-exits=0;ldt-err-bin-damaged=0;ldt-err-toprec-internal=0;ldt-err-subrec-internal=0;ldt-err-transform-internal=0;ldt-err-unknown=8;used-bytes-disk=45548672;free-pct-disk=99;available_pct=99;memory-size=2147483648;high-water-disk-pct=50;high-water-memory-pct=60;evict-tenths-pct=5;stop-writes-pct=90;cold-start-evict-ttl=4294967295;repl-factor=1;default-ttl=432000;max-ttl=0;conflict-resolution-policy=generation;allow_versions=false;single-bin=false;ldt-enabled=true;enable-xdr=false;sets-enable-xdr=true;ns-forward-xdr-writes=false;disallow-null-setname=false;total-bytes-memory=2147483648;read-consistency-level-override=off;write-commit-level-override=off;total-bytes-disk=5368709120;defrag-lwm-pct=50;defrag-queue-min=0;defrag-sleep=1000;defrag-startup-minimum=10;flush-max-ms=1000;fsync-max-sec=0;write-smoothing-period=0;max-write-cache=67108864;min-avail-pct=5;post-write-queue=0;data-in-memory=true;file=/opt/aerospike/data/test.dat;filesize=5368709120;writethreads=1;writecache=67108864;obj-size-hist-max=100

sudo asloglatency -h udf -f ‘-1000’

               % > (ms)
slice-to (sec)      1      8     64  ops/sec
-------------- ------ ------ ------ --------
09:45:38    10   0.00   0.00   0.00   1740.0
09:45:48    10   0.00   0.00   0.00   1841.0
09:45:58    10   0.00   0.00   0.00   1795.2
09:46:08    10   0.00   0.00   0.00   1351.8
09:46:18    10   0.00   0.00   0.00   1958.5
09:46:28    10   0.00   0.00   0.00   1877.1
09:46:38    10   0.00   0.00   0.00   1888.4
09:46:48    10   0.00   0.00   0.00   2003.8
09:46:58    10   0.00   0.00   0.00   1692.7
09:47:08    10   0.00   0.00   0.00   1393.8
09:47:18    10   0.00   0.00   0.00   1393.9
09:47:28    10   0.00   0.00   0.00   1955.3
09:47:38    10   0.00   0.00   0.00   1901.6
09:47:48    10   0.00   0.00   0.00   1510.2
09:47:58    10   0.00   0.00   0.00   1921.6
09:48:08    10   0.19   0.00   0.00   1863.8
09:48:18    10   0.87   0.00   0.00   1757.8
09:48:28    10   0.65   0.00   0.00   1624.0
09:48:38    10   0.84   0.00   0.00   1766.6
09:48:48    10   1.03   0.00   0.00   1759.3
09:48:58    10   1.38   0.00   0.00   1715.3
09:49:08    10   1.53   0.07   0.00   1730.8
09:49:18    10   1.53   0.10   0.00   1776.8
09:49:28    10   1.54   0.07   0.00   1612.8
09:49:38    10   1.98   0.06   0.00   1898.7
09:49:48    10   1.86   0.12   0.00   1850.0
09:49:58    10   1.82   0.07   0.00   1803.7
09:50:08    10   1.41   0.07   0.00   1939.1
09:50:18    10   1.51   0.16   0.00   1843.3
09:50:28    10   2.13   0.26   0.00   1606.7
09:50:38    10   2.36   0.22   0.00   1575.1
09:50:48    10   2.21   0.12   0.00   1792.9
09:50:58    10   1.89   0.08   0.00   1856.0
09:51:08    10   2.34   0.09   0.00   1921.8
09:51:18    10   2.30   0.14   0.00   1878.1
09:51:28    10   2.86   0.11   0.00   1708.5
09:51:38    10   2.61   0.16   0.00   1780.4
09:51:48    10   2.80   0.22   0.00   1758.6

Let me know if you spot anything wrong there!

Thanks,

Josh


#19
stat_write_reqs=868161;stat_write_reqs_xdr=0;stat_write_success=435902;stat_write_errs=0;stat_xdr_pipe_writes=0;stat_xdr_pipe_miss=0;stat_delete_success=0;stat_rw_timeout=0;udf_read_reqs=432259;udf_read_success=432251;udf_read_errs_other=8;udf_write_reqs=435902;udf_write_success=435902;

All udf request went fine. Your throughput of UDF on the server side is last column in the asloglatency output.

No errors recorded and success not incremented would mean somewhere the stats update code is not being called properly … I will have to invesigate. Can you share entire conf file I will try to reproduce …

– R


#20

I see… it’s a very strange bug!

I am using the default configuration, but have pasted it here: http://pastebin.com/ykT3hW74

I can see the issue using just one thead with the Java client, calling execute on the write_val UDF that I pasted earlier. (with one thread I get around 40 writes/second and AMC shows total > success)

Thanks for helping with this!