Load CSV file

Can someone please point me to how to load a CSV file into Aerospike?

Also, which parameters should be considered to get good load performance?

Here is the documentation for the csv loader: https://aerospike.com/docs/tools/asloader/index.html

Having multiple data files can help, but the main driver seems to be the number of CPUs, since the number of writer threads equals the number of CPUs x 5.

Thanks for replying

  1. How to delete all the sets from the namespace?

  2. com.aerospike.client.AerospikeException: Error 4,1,0,0,0,BB9030011AC4202 127.0.0.1 3000: Parameter error
         at com.aerospike.client.command.WriteCommand.parseResult(WriteCommand.java:82)
         at com.aerospike.client.command.SyncCommand.executeCommand(SyncCommand.java:103)
         at com.aerospike.client.command.SyncCommand.execute(SyncCommand.java:64)
         at com.aerospike.client.AerospikeClient.put(AerospikeClient.java:385)
         at com.aerospike.load.AsWriterTask.writeToAs(AsWriterTask.java:135)
         at com.aerospike.load.AsWriterTask.call(AsWriterTask.java:582)
         at com.aerospike.load.AsWriterTask.call(AsWriterTask.java:54)
         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
     ERROR AsWriterTask :157 - File: concept.csv Line: 30138Aerospike Write Error: 4
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

You can look up the truncate command for deleting sets.
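As a rough sketch of what that could look like from a client rather than from the command line: the Python client exposes a truncate call that takes a namespace, a set name, and a last-update cutoff in nanoseconds, where 0 means no cutoff. The helper name truncate_sets, the namespace "test", and the set name "concept" below are made up for illustration, and the client object is passed in so the helper itself does not depend on a running server.

```python
def truncate_sets(client, namespace, set_names):
    """Truncate every set in set_names and return the names processed.

    `client` is assumed to be a connected Aerospike Python client, or any
    object exposing a compatible truncate(namespace, set, nanos) method.
    """
    done = []
    for name in set_names:
        # nanos=0: no last-update cutoff, so every record in the set is removed
        client.truncate(namespace, name, 0)
        done.append(name)
    return done


# Usage, assuming the `aerospike` package is installed and a server is
# listening on 127.0.0.1:3000 (namespace and set names are hypothetical):
#
#   import aerospike
#   client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
#   truncate_sets(client, "test", ["concept"])
#   client.close()
```

Truncation removes the records in each set; it has to be repeated per set if the goal is to clear several sets in one namespace.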

Regarding the error code, you can consult the list here: https://www.aerospike.com/docs/dev_reference/error_codes.html

Error 4 indicates that a bad parameter was passed, or that the server does not support that parameter.

Is there any limit on the value used for the set?

"set": {"column_name":"id", "type": "string"},

"CPP9087,CPP9087,CPP9087,CPP9087,CPP9087,CPP9087,CPP9087,CPP9087,CPP9087"

INFO AerospikeLoad :237 - Number of data files:1
INFO AerospikeLoad :241 - Aerospike loader started
INFO AerospikeLoad :386 - Config file processed.
INFO AerospikeLoad :408 - Reader pool size : 48
INFO PrintStat :93 - 2020-10-05 23:52:36 load(Write count=0 tps=0 Errors=0 (Timeout:0 KeyExists:0 othersWrites:0 ReadErrors:0 Processing:0) Skiped (NullKey:0 NoBins:0) Progress:0%
INFO AerospikeLoad :421 - Shutdown reader thread pool
INFO AerospikeLoad :795 - Processing: 70.csv
INFO AerospikeLoad :790 - Reader completed 2-lines in 0.002sec, From file: 70.csv
INFO AerospikeLoad :424 - Reader thread pool terminated
INFO AerospikeLoad :428 - Shutdown writer thread pool
ERROR AsWriterTask :157 - File: 70.csv Line: 2Aerospike Write Error: 4
INFO AerospikeLoad :431 - Writer thread pool terminated
INFO AerospikeLoad :434 - Final Statistics of importer: (Records Read = 1, Successful Writes = 0, Successful Primary Writes = 0, Successful Mapping Writes = 0, Errors = 1(1-Write,0-Read,0-Processing), Skipped = 0(0-NullKey,0-NoBins)
INFO AerospikeLoad :253 - Aerospike loader completed
INFO AerospikeLoad :260 - Loader completed in 0.078sec

  1. Also, which parameters need tweaking to get better TPS?

  2. If we restart Aerospike, the data gets wiped out. How do we persist it across restarts?

  3. If I want to combine 3 columns from the CSV file to make the key, as below, how do we load it?

"key": {"column_name":"id1:id2:id3", "type": "integer"},

There is a limit of 1023 sets per namespace. Check the server logs to figure out potential errors.

The list of parameters is specified in the docs. I think the best way to speed up the load would be to split the input into multiple files.
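If the data currently lives in one large file, splitting it could be sketched roughly as follows. This assumes the first line of the CSV is a header that each output file should repeat; split_csv and the output file naming are made up for illustration and are not part of the loader.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, parts):
    """Split src into `parts` files round-robin, repeating the header in each.

    Returns the list of output paths. Assumes the first line is a header.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"{Path(src).stem}_{i}.csv" for i in range(parts)]
    files = [p.open("w", newline="") for p in paths]
    try:
        writers = [csv.writer(f) for f in files]
        with open(src, newline="") as fh:
            reader = csv.reader(fh)
            header = next(reader, None)
            if header is not None:
                for w in writers:
                    w.writerow(header)
            # Deal data rows out to the output files in turn
            for n, row in enumerate(reader):
                writers[n % parts].writerow(row)
    finally:
        for f in files:
            f.close()
    return paths
```

The resulting files can then be handed to the loader together; whether this actually improves throughput on a given machine is something to measure rather than assume.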

If you run with a persisted namespace (storage-engine device) the data should be persisted upon restart. With storage-engine memory, the data will not persist upon restart, but if running with replication-factor 2 or more, you would still have a copy (or more) of the data.

I am not sure the asloader tool supports combining multiple columns into a key. You should be able to code that easily through a client, though.
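A minimal sketch of the client-side approach, assuming the Python client: read the CSV, join the chosen columns into one string, and use that string as the record key. Note that a key joined with ":" is a string, not an integer. The names composite_key and load_rows, the namespace "test", and the set "demo" are hypothetical; the write function is passed in as `put` so the logic can be tried without a server.

```python
import csv


def composite_key(row, columns, sep=":"):
    """Join the values of the given columns into a single string key."""
    return sep.join(row[c].strip() for c in columns)


def load_rows(csv_path, key_columns, put, namespace, set_name):
    """Write each CSV row via put(key, bins) and return the number written.

    `put` is assumed to behave like the Python client's put(key, bins),
    where key is a (namespace, set, user_key) tuple.
    """
    written = 0
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            key = (namespace, set_name, composite_key(row, key_columns))
            put(key, dict(row))  # every column is stored as a bin
            written += 1
    return written


# Usage, assuming the `aerospike` package and a server on 127.0.0.1:3000:
#
#   import aerospike
#   client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
#   load_rows("data.csv", ["id1", "id2", "id3"], client.put, "test", "demo")
#   client.close()
```

This writes one record at a time from a single thread, so it will not match the loader's throughput without adding some parallelism.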
