Aerospike backup fails with code 4

Hello everyone! We encountered an error while creating an Aerospike backup:

2021-04-30 06:55:09 GMT [INF] [68013] 25% complete (~44319 KiB/s, ~6042 rec/s, ~7510 B/rec)
2021-04-30 06:55:09 GMT [INF] [68013] ~1h19m2s remaining

2021-04-30 06:55:20 GMT [INF] [68013] 25% complete (~48098 KiB/s, ~6602 rec/s, ~7459 B/rec)
2021-04-30 06:55:20 GMT [INF] [68013] ~1h12m9s remaining

2021-04-30 06:55:30 GMT [INF] [68013] 25% complete (~46766 KiB/s, ~6397 rec/s, ~7485 B/rec)
2021-04-30 06:55:30 GMT [INF] [68013] ~1h14m18s remaining

2021-04-30 06:55:40 GMT [INF] [68013] 25% complete (~45469 KiB/s, ~6277 rec/s, ~7417 B/rec)
2021-04-30 06:55:40 GMT [INF] [68013] ~1h15m32s remaining

2021-04-30 07:29:50 GMT [ERR] [68014] Error while running node scan for BB92200069A6F56 - code 4: AEROSPIKE_ERR_REQUEST_INVALID at src/main/aerospike/aerospike_scan.c:313

2021-04-30 07:29:51 GMT [INF] [68013] Backed up 9874694 record(s), 0 secondary index(es), 1 UDF file(s) from 3 node(s), 75486817756 byte(s) in total (~7644 B/rec)

The error can occur at any stage of the backup.

Info about cluster:

Cluster size - 3 nodes

Replication factor - 2.

Aerospike Community Edition build 5.2.0.10

OS: CentOS Linux release 8.3.2011

kernel: 4.18.0-240.1.1.el8_3.x86_64

Could you tell me what the problem might be? Thank you in advance.

Interesting. Does this occur consistently on the same server, or on all servers in the cluster? Have you tried installing the latest astools? What version of astools is installed? Is there any corresponding error on the server? You can check with: `grep -v INFO /var/log/aerospike/aerospike.log || journalctl -u aerospike --since=-7d | grep -v INFO`
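To see whether the failure is tied to one node or spread across the cluster, the `[ERR]` lines from several asbackup runs can be tallied by node ID. The following is only a rough sketch: the regular expression assumes the error line looks exactly like the one quoted in the question, and `count_scan_errors` is a hypothetical helper name, not part of any Aerospike tool.

```python
import re
from collections import Counter

# Assumption: asbackup error lines follow the format quoted in the question.
ERR_LINE = re.compile(
    r"\[ERR\] \[\d+\] Error while running node scan for (\w+) - code (-?\d+): (\w+)"
)

# Sample lines copied from the asbackup output in the question.
SAMPLE_LINES = [
    "2021-04-30 06:55:40 GMT [INF] [68013] ~1h15m32s remaining",
    "2021-04-30 07:29:50 GMT [ERR] [68014] Error while running node scan for "
    "BB92200069A6F56 - code 4: AEROSPIKE_ERR_REQUEST_INVALID at "
    "src/main/aerospike/aerospike_scan.c:313",
]


def count_scan_errors(log_lines):
    """Count node scan errors, keyed by (node ID, error code, error name)."""
    counts = Counter()
    for line in log_lines:
        match = ERR_LINE.search(line)
        if match:
            key = (match.group(1), int(match.group(2)), match.group(3))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (node, code, name), n in count_scan_errors(SAMPLE_LINES).items():
        print(f"node {node}: code {code} ({name}) seen {n} time(s)")
```

Feeding it the concatenated logs of several backup attempts should show quickly whether one node ID dominates or the errors are spread evenly.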

I think you can find the latest download versions and install instructions through this link: https://docs.aerospike.com/docs/operations/install/tools/

This occurs on all servers in the cluster. No, I haven't tried installing the latest astools, because these are production servers.

Some additional info: Aerospike Backup Utility Version 3.4.1

Output of `grep -v INFO /var/log/aerospike/aerospike.log || journalctl -u aerospike --since=-7d | grep -v INFO`:

May 25 2021 04:34:03 GMT+0300: WARNING (scan): (scan.c:528) error sending to 10.2.2.15:60986 - fd 79 sz 1049679 Connection timed out

May 25 2021 11:21:53 GMT+0300: WARNING (scan): (scan.c:528) error sending to 127.0.0.1:41412 - fd 266 sz 1057891 Connection timed out

May 25 2021 11:50:15 GMT+0300: WARNING (scan): (scan_manager.c:111) job with trid 6098906213693522705 already active

May 25 2021 11:50:15 GMT+0300: WARNING (scan): (scan.c:803) basic scan job 6098906213693522705 failed to start (4)

It seems the connection times out when the server tries to send data back to the client. Those timeouts happen before the main error, which is "job with trid xxxx already active". (Refer to this article for details on that error.)
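To confirm that sequence on a larger log, the warnings can be grouped by type and by trid. This is a minimal sketch that assumes the server log lines match the format quoted above; `summarize_scan_warnings` and the pattern names are hypothetical, and other server versions may word these messages differently.

```python
import re

# Sample lines copied from the server log quoted in this thread.
SAMPLE_LOG = """\
May 25 2021 11:21:53 GMT+0300: WARNING (scan): (scan.c:528) error sending to 127.0.0.1:41412 - fd 266 sz 1057891 Connection timed out
May 25 2021 11:50:15 GMT+0300: WARNING (scan): (scan_manager.c:111) job with trid 6098906213693522705 already active
May 25 2021 11:50:15 GMT+0300: WARNING (scan): (scan.c:803) basic scan job 6098906213693522705 failed to start (4)
"""

# Assumption: these patterns mirror the message wording shown above.
SEND_TIMEOUT = re.compile(r"error sending to (\S+) .*Connection timed out")
ALREADY_ACTIVE = re.compile(r"job with trid (\d+) already active")
FAILED_START = re.compile(r"scan job (\d+) failed to start \((\d+)\)")


def summarize_scan_warnings(log_text):
    """Collect send timeouts and rejected scan jobs from server log text."""
    summary = {"timeouts": [], "duplicate_trids": set(), "failed": {}}
    for line in log_text.splitlines():
        match = SEND_TIMEOUT.search(line)
        if match:
            summary["timeouts"].append(match.group(1))  # client address
            continue
        match = ALREADY_ACTIVE.search(line)
        if match:
            summary["duplicate_trids"].add(match.group(1))
            continue
        match = FAILED_START.search(line)
        if match:
            # Map trid -> result code reported by the server.
            summary["failed"][match.group(1)] = int(match.group(2))
    return summary


if __name__ == "__main__":
    print(summarize_scan_warnings(SAMPLE_LOG))
```

A trid that appears in both `duplicate_trids` and `failed` is a scan that was rejected because a job with the same transaction ID was still running.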

I am surprised that asbackup would initiate a retry of the scan. Are you using the tools version that came with the server package?
