Asbackup process throws timeout errors

Problem Description

When running asbackup to back up data in an Aerospike cluster, the process fails with the following error:

Error while running node scan for BB900000000 - code -10: Socket read error: 104, 12.34.56.78:3000, 34490 at src/main/aerospike/as_socket.c:248

The aerospike.log on the server node shows the corresponding warning:

WARNING (scan): (scan.c:383) error sending to 12.34.56.78:37246 - fd 539 sz 1048697 Connection timed out

Explanation

This error can occur when asbackup cannot keep up with the rate at which the server sends records. asbackup then tries to slow the flow by pushing back on the scan it runs. If it pushes back too hard, the scan times out on the server, which produces the "Connection timed out" warning in the log. On timeout, the server closes the connection, and asbackup aborts with Socket read error: 104, which is a TCP connection reset.
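To confirm what the numeric code in the socket error means on the host running asbackup, the errno value can be looked up directly. The following is a minimal sketch; the helper name describe_errno is hypothetical, and errno numbering is platform-specific (on Linux, 104 corresponds to ECONNRESET).

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and OS message for an errno value."""
    # errno.errorcode maps numeric values to names on the current platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# 104 is the value reported in the asbackup error message.
print(describe_errno(104))
```

Run this on the same operating system as asbackup, since another platform may assign 104 to a different error.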

Solution

By default, asbackup scans up to 10 nodes in parallel, which may be too many for a particular environment. If so, use the --parallel flag to reduce the parallelism so that asbackup is not overwhelmed.
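One way to find a workable value is to retry the backup with progressively lower parallelism. The sketch below is illustrative only: the helper names, the halving strategy, the starting value of 5, and the per-attempt output subdirectory are all assumptions, not part of asbackup. It only uses the asbackup options --host, --namespace, --directory and --parallel.

```python
import os
import subprocess


def asbackup_command(host, namespace, directory, parallel):
    """Build an asbackup command line with an explicit --parallel value."""
    return [
        "asbackup",
        "--host", host,
        "--namespace", namespace,
        "--directory", directory,
        "--parallel", str(parallel),
    ]


def parallel_steps(start):
    """Return the values to try: start, then repeatedly halved down to 1."""
    steps = []
    value = start
    while value >= 1:
        steps.append(value)
        if value == 1:
            break
        value //= 2
    return steps


def run_backup(host, namespace, base_directory, start=5):
    """Try each parallelism level until a backup run exits with status 0.

    Each attempt writes to its own subdirectory (an assumption made here so
    that a failed attempt does not leave files in the next attempt's target).
    Returns the parallelism that succeeded, or None if every attempt failed.
    """
    for parallel in parallel_steps(start):
        directory = os.path.join(base_directory, f"parallel-{parallel}")
        os.makedirs(directory, exist_ok=True)
        cmd = asbackup_command(host, namespace, directory, parallel)
        if subprocess.run(cmd).returncode == 0:
            return parallel
    return None
```

For example, run_backup("12.34.56.78", "test", "/var/backups/aerospike") would try --parallel 5, then 2, then 1; the namespace and path here are placeholders. Once a value works, it can be passed to asbackup directly in regular backup jobs.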

Keywords

ASBACKUP TIMEOUT SOCKET READ ERROR

Timestamp

04/17/2019