Inter-node bandwidth: what is causing the bandwidth difference?

Hi,

I have two Aerospike clusters in two regions.

  • same usage (300k transactions per second)
  • same application code using Aerospike
  • 5 nodes in the cluster in one region (18 client nodes), 7 in the other one (30 client nodes)
  • Aerospike 3.5.9

The problem is that in the region with 5 nodes, each node has 1.5 Gb/s of traffic to the other Aerospike nodes, while in the region with 7 nodes it is only 150 Mb/s.

Any hint about what could cause this bandwidth difference?

I can provide any information needed.

Regards,

Bertrand

Hi Bertrand,

WARNING

The collectinfo output is NOT anonymized, so you could potentially be exposing information about your private infrastructure. Original message follows:

You could try a very handy tool, collectinfo, which is explained here: https://www.aerospike.com/docs/tools/asmonitor/collectinfo.html

To run it, execute the command: sudo asmonitor -e 'collectinfo'

This generates a tar/zip file containing information about your cluster's behaviour. You could share these tar files from both of your clusters, and we should then be in a better position to help you.

-samir

Hi Bertrand,

I would advise against posting the output of collectinfo on a public forum, as the content isn’t anonymized.

Instead, could you sanitize your aerospike.conf and provide that here? For the IP addresses in the configuration, just say whether they are PUBLIC or PRIVATE.

I suspect that your heartbeats have been configured to use a public interface. In that case, you will need to change the heartbeat configuration to use the private interface to resolve this issue.

See interface-address for details.

Hi,

I finally found something in our app configuration.

The bandwidth difference seems to be related to massive usage of different bin names on the overloaded cluster.

When I changed the config to use a small set of bin names, the traffic dropped to a normal level.
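
The post does not say how the number of bin names was reduced. As a rough sketch only, one possible approach is to keep a small fixed set of bin names and fold every dynamically named value into a single map stored under one bin. The function name pack_dynamic_bins, the bin name "attrs", and the sample record are all hypothetical and not taken from the original application.

```python
def pack_dynamic_bins(record, fixed_bins, overflow_bin="attrs"):
    """Keep only a fixed set of bin names; fold all other keys into one map bin.

    All names here are illustrative assumptions, not the poster's actual code.
    """
    if overflow_bin in fixed_bins:
        raise ValueError("overflow_bin must not be one of the fixed bin names")

    packed = {}    # values that keep their own (fixed) bin name
    overflow = {}  # dynamically named values, stored together as one map
    for name, value in record.items():
        if name in fixed_bins:
            packed[name] = value
        else:
            overflow[name] = value

    if overflow:
        packed[overflow_bin] = overflow
    return packed


# Hypothetical record with per-day keys that would otherwise become bin names.
record = {"user_id": 42, "score_2015_06_01": 3, "score_2015_06_02": 7}
bins = pack_dynamic_bins(record, fixed_bins={"user_id"})
print(sorted(bins))  # ['attrs', 'user_id']
```

With this layout the set of bin names stays constant no matter how many distinct keys the application produces, since the variable keys live inside the map value.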

For the record, one cluster uses multicast (5 nodes), and the other one uses mesh (no multicast available, 7 nodes).

Regards,

Bertrand
