Python client write error - Node not found for partition

I have a 3 node Aerospike community cluster on AWS with the following config

  • instance_type - default = c5d.large
  • cluster_hosts_per_az - default = 1
  • partitions_per_device - default = 4 ( nvme volumes will automatically be partitioned for you )
  • enterprise - default = false to allow single click setup of Community. If true, a feature key location must be specified ( see below )
  • encryption_at_rest - default = false. If true aerospike.conf will be appropriately modified and a key file generated
  • tls_enabled - default = false. If true aerospike.conf will be appropriately modified and all certificates appropriately located. Connecting clients will be appropriately configured.
  • strong_consistency - default = false. If true aerospike.conf will be appropriately modified. You will currently have to set your own roster.
  • monitoring_enabled - default = false. If true the Aerospike Prometheus agent will be installed, configured and started on the cluster nodes.
  • aerospike_distribution - default = el6. Determines the distribution used.
  • aerospike_version - default = latest
  • ami_locator_string - the latest version of the AMZN2 AMI is used ( dynamically looked up). Other builds can be used by modifying this string.
  • replication_factor - default = 2
  • aerospike_mem_pct - fraction of available memory to allocate to the ‘test namespace’. Default = 80%
  • feature_key - path for an Enterprise feature key. Undefined by default so the setup works out of the box.

I have a t2.medium machine in the same region running the Python client. The client is a FastAPI web app. The object I am writing is a simple dict, and I create a unique key for every write:

@app.post('/')
def post_data(item: Item):
    pk = None
    try:
        i = item.dict()
        pk = str(shortuuid.uuid())
        key = ('test', 'pydemo', pk)
        client.put(key, i)
        return {"message": "Write success", "key": pk}
    except Exception as e:
        print("Error while inserting pk {} ::: {}".format(pk, e))
        # Return the message text rather than the exception object itself.
        return {"error": str(e)}

Individual writes and reads work fine, but once I start sending simultaneous writes with Apache Bench, I get the following error:

(-8, 'Node not found for partition test:3023', 'src/main/client/put.c', 111, False)

Any help is appreciated. I am not closing the connection anywhere. Here is the full code:

from fastapi import FastAPI
import aerospike
from pydantic import BaseModel
import shortuuid

class Item(BaseModel):
    name: str
    description: str = None
    price: float
    tax: float = None

app = FastAPI()

aero_config = {"hosts": [ ("100.26.108.213", 3000) ] }

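try:
    client = aerospike.client(aero_config).connect()
except Exception as e:
    # Without a connection 'client' is undefined and every request would fail,
    # so report the problem and stop instead of continuing.
    print("Could not connect to Aerospike: {}".format(e))
    raise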

@app.post('/')
def post_data(item: Item):
    pk = None
    try:
        i = item.dict()
        pk = str(shortuuid.uuid())
        key = ('test', 'pydemo', pk)
        client.put(key, i)
        return {"message": "Write success", "key": pk}
    except Exception as e:
        print("Error while inserting pk {} ::: {}".format(pk, e))
        # Return the message text rather than the exception object itself.
        return {"error": str(e)}

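@app.get('/')
def read_root(pk: str):
    try:
        # Read from the same set ('pydemo') that post_data writes to.
        key = ('test', 'pydemo', pk)
        _, _, result = client.get(key)
        return {"message": result}
    except Exception as e:
        print("Error while reading pk {} ::: {}".format(pk, e))
        return {"message": "error in read"}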

TIL: on any public cloud infra, when clients connect through public IPs, it's best to set alternate-access-address on every Aerospike node to that node's public IP and tell the client to use it. I don't know yet why this is the case and will dig in a bit more. So I changed the config on all 3 machines to

service {
        address eth0
        port 3000
        alternate-access-address <public host IP>
}

and in the client config I set 'use_services_alternate': True
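
For reference, the client side of that change could look roughly like the sketch below. The helper names build_client_config and connect are made up for illustration; the only Aerospike-specific pieces are the "hosts" and "use_services_alternate" config keys and the aerospike.client(...).connect() call already used in the question.

```python
def build_client_config(seed_hosts, use_alternate=True):
    """Return an Aerospike client config dict for the given (host, port) seeds.

    build_client_config is a hypothetical helper, not part of the aerospike package.
    """
    return {
        "hosts": [(host, int(port)) for host, port in seed_hosts],
        # Ask each node for its alternate-access-address (the public IP here)
        # instead of the address it advertises by default.
        "use_services_alternate": use_alternate,
    }


def connect(config):
    """Connect using the config; aerospike is imported lazily so the
    config helper above works even where the package is not installed."""
    import aerospike
    return aerospike.client(config).connect()


if __name__ == "__main__":
    # Seed address taken from the question; any node's public IP should do.
    print(build_client_config([("100.26.108.213", 3000)]))
```

In the FastAPI app above, this would replace the aero_config dict and the connect call at module level.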

This seems to work. At least I am no longer seeing the partition errors that I saw earlier.

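This is because the client uses the seed address only for the first connection. It then asks the cluster for the address of every node, and by default each node advertises its private IP. If the client cannot see the private IPs, it needs the public ones, and alternate-access-address lets each node advertise its public IP to clients that set use_services_alternate.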
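
A quick way to check which addresses the client machine can actually see is to try a plain TCP connection to the service port from that machine. This is only a rough sketch using the standard library; is_reachable is a made-up helper name, and a successful connection only shows the port is open, not that the cluster is healthy.

```python
import socket


def is_reachable(host, port=3000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running is_reachable("100.26.108.213") from the client machine, and then the same call with each node's private IP, shows which set of addresses the client can use. If only the public IPs answer, the nodes have to advertise them through alternate-access-address.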