High latency with geo filtering (point in a polygon) at high throughput

geoindex
latency
secondary
query

#1

We are using Aerospike’s query feature with a geo point-in-polygon filter. We have created a GEO2DSPHERE secondary index on the bin, but a single call takes ~150 ms. When run at a throughput of ~10k with a concurrency of ~80, the response time reaches ~2 seconds, and it keeps increasing at higher load. The set we are querying has no more than 50 records, and the geo bin of each record contains up to 50 polygons. Aerospike machine configuration: 3 boxes (2 cores, 8 GB RAM, 200 GB dp2 disk).

Please let us know if we are missing some configuration, or if we can tune our Aerospike cluster better.


#2
  • Is the latency also appearing inside the Aerospike histograms? (asadm -e "show latency" or the AMC latency tab)
  • Do you have multiple clients making the same call? Are the latencies cumulative (does running 1x query from 2 boxes take 150ms, or does running 1x query from 2 boxes take 300ms?)

#3
  1. Is the latency also appearing inside the Aerospike histograms? (asadm -e "show latency" or the AMC latency tab) - The latency rises as we increase concurrency.

    Concurrency 2, query latency:

    Node               Time Span           Ops/Sec  >1Ms   >8Ms   >64Ms
    172.28.128.3:3000  12:20:54->12:21:04  75.2     100.0  100.0  1.46
    Number of rows: 1

    Concurrency 30, query latency:

    Node               Time Span           Ops/Sec  >1Ms   >8Ms   >64Ms
    172.28.128.3:3000  11:33:41->11:33:51  43.1     100.0  100.0  100.0
    Number of rows: 1

  2. Do you have multiple clients making the same call? Are the latencies cumulative? - Yes, we have multiple clients making the same call. The test searches for the same latitude/longitude point in the same keyspace and the same bin. Response time increases as the number of concurrent requests increases.
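
As a rough cross-check of the two runs above, Little's law (requests in flight = throughput x mean latency) gives an implied mean latency per query. This is only a sketch: it assumes every client thread keeps exactly one request in flight at all times and that the node shown served the whole load. The function name is made up for illustration.

```python
def implied_latency_ms(concurrency: int, ops_per_sec: float) -> float:
    """Little's law rearranged: latency = in-flight requests / throughput."""
    return concurrency / ops_per_sec * 1000.0


# (concurrency, server-side query ops/sec) taken from the two runs above
runs = [(2, 75.2), (30, 43.1)]

for concurrency, ops_per_sec in runs:
    latency = implied_latency_ms(concurrency, ops_per_sec)
    print(f"concurrency={concurrency:>2}  ops/sec={ops_per_sec:>5}  "
          f"implied mean latency ~ {latency:.0f} ms")
```

Under those assumptions the implied latency is roughly 27 ms at concurrency 2 and roughly 696 ms at concurrency 30, which matches the histogram buckets. Throughput also drops from 75.2 to 43.1 ops/sec as concurrency rises, which is what one would expect if the node is already saturated.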


#4

In general, the GEO2DSPHERE index may return a lot of false positives, and the decision on whether a polygon actually contains a specific point is then made post-retrieval in a fairly CPU-intensive way. Larger regions are more susceptible to this outcome. 50 regions is a tiny number, and they’re likely to be very big.
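
To illustrate the idea of a false positive followed by an exact check, here is a minimal sketch in plain Python. It is not how Aerospike implements the index: a bounding box stands in for the coarse index covering, coordinates are treated as planar, and the L-shaped region is invented for the example.

```python
def bounding_box(polygon):
    """Coarse stand-in for an index covering: (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bounding_box(point, box):
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Exact test by ray casting: count edges crossed by a ray going east."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical L-shaped region: most of its bounding box lies outside it.
region = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10)]
box = bounding_box(region)

for point in [(1, 8), (8, 8)]:
    print(point, "coarse match:", in_bounding_box(point, box),
          "exact match:", in_polygon(point, region))
```

The point (8, 8) passes the coarse check but fails the exact one, so the work spent on the exact test is wasted. The exact test also visits every edge, so its cost grows with the number of vertices in the polygon.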

One thing you should do is chop those polygons into smaller cells. That will lead to less overlap and a shorter computation time, lowering your latency.
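
One way to chop a region into smaller cells is to clip it against a regular grid. The sketch below uses Sutherland-Hodgman clipping in plain Python. It assumes simple polygons without holes, treats longitude/latitude as planar coordinates, and ignores the antimeridian; all function names and the cell size are illustrative. A geometry library would be a more robust choice for real data.

```python
import math


def polygon_area(ring):
    """Unsigned shoelace area of an open ring of (x, y) tuples."""
    total = 0.0
    for i, (x1, y1) in enumerate(ring):
        x2, y2 = ring[(i + 1) % len(ring)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_half_plane(ring, inside, intersect):
    """One Sutherland-Hodgman pass: keep the part of ring where inside() holds."""
    out = []
    for i, cur in enumerate(ring):
        prev = ring[i - 1]
        if inside(cur):
            if not inside(prev):
                out.append(intersect(prev, cur))
            out.append(cur)
        elif inside(prev):
            out.append(intersect(prev, cur))
    return out


def clip_to_cell(ring, min_x, min_y, max_x, max_y):
    """Clip ring against an axis-aligned rectangle, one side at a time."""
    def at_x(x):
        return lambda p, q: (x, p[1] + (q[1] - p[1]) * (x - p[0]) / (q[0] - p[0]))

    def at_y(y):
        return lambda p, q: (p[0] + (q[0] - p[0]) * (y - p[1]) / (q[1] - p[1]), y)

    passes = [
        (lambda p: p[0] >= min_x, at_x(min_x)),
        (lambda p: p[0] <= max_x, at_x(max_x)),
        (lambda p: p[1] >= min_y, at_y(min_y)),
        (lambda p: p[1] <= max_y, at_y(max_y)),
    ]
    for inside, intersect in passes:
        if not ring:
            break
        ring = clip_half_plane(ring, inside, intersect)
    return ring


def split_into_cells(ring, cell_size):
    """Cut ring into pieces, one per grid cell of its bounding box."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    x0, y0 = min(xs), min(ys)
    nx = math.ceil((max(xs) - x0) / cell_size)
    ny = math.ceil((max(ys) - y0) / cell_size)
    pieces = []
    for i in range(nx):
        for j in range(ny):
            piece = clip_to_cell(ring,
                                 x0 + i * cell_size, y0 + j * cell_size,
                                 x0 + (i + 1) * cell_size, y0 + (j + 1) * cell_size)
            # Drop empty or zero-area leftovers from cells the region misses.
            if len(piece) >= 3 and polygon_area(piece) > 1e-12:
                pieces.append(piece)
    return pieces


def to_geojson(ring):
    """GeoJSON Polygon with a closed exterior ring (first point repeated)."""
    closed = [list(p) for p in ring] + [list(ring[0])]
    return {"type": "Polygon", "coordinates": [closed]}


region = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 10), (0, 10)]
pieces = split_into_cells(region, cell_size=5)
print(len(pieces), "pieces, total area", sum(polygon_area(p) for p in pieces))
```

Each piece can then be converted with to_geojson and stored in place of the original large polygon. How the pieces map back to the original region (for example, a shared region id bin) depends on your data model.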

Beyond that, there is configuration tuning and making sure you are on an appropriate version. Which release of Aerospike CE are you using? Can you post the configuration you’re using, especially the service section and the namespace configuration for this specific namespace? (The entire config would help, if you’re comfortable with that.) I’m not sure what a dp2 disk is. Are you on bare metal or in a virtualized environment like Amazon EC2?