High server CPU usage on google cloud

I’m totally new to aerospike, but I’ve been playing around with it for a month or two on Google Compute Engine (GCE), using a single local SSD. Right now the server is just running on a single node with replication factor 1. Here’s my config:

service {
user root
group root
paxos-single-replica-limit 1
pidfile /var/run/aerospike/asd.pid
service-threads 8
transaction-queues 8
transaction-threads-per-queue 8
proto-fd-max 15000
}

The hardware is an ‘n1-highmem-4’ (4 cores, 26GB RAM).

Every time I examine CPU usage (via ‘top’, for instance), asd is pegged at 100%, more if I have an operation running. Is this expected? It’s just me running a few requests every now and then, so there should be basically zero load. I’ve restarted the server once or twice but I this doesn’t seem to help. Any ideas for what the issue might be or tips for what to investigate?

You are right in that there should be almost zero load if there are no operations running.

  1. Can you run the “top” command and then press “1” - so that it shows the individual CPU details. It will be great if you attach the screenshot.
  2. Would like to see if there is anything unusual in the log (/var/log/aerospike/aerospike.log) Can you put a snippet here.
  3. Have you done any custom configurations (say “nsup-period”)? Can you share the complete config file.

Hi there - looks like I can’t upload an image since I’m a “new user”, but here’s the text of ‘top’:

top - 21:12:06 up 36 days,  6:15,  5 users,  load average: 1.01, 1.03, 1.00
Tasks: 131 total,   1 running, 130 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.7%us,  0.3%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 97.7%us,  2.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  26676756k total, 23843508k used,  2833248k free,   180264k buffers
Swap:        0k total,        0k used,        0k free,  4398540k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6581 root      20   0 17.0g  14g 1172 S 101.4 55.8  36478:46 asd
 1388 root      20   0 90188 8528 2472 S  0.3  0.0   4:56.10 manage_addresse
13394 brendano  20   0 15036 1236  936 R  0.3  0.0   0:00.30 top
    1 root      20   0 19336 1416 1064 S  0.0  0.0   0:08.13 init

And here are the last 100 lines of /var/log/aerospike/aerospike.log …

Apr 13 2015 21:13:34 GMT: INFO (namespace): (namespace.c::444) {annotation} hwm_breached true (memory), stop_writes false, memory sz:14760036032 (14760036032 + 0) hwm:10307921920 sw:15461881856, disk sz:82695996544 hwm:201326592000
Apr 13 2015 21:13:35 GMT: INFO (drv_ssd): (drv_ssd.c::2398) device /dev/sdc: used 82695996544, contig-free 289305M (2314441 wblocks), swb-free 2, n-w 0, w-q 0 w-tot 3442870 (0.0/s), defrag-q 0 defrag-tot 2684002 (0.0/s)
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4565)  system memory: free 8126492kb ( 30 percent free )
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4573)  migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 230625563 ::: sub_objects 0
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4581)  rec refs 230649241 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4587)  replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4597)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (6, 639, 633) : hb (0, 0, 0) : fab (16, 16, 0)
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4599)    heartbeat_received: self 20538694 : foreign 0
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4600)    heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 0 rc 0
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4613)    tree_counts: nsup 1 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4654) namespace annotation: disk inuse: 82695996544 memory inuse: 14760036032 (bytes) sindex memory inuse: 0 (bytes) avail pct 75 cache-read pct 0.00
Apr 13 2015 21:13:42 GMT: INFO (info): (thr_info.c::4674)    partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::137) histogram dump: reads (643 total) msec
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::163)  (00: 0000000642) (01: 0000000001)
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (793872313 total) msec
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::154)  (00: 0791094034) (01: 0002114849) (02: 0000538178) (03: 0000083753)
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::154)  (04: 0000039061) (05: 0000001434) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (796381248 total) msec
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::154)  (00: 0793607775) (01: 0002110914) (02: 0000537374) (03: 0000083699)
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::154)  (04: 0000039050) (05: 0000001432) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec
Apr 13 2015 21:13:42 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4565)  system memory: free 8126376kb ( 30 percent free )
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4573)  migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 230625563 ::: sub_objects 0
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4581)  rec refs 230633145 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4587)  replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4597)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (6, 639, 633) : hb (0, 0, 0) : fab (16, 16, 0)
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4599)    heartbeat_received: self 20538760 : foreign 0
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4600)    heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 0 rc 0
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4613)    tree_counts: nsup 1 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4654) namespace annotation: disk inuse: 82695996544 memory inuse: 14760036032 (bytes) sindex memory inuse: 0 (bytes) avail pct 75 cache-read pct 0.00
Apr 13 2015 21:13:52 GMT: INFO (info): (thr_info.c::4674)    partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::137) histogram dump: reads (643 total) msec
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::163)  (00: 0000000642) (01: 0000000001)
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (793872313 total) msec
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::154)  (00: 0791094034) (01: 0002114849) (02: 0000538178) (03: 0000083753)
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::154)  (04: 0000039061) (05: 0000001434) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (796381248 total) msec
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::154)  (00: 0793607775) (01: 0002110914) (02: 0000537374) (03: 0000083699)
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::154)  (04: 0000039050) (05: 0000001432) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec
Apr 13 2015 21:13:52 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count
Apr 13 2015 21:13:55 GMT: INFO (drv_ssd): (drv_ssd.c::2398) device /dev/sdc: used 82695996544, contig-free 289305M (2314441 wblocks), swb-free 2, n-w 0, w-q 0 w-tot 3442870 (0.0/s), defrag-q 0 defrag-tot 2684002 (0.0/s)
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4565)  system memory: free 8126940kb ( 30 percent free )
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4573)  migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 230625563 ::: sub_objects 0
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4581)  rec refs 230657044 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4587)  replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4597)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (6, 639, 633) : hb (0, 0, 0) : fab (16, 16, 0)
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4599)    heartbeat_received: self 20538827 : foreign 0
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4600)    heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 0 rc 0
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4613)    tree_counts: nsup 1 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4654) namespace annotation: disk inuse: 82695996544 memory inuse: 14760036032 (bytes) sindex memory inuse: 0 (bytes) avail pct 75 cache-read pct 0.00
Apr 13 2015 21:14:02 GMT: INFO (info): (thr_info.c::4674)    partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::137) histogram dump: reads (643 total) msec
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::163)  (00: 0000000642) (01: 0000000001)
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (793872313 total) msec
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::154)  (00: 0791094034) (01: 0002114849) (02: 0000538178) (03: 0000083753)
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::154)  (04: 0000039061) (05: 0000001434) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (796381248 total) msec
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::154)  (00: 0793607775) (01: 0002110914) (02: 0000537374) (03: 0000083699)
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::154)  (04: 0000039050) (05: 0000001432) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec
Apr 13 2015 21:14:02 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4565)  system memory: free 8126728kb ( 30 percent free )
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4573)  migrates in progress ( 0 , 0 ) ::: ClusterSize 1 ::: objects 230625563 ::: sub_objects 0
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4581)  rec refs 230664854 ::: rec locks 0 ::: trees 0 ::: wr reqs 0 ::: mig tx 0 ::: mig rx 0
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4587)  replica errs :: null 0 non-null 0 ::: sync copy errs :: node 0 :: master 0
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4597)    trans_in_progress: wr 0 prox 0 wait 0 ::: q 0 ::: bq 0 ::: iq 0 ::: dq 0 : fds - proto (6, 639, 633) : hb (0, 0, 0) : fab (16, 16, 0)
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4599)    heartbeat_received: self 20538893 : foreign 0
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4600)    heartbeat_stats: bt 0 bf 0 nt 0 ni 0 nn 0 nnir 0 nal 0 sf1 0 sf2 0 sf3 0 sf4 0 sf5 0 sf6 0 mrf 0 eh 0 efd 0 efa 0 um 0 mcf 0 rc 0
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4613)    tree_counts: nsup 1 scan 0 batch 0 dup 0 wprocess 0 migrx 0 migtx 0 ssdr 0 ssdw 0 rw 0
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4654) namespace annotation: disk inuse: 82695996544 memory inuse: 14760036032 (bytes) sindex memory inuse: 0 (bytes) avail pct 75 cache-read pct 0.00
Apr 13 2015 21:14:12 GMT: INFO (info): (thr_info.c::4674)    partitions: actual 4096 sync 0 desync 0 zombie 0 wait 0 absent 0
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::137) histogram dump: reads (643 total) msec
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::163)  (00: 0000000642) (01: 0000000001)
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::137) histogram dump: writes_master (793872313 total) msec
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::154)  (00: 0791094034) (01: 0002114849) (02: 0000538178) (03: 0000083753)
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::154)  (04: 0000039061) (05: 0000001434) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::137) histogram dump: proxy (0 total) msec
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::137) histogram dump: writes_reply (796381248 total) msec
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::154)  (00: 0793607775) (01: 0002110914) (02: 0000537374) (03: 0000083699)
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::154)  (04: 0000039050) (05: 0000001432) (06: 0000000558) (07: 0000000232)
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::163)  (08: 0000000193) (09: 0000000021)
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::137) histogram dump: udf (0 total) msec
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::137) histogram dump: query (0 total) msec
Apr 13 2015 21:14:12 GMT: INFO (info): (hist.c::137) histogram dump: query_rec_count (0 total) count
Apr 13 2015 21:14:15 GMT: INFO (drv_ssd): (drv_ssd.c::2398) device /dev/sdc: used 82695996544, contig-free 289305M (2314441 wblocks), swb-free 2, n-w 0, w-q 0 w-tot 3442870 (0.0/s), defrag-q 0 defrag-tot 2684002 (0.0/s)

Lastly, I haven’t done any custom configurations, but the initial setup was using GCE’s “click to deploy” feature - maybe there are some funky defaults in there?

Thanks for the help!

It appears that the high-water-mark has been reached for you namespace, “annotation” - from the first line of your log extract.

Get the stats for your namespace using:

asinfo -h <host ip> -v 'namespace/<namespace name>'

and check the hwm-breached parameter.

Read more at: http://www.aerospike.com/docs/operations/manage/storage

Great, thank you, I hadn’t noticed that. My objects are set to never expire, so when the HWM is passed, does asd kick off a process to try to evict objects, but there are none that can be evicted, so it just searches indefinitely without finding anything? Thanks again, this is really helpful - just trying to understand how things work here.

1 Like