Storing tabular/PSV data in Aerospike

spark
hadoop
query

#1

We are generating medical statistics that show the number of medical visits per practice and location. These are PSV (pipe-separated) files with the following fields:

location|practice|physician|num_visits|charges|date_of_service

Each entry is a statistic collected per day, per physician, per location, and per practice, and we want to run aggregate statistics on these.

We are going to bulk-load these PSV files into Aerospike.
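The bulk-load step would start with a small parser. A minimal Python sketch, assuming the six fields described above (the helper name and sample values are mine, not from the original data):

```python
# Parse one PSV record into a dict of bins before loading into Aerospike.
# Field order follows the header described in the question.
FIELDS = ["location", "practice", "physician", "num_visits", "charges", "date_of_service"]

def parse_psv_line(line):
    """Split a pipe-separated line into a dict, coercing numeric fields."""
    values = line.strip().split("|")
    record = dict(zip(FIELDS, values))
    # Store numeric fields as numbers so aggregations can run on them directly.
    record["num_visits"] = int(record["num_visits"])
    record["charges"] = float(record["charges"])
    return record

rec = parse_psv_line("Boston|Acme Health|Dr. Smith|12|3400.50|2023-05-01")
# rec["physician"] is "Dr. Smith"; rec["num_visits"] is the integer 12
```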

My question is about the design of keys and sets.

How do I key each entry here? These are already pre-aggregated values.

Can I store it all under one key (the practice name), with the values as one big PSV blob that I would later run aggregate queries against?

My typical queries might be:

  • give me all daily visits for a physician in a practice in a location
  • give me all applications of drugs per practice
  • give me all charges per physician per location etc.

If this is too complicated to design for, I might instead pull all the values as JSON by key and do the aggregations in the Node application.
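To illustrate that fallback, one of the example queries above (charges per physician per location) reduces to a group-and-sum once the records are in the application. A minimal sketch (the records and field names are illustrative):

```python
from collections import defaultdict

def charges_per_physician_location(records):
    """Aggregate total charges keyed by (physician, location)."""
    totals = defaultdict(float)
    for r in records:
        totals[(r["physician"], r["location"])] += r["charges"]
    return dict(totals)

records = [
    {"physician": "Dr. Smith", "location": "Boston", "charges": 100.0},
    {"physician": "Dr. Smith", "location": "Boston", "charges": 50.0},
    {"physician": "Dr. Lee", "location": "Austin", "charges": 75.0},
]
# charges_per_physician_location(records)
# → {("Dr. Smith", "Boston"): 150.0, ("Dr. Lee", "Austin"): 75.0}
```

This works at small scale, but it means shipping every record over the network on each query, which is what the composite-key design in the answer below avoids.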


#2

Hi,

There is no simple answer as to the “best way”; it depends on what you want to query, and at what speed and scale. Your data model should reflect how you want to read the data, and at what latency and throughput.

If you need low latency (1–5 ms) and high throughput (on the order of 100k operations per second) for a particular piece of data, you will need to aggregate the data as you write it to Aerospike and store it under a composite key that lets you fetch that data quickly, e.g. doctor-day-location.
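A composite key of this kind is just a deterministic string built from the grouping fields, so a point read fetches the pre-aggregated row directly. A minimal sketch (the separator, namespace, and set names are assumptions, not from the thread):

```python
# Build a deterministic user key from the grouping fields so that
# one get() retrieves the pre-aggregated record for that combination.
def composite_key(physician, day, location):
    """e.g. composite_key("Dr. Smith", "2023-05-01", "Boston")."""
    return ":".join([physician, day, location])

# With the Aerospike Python client, the full key tuple is
# (namespace, set, user_key); the write would look roughly like:
#
#   key = ("stats", "daily_visits",
#          composite_key("Dr. Smith", "2023-05-01", "Boston"))
#   client.put(key, {"num_visits": 12, "charges": 3400.50})

k = composite_key("Dr. Smith", "2023-05-01", "Boston")
# k is "Dr. Smith:2023-05-01:Boston"
```

The choice of separator only matters if the fields themselves can contain it; a character guaranteed absent from the data avoids ambiguous keys.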

If you want a statistical analysis over a period of time, and the query can take a few seconds to several minutes, then you can store the data in a less structured format and run Aerospike aggregations on it, or even run Hadoop or Spark directly on the Aerospike data.

Regards


#3

For more on this topic, please see the post from the original poster, and the community’s answers, on the Stack Overflow forum.