We are generating medical statistics that show the number of medical visits per practice and location. The data comes as PSV (pipe-separated values) files with these columns:
location, practice, physician, num_visits, charges, date_of_service
Each entry is a statistic collected per day, per physician, per location, per practice, and we want to run aggregate statistics on these.
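To make the record shape concrete, here is a minimal sketch of how one line could be parsed in Node. The field order follows the column list above. The sample row, the date format, and the names `FIELDS` and `parsePsvLine` are made up for illustration and are not taken from our real data.

```javascript
// Column order as listed above.
const FIELDS = [
  "location",
  "practice",
  "physician",
  "num_visits",
  "charges",
  "date_of_service",
];

// Parse one pipe-separated line into a plain object.
function parsePsvLine(line) {
  const parts = line.split("|").map((p) => p.trim());
  if (parts.length !== FIELDS.length) {
    throw new Error(`expected ${FIELDS.length} fields, got ${parts.length}`);
  }
  const rec = {};
  FIELDS.forEach((name, i) => {
    rec[name] = parts[i];
  });
  // Numeric columns; date_of_service is kept as a string here.
  rec.num_visits = Number.parseInt(rec.num_visits, 10);
  rec.charges = Number.parseFloat(rec.charges);
  return rec;
}

// Hypothetical sample row, not real data.
const sample = parsePsvLine("Boston|Acme Clinic|Dr. Smith|12|1840.50|2016-03-01");
console.log(sample);
```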
We are going to bulk load these from the PSV files into Aerospike.
My question is about the design of keys and sets.
How do I key each entry here? These are already pre-aggregated values.
Can I store it all under one key (the practice name), with the value being one big PSV blob that I would later run aggregate queries on?
My typical queries might be:
- give me all daily visits for a physician in a practice in a location
- give me all applications of drugs per practice
- give me all charges per physician per location etc.
If this is too complicated to design, I might instead pull all values as JSON by key and do the aggregations in the Node application.
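For that fallback, the application-side aggregation I have in mind would look roughly like the sketch below. It assumes the records have already been fetched and parsed into plain objects. The helper name `aggregate` and the sample rows are hypothetical, and nothing here uses the Aerospike client.

```javascript
// Sum `sumField` over all records, grouped by the given fields.
// Returns a Map from "value1|value2|..." to the running total.
function aggregate(records, groupFields, sumField) {
  const totals = new Map();
  for (const rec of records) {
    const key = groupFields.map((f) => rec[f]).join("|");
    totals.set(key, (totals.get(key) || 0) + rec[sumField]);
  }
  return totals;
}

// Hypothetical rows, not real data.
const rows = [
  { location: "Boston", practice: "Acme Clinic", physician: "Dr. Smith",
    num_visits: 12, charges: 1800, date_of_service: "2016-03-01" },
  { location: "Boston", practice: "Acme Clinic", physician: "Dr. Smith",
    num_visits: 8, charges: 1200, date_of_service: "2016-03-02" },
  { location: "Boston", practice: "Acme Clinic", physician: "Dr. Jones",
    num_visits: 5, charges: 700, date_of_service: "2016-03-01" },
];

// "All charges per physician per location"
const chargesByPhysicianLocation = aggregate(rows, ["physician", "location"], "charges");
console.log(chargesByPhysicianLocation.get("Dr. Smith|Boston")); // 3000

// "Visits per practice"
const visitsByPractice = aggregate(rows, ["practice"], "num_visits");
console.log(visitsByPractice.get("Acme Clinic")); // 25
```

This works for small data sets, but it means pulling every record for a practice over the network before summing, which is part of why I am asking about key and set design.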