I am using the Java Aerospike-core-plugin-3.1.7 for development in Eclipse.
I am using a Vagrant box to run my server on localhost.
What I am trying to do
I am trying to read all the records in a set using the scanAll() method of the AerospikeClient class.
Questions
I have some information and a few questions to ask. Please help.
I can't find the API docs for 3.1.7 online, so I can't look up a few classes that are missing from the latest Java API docs. For example, I don't see a package aerospike.client.command in the latest API, and so I don't see an Executor class in this package.
So to match the latest docs I would have to update my plugin to the latest one, i.e. 3.2.3, but I can't find it online and I am not able to update the existing plugin, which I got from here.
Setting the version issues aside: execution waits indefinitely in the waitTillComplete() method, which is called from the execute() method of the Executor class, which in turn is called from the scanAll() method of the AerospikeClient class. Please help me find out why it waits indefinitely.
But it all started working fine after I restarted my system.
Now I am trying to scan all the records in the namespace to determine “How many pages were hit a given number of times”.
Input data file: a 200 MB file of hourly Wikipedia hit data with 4 bins (1st bin: the language; 2nd bin: the title of the page retrieved; 3rd bin: the number of requests (hit count); 4th bin: the size of the content returned).
To do this:
Inserted the records into the AS cluster, creating sets based on bin 1, i.e. each set is dedicated to one language.
For each distinct page_title (2nd bin): filter all the records with this page title, then aggregate the 3rd bin to get the total hit count by passing the result set stream to a simple map-reduce function for the count.
But this seems to take a lot of time to execute:
To be more specific with the algorithm:
// Parse all the set_names into a HashSet
sets = parseForSets(answer);
// For each set, scan all the records.
// A HashSet has no get(index), so iterate over it directly.
for (String setName : sets) {
    client.scanAll(policy, "test", setName, this);
}
// For each record scanned in a specific set
@Override
public void scanCallback(Key key, Record record) throws AerospikeException {
    // Query for the total count of page_hits (3rd bin) FROM ALL THE SETS,
    // i.e. the ENTIRE NAMESPACE, if the page_title (2nd bin) of the
    // scanned record is not already present in a local HashSet
    boolean added = pageTitlesSet.add(record.getString("page_title"));
    if (added) {
        getCount(sets, clienT, record.getString("page_title"));
    }
}
Looking at all the above, I have the following questions:
Could the algorithm be improved in its design in any way?
Is there a way to read/filter all the records in a namespace on a specific bin value without specifying a set_name?
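On the first question, one possible redesign is a single pass: rather than issuing a separate query per distinct title from inside the callback, each scanned record adds its hit count to an in-memory total keyed by page_title, and the "pages per hit count" figure is derived once the scan returns. The sketch below is plain Java with no Aerospike calls; HitAggregator and its methods are hypothetical names, and accept() is assumed to be called from scanCallback with the page_title bin and the hit-count bin of each record. It uses concurrent collections on the assumption that the callback may be invoked from more than one thread when nodes are scanned in parallel. It also assumes the distinct titles fit in client memory. On the second question, the setName argument of scanAll is, to my knowledge, optional, so passing null should scan the whole namespace; please verify that against the client version in use.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper, not part of the Aerospike client: accumulates hit
// counts per page title so that one scan pass over the data is enough.
class HitAggregator {
    // Concurrent map because callbacks may arrive from several threads.
    private final ConcurrentHashMap<String, LongAdder> totals = new ConcurrentHashMap<>();

    // Intended to be called once per scanned record (assumption: from scanCallback).
    void accept(String pageTitle, long hits) {
        if (pageTitle == null) {
            return; // skip records without a title bin
        }
        totals.computeIfAbsent(pageTitle, t -> new LongAdder()).add(hits);
    }

    // Total hits recorded so far for one title, or 0 if the title was never seen.
    long totalFor(String pageTitle) {
        LongAdder total = totals.get(pageTitle);
        return total == null ? 0L : total.sum();
    }

    // Maps each total hit count to the number of pages that reached that total.
    Map<Long, Integer> pagesPerHitCount() {
        Map<Long, Integer> histogram = new TreeMap<>();
        for (LongAdder total : totals.values()) {
            histogram.merge(total.sum(), 1, Integer::sum);
        }
        return histogram;
    }

    public static void main(String[] args) {
        HitAggregator agg = new HitAggregator();
        agg.accept("Main_Page", 3);
        agg.accept("Main_Page", 2);
        agg.accept("Java", 5);
        agg.accept("Aerospike", 1);
        System.out.println(agg.pagesPerHitCount()); // prints {1=1, 5=2}
    }
}
```

With this shape the callback does only a map update per record, so the cost is one scan of the namespace instead of one query per distinct title.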
@Override
public void scanCallback(Key key, Record record) throws AerospikeException {
    // Query for the total count of page hits if the page_title (2nd bin) of
    // the scanned record is not already present in a local HashSet
    boolean added = pageTitlesSet.add(record.getString("page_title"));
    if (added) {
        getCount(sets, clienT, record.getString("page_title"));
    } else {
        System.out.println();
    }
}
The server is not responding to the scan request. Are there any messages regarding the scan in the server log?
Also, it’s not clear why you are opening/closing the client just to run a scan. A single client instance should remain open for the duration of the program.
Do the scan examples in the examples directory work for you?
The server is not responding to the scan request. Are there any messages regarding the scan in the server log?
There are no messages regarding the scan in the server log.
Also, it’s not clear why you are opening/closing the client just to run a scan.
I am just trying to scan all the records, and in the scanCallback function I am trying to output the records whose bin 2 starts with a specific sequence of characters.
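A minimal sketch of that kind of prefix check, written as plain Java so it can be tried without a server. PrefixFilter and its methods are hypothetical names, not part of the Aerospike client; offer() is assumed to be called from scanCallback with the value of bin 2 (page_title). A concurrent queue is used in case the callback runs on more than one thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper: keeps only the titles that start with a given prefix.
class PrefixFilter {
    private final String prefix;
    // Thread-safe queue so concurrent callbacks can add matches safely.
    private final ConcurrentLinkedQueue<String> matches = new ConcurrentLinkedQueue<>();

    PrefixFilter(String prefix) {
        this.prefix = prefix;
    }

    // Returns true if the title was kept, false if it was skipped.
    boolean offer(String pageTitle) {
        if (pageTitle != null && pageTitle.startsWith(prefix)) {
            matches.add(pageTitle);
            return true;
        }
        return false;
    }

    // Snapshot of the titles kept so far, in insertion order.
    List<String> matches() {
        return new ArrayList<>(matches);
    }

    public static void main(String[] args) {
        PrefixFilter filter = new PrefixFilter("Java");
        filter.offer("Java_(programming_language)");
        filter.offer("Main_Page");
        filter.offer("JavaScript");
        System.out.println(filter.matches());
    }
}
```

Note that String.startsWith is case-sensitive, so a prefix of "java" would not match "JavaScript".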
A single client instance should remain open for the duration of the program.
Sure, I am maintaining only a single instance of the client, and the program ends after the scanAll, so I close the client at the end.
Do the scan examples in the examples directory work for you?