NullPointerException when reading via aerospark

I’m very new to aerospike (and fairly new to spark) and am trying to get the two working together, but having some issues…

Since I was originally building the aerospark jar on a machine away from the db server, I built it with tests disabled. When I ran my test code (just the code below, from the project readme) I was able to save data successfully, but I get a null pointer exception when trying to read it back (with aql I can verify that the data has indeed been saved). The code as it stands is:

Save Data:

import com.aerospike.client.{Bin, Key, Value}
import com.aerospike.spark.sql.AerospikeConnection

val TEST_COUNT = 100
val namespace = "test"
val client = AerospikeConnection.getClient("localhost", 3000)
Value.UseDoubleType = true
// Write TEST_COUNT records into the "rdd-test" set
for (i <- 1 to TEST_COUNT) {
  val key = new Key(namespace, "rdd-test", "rdd-test-"+i)
  client.put(null, key,
     new Bin("one", i),
     new Bin("two", "two:"+i),
     new Bin("three", i.toDouble)
  )
}

and try to read something back:

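val thingsDF = sqlContext.read.
        format("com.aerospike.spark.sql").
        option("aerospike.seedhost", "").
        option("aerospike.port", "3000").
        option("aerospike.namespace", namespace).
        option("aerospike.set", "rdd-test").
        load
// Register the DataFrame so the SQL below can refer to it as "things"
thingsDF.registerTempTable("things")
val filteredThings = sqlContext.sql("select * from things where one = 55")
val thing = filteredThings.first()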

Figuring that I had better validate the basic setup, I tried building the jar file on the database machine with tests enabled. It turns out I get the same exception there. Here’s the first one from the log:

[info] AerospikeRelationTest:
[info] Aerospike Relation
[info] - should create test data
[info] - should create an AerospikeRelation *** FAILED ***
[info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NullPointerException
[info] at com.aerospike.helper.model.Set.setInfo(
[info] at com.aerospike.helper.model.Set.(
[info] at com.aerospike.helper.model.Namespace.mergeSet(
[info] at com.aerospike.helper.query.QueryEngine.refreshNamespaceData(
[info] at com.aerospike.helper.query.QueryEngine.refreshNamespaces(
[info] at com.aerospike.helper.query.QueryEngine.refreshCluster(
[info] at com.aerospike.helper.query.QueryEngine.setClient(
[info] at com.aerospike.helper.query.QueryEngine.(
[info] at com.aerospike.spark.sql.AerospikeConnection$$anonfun$getQueryEngine$1.apply(AerospikeConnection.scala:22)
[info] at com.aerospike.spark.sql.AerospikeConnection$$anonfun$getQueryEngine$1.apply(AerospikeConnection.scala:21)
[info] at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
[info] at scala.collection.AbstractMap.getOrElse(Map.scala:58)
[info] at com.aerospike.spark.sql.AerospikeConnection$.getQueryEngine(AerospikeConnection.scala:21)
[info] at com.aerospike.spark.sql.KeyRecordRDD.compute(KeyRecordRDD.scala:77)
[info] at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
[info] at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
[info] at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
[info] at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
[info] at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
[info] at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
[info] at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
[info] at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
[info] at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
[info] at
[info] at org.apache.spark.executor.Executor$
[info] at java.util.concurrent.ThreadPoolExecutor.runWorker(
[info] at java.util.concurrent.ThreadPoolExecutor$
[info] at
[info]
[info] Driver stacktrace:
[info] at$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
[info] at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
[info] at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
[info] at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
[info] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
[info] at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
[info] at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
[info] at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
[info] at scala.Option.foreach(Option.scala:236)
[info] at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
[info] …
[info] Cause: java.lang.NullPointerException:
[info] at com.aerospike.helper.model.Set.setInfo(
[info] at com.aerospike.helper.model.Set.(
[info] at com.aerospike.helper.model.Namespace.mergeSet(
[info] at com.aerospike.helper.query.QueryEngine.refreshNamespaceData(
[info] at com.aerospike.helper.query.QueryEngine.refreshNamespaces(
[info] at com.aerospike.helper.query.QueryEngine.refreshCluster(
[info] at com.aerospike.helper.query.QueryEngine.setClient(
[info] at com.aerospike.helper.query.QueryEngine.(
[info] at com.aerospike.spark.sql.AerospikeConnection$$anonfun$getQueryEngine$1.apply(AerospikeConnection.scala:22)
[info] at com.aerospike.spark.sql.AerospikeConnection$$anonfun$getQueryEngine$1.apply(AerospikeConnection.scala:21)
[info] …

I’m assuming that there must be something wrong in my environment. I’m using sbt 0.13, scala 2.10 and java 1.8.0_91. I’m seeing the same issue on both Mac OS X and linux (ubuntu 14.04). Might anyone have any suggestions as to what my issue might be?

Are you using aerospike/aerospark, or the older one?

I’m using aerospike/aerospark.


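After some digging I found that the issue looks to be in the aerospike helper. In the values passed down into the helper's Set class, the name of the set looks to be keyed via “set” rather than “set_name”, so the lookup of “set_name” comes back null and dereferencing it throws the NullPointerException.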

This change fixes the aerospark test suite as well as my own code (well, until I hit the next bug - which at this point looks to be clearly mine).

diff --git a/java/src/main/java/com/aerospike/helper/model/ b/java/src/main/java/com/aerospike/helper/model/
index 1339d60..8a7c892 100755
--- a/java/src/main/java/com/aerospike/helper/model/
+++ b/java/src/main/java/com/aerospike/helper/model/
@@ -72,7 +72,7 @@ public class Set {
                 storedValue.value = value;
             }
         }
-         = (String) values.get("set_name").value;
+         = (String) values.get("set").value;
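
For anyone who wants to see the failure mode in isolation, here is a minimal Java sketch of it. It is not the helper's actual code: StoredValue and SetNameLookup are hypothetical stand-ins, and the tolerant lookup that tries both keys is only one possible defensive variant, not the change shown in the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the helper's value holder; only the `value`
// field that appears in the diff is modelled here.
class StoredValue {
    final Object value;

    StoredValue(Object value) {
        this.value = value;
    }
}

class SetNameLookup {
    // Mirrors the failing pattern: Map.get returns null for a missing key,
    // so reading .value from the result throws NullPointerException.
    static String strictLookup(Map<String, StoredValue> values, String key) {
        return (String) values.get(key).value;
    }

    // Defensive variant (an assumption, not the submitted patch): try each
    // candidate key in order and return null if none of them is present.
    static String tolerantLookup(Map<String, StoredValue> values, String... keys) {
        for (String key : keys) {
            StoredValue stored = values.get(key);
            if (stored != null && stored.value != null) {
                return String.valueOf(stored.value);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, StoredValue> values = new HashMap<>();
        // The set name arrives under "set", as observed in this thread.
        values.put("set", new StoredValue("rdd-test"));

        try {
            strictLookup(values, "set_name");
        } catch (NullPointerException e) {
            System.out.println("lookup of set_name threw NullPointerException");
        }
        System.out.println(strictLookup(values, "set"));
        System.out.println(tolerantLookup(values, "set_name", "set"));
    }
}
```

Trying both keys would keep such a lookup working whichever name is supplied, at the cost of hiding which one was actually sent.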


I’ll submit a pull request for the above change.

Thanks, appreciate the pull request!

A NullPointerException is thrown when code uses a null reference as though it referred to an object. Calling a method on a null reference or trying to access a field of a null reference will trigger it.
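
Both triggers can be shown in a few lines of Java. This is only an illustrative sketch; Holder and NullDereferenceDemo are hypothetical names and have nothing to do with the aerospike helper.

```java
class NullDereferenceDemo {
    // Hypothetical holder type used only for this illustration.
    static class Holder {
        String name = "rdd-test";
    }

    // Calls a method on its argument; throws if the argument is null.
    static int lengthOf(String text) {
        return text.length();
    }

    // Reads a field of its argument; throws if the argument is null.
    static String nameOf(Holder holder) {
        return holder.name;
    }

    // Returns true if running the action raises a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("method call on null threw: " + throwsNpe(() -> lengthOf(null)));
        System.out.println("field access on null threw: " + throwsNpe(() -> nameOf(null)));
        System.out.println("field access on object threw: " + throwsNpe(() -> nameOf(new Holder())));
    }
}
```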
