FAQ - Should parameters be set on the fly with the Aerospike Spark Connector?

The Aerospike Knowledge Base has moved to https://support.aerospike.com. Content on https://discuss.aerospike.com is being migrated to either https://support.aerospike.com or https://docs.aerospike.com. Maintenance on articles stored in this repository ceased on December 31st 2022 and this article may be stale. If you have any questions, please do not hesitate to raise a case via https://support.aerospike.com.

FAQ - Should parameters be set on the fly with the Aerospike Spark Connector?

Detail

There are certain Aerospike parameters that can be passed to a Spark session at any stage of the session lifecycle, as long as this happens before any Aerospike-specific call is made. Examples of such parameters are aerospike.seedhost, aerospike.port and aerospike.namespace. Should these be treated as parameters that can be set on the fly?

Answer

While it is possible to set these parameters at runtime, the pool of clients is instantiated from these settings and keyed by the hosts. When the settings are changed, the pool keeps filling with additional clients, which may not be desirable. For this reason, setting these parameters on the fly, while possible, is not recommended.
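
To illustrate, here is a minimal sketch (the host names, set name and the data source name com.aerospike.spark.sql are assumptions for illustration, not values from this article): each distinct seed host the session is pointed at results in another client being created and cached in the pool.

     import org.apache.spark.sql.SparkSession

     // Minimal sketch only - host names, set name and data source name are assumed.
     val spark = SparkSession.builder()
       .appName("aerospike-connector-sketch")
       .getOrCreate()

     // First read: a client keyed on hostA is created and cached in the pool.
     spark.conf.set("aerospike.seedhost", "hostA")
     val dfA = spark.read
       .format("com.aerospike.spark.sql")
       .option("aerospike.set", "exampleSet")
       .load()

     // Changing the seed host on the fly: a second client keyed on hostB is
     // created and added to the pool alongside the first one.
     spark.conf.set("aerospike.seedhost", "hostB")
     val dfB = spark.read
       .format("com.aerospike.spark.sql")
       .option("aerospike.set", "exampleSet")
       .load()

Setting the connection parameters once, when the session is initialized, avoids this growth of the pool.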

Keywords

SPARK CONNECTOR PARAMETER RUNTIME

Timestamp

November 2019

I fail to see how these parameters can be dynamic. These are all initialization parameters.

     // Assumes an existing SparkSession `spark` and connection variables
     // (dbHost, dbPort, namespace, dbSet, dbConnection, dbPassword) defined elsewhere.
     val sqlContext = spark.sqlContext
     spark.conf.set("aerospike.seedhost", dbHost)
     spark.conf.set("aerospike.port", dbPort)
     spark.conf.set("aerospike.namespace", namespace)
     spark.conf.set("aerospike.set", dbSet)
     spark.conf.set("aerospike.keyPath", "/etc/aerospike/features.conf")
     spark.conf.set("aerospike.user", dbConnection)
     spark.conf.set("aerospike.password", dbPassword)
     spark.conf.set("aerospike.sendKey", true)

Also, reference has to be made to the features.conf file (assuming this is the Enterprise Edition), which is specific to each Aerospike instance and has to exist on every host that is running Spark executors. Otherwise, the Spark job will fail.
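
As a small, hedged sketch (the path is the one used in the snippet above; the check only covers the driver, not the executor hosts), the job can fail fast with a clear message if the feature key file is missing locally, rather than failing later inside the connector:

     import java.nio.file.{Files, Paths}

     val keyPath = "/etc/aerospike/features.conf"

     // Fail fast on the driver if the feature key file is missing. This only
     // checks the local machine; the same path must also exist on every host
     // running Spark executors.
     require(Files.exists(Paths.get(keyPath)),
       s"Feature key file not found at $keyPath")

     spark.conf.set("aerospike.keyPath", keyPath)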