Aerospike GraphDB

Hi, I am trying to use your graph DB for a project that uses Apache Jena to load RDF files in text format. The first thing I want to do is convert the RDF TTL (wikidump) format to CSV. Is there any tool I can use to do that? Please advise. Thanks.

Hello,

There is no off-the-shelf RDF-to-CSV converter that I know of. However, the transformation code would be fairly easy to write. The theory is that you are converting an edge list (RDF) to an adjacency list (CSV), where the edge list is “long” (one line per statement) and the adjacency list is “wide” (one row per subject, one column per predicate). Assuming your RDF data set can fit in memory, a simple algorithm would be:

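adjacency_list = [:] // maps each subject to its row
for (statement in edge_list) { // statement is the triple (subject--predicate-->object)
  // row is the map [<column:string>:<entry:any>]; get(key, default) creates it the first time a subject is seen
  row = adjacency_list.get(statement.subject, [:]);
  row[statement.predicate] = statement.object; // a repeated predicate overwrites the earlier value
}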

You would then iterate through your adjacency_list data structure and write each row to file where the row entries are comma-separated.
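The two steps above (group the statements by subject, then write one comma-separated row per subject) could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the triples are already parsed into plain (subject, predicate, object) string tuples, the names `triples_to_rows` and `write_csv` and the `subject` column name are made up for the example, and a predicate that repeats for the same subject simply overwrites the earlier value.

```python
import csv
import io


def triples_to_rows(triples):
    """Group (subject, predicate, object) triples into one dict per subject."""
    rows = {}
    for subject, predicate, obj in triples:
        # Create the row the first time a subject is seen, then set the column.
        rows.setdefault(subject, {})[predicate] = obj
    return rows


def write_csv(rows, out):
    """Write one CSV row per subject; the columns are the union of all predicates."""
    columns = sorted({predicate for row in rows.values() for predicate in row})
    # "subject" is a hypothetical column name; it would clash with a predicate
    # that happens to be called "subject".
    writer = csv.DictWriter(out, fieldnames=["subject"] + columns, restval="")
    writer.writeheader()
    for subject, row in rows.items():
        writer.writerow({"subject": subject, **row})


if __name__ == "__main__":
    # Made-up sample data, not taken from any real dump.
    sample = [("a", "name", "Alice"), ("a", "age", "30"), ("b", "name", "Bob")]
    buffer = io.StringIO()
    write_csv(triples_to_rows(sample), buffer)
    print(buffer.getvalue())
```

Subjects that lack a given predicate get an empty cell, which is what makes the result “wide”. The `csv` module takes care of quoting values that themselves contain commas.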

Note that no distinction between property and edge relations is made in the snippet above. The quick and dirty way to make this distinction is based on whether the object of the statement is a value or a URI. If it’s a URI, it’s an edge. If it’s a value, it’s a property.
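The quick and dirty rule above could look something like this in Python. It is a sketch that assumes the terms are strings in N-Triples-style syntax, where a URI is written in angle brackets (`<...>`) and anything else is treated as a value. The helper names `is_uri` and `split_statements` are made up for the example.

```python
def is_uri(term):
    """Assumes N-Triples-style terms: a URI is wrapped in angle brackets."""
    return term.startswith("<") and term.endswith(">")


def split_statements(triples):
    """Split triples into per-subject properties and a list of edges."""
    properties = {}  # subject -> {predicate: value}
    edges = []       # (subject, predicate, object) where the object is a URI
    for subject, predicate, obj in triples:
        if is_uri(obj):
            edges.append((subject, predicate, obj))
        else:
            # Anything that is not a URI is treated as a value here
            # (under this simple check that includes blank nodes).
            properties.setdefault(subject, {})[predicate] = obj
    return properties, edges


if __name__ == "__main__":
    # Made-up sample statements.
    sample = [
        ("<ex:a>", "<ex:name>", '"Alice"'),
        ("<ex:a>", "<ex:knows>", "<ex:b>"),
    ]
    props, edges = split_statements(sample)
    print(props)
    print(edges)
```

The properties would then become the columns of the vertex rows, while the edges would be written out separately.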

HTH, Marko.
