dkapoor edited this page May 19, 2014 · 15 revisions

Karma provides two APIs for generating RDF in batch mode. Batch mode is meant for bulk use and can handle very large datasets.

OfflineRdfGenerator

This is a command-line utility that loads a model and a source, and then generates RDF. The source can be a JSON, XML, or CSV file, or a database table. For a database source, the utility loads 10,000 rows at a time.

To generate RDF when the source is a file, go to the karma-offline subdirectory of Karma and run the following command:

mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="--sourcetype <sourcetype>
--filepath <filepath> --modelfilepath <modelfilepath> --sourcename <sourcename> --outputfile <outputfile>" -Dexec.classpathScope=compile

Example invocation for a JSON file:

mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="
--sourcetype JSON 
--filepath \"/files/data/wikipedia.json\" 
--modelfilepath \"/files/models/model-wikipedia.n3\" 
--sourcename wikipedia
--outputfile wikipedia-rdf.n3" -Dexec.classpathScope=compile
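
When many files need to be converted, it can help to assemble the same invocation programmatically instead of typing it by hand. The following is a minimal sketch, not part of Karma: the class OfflineFileCommand and its build method are hypothetical names, and the only things taken from this page are the main class, the flags, and the example values. It builds the argument list for the file-based command above and prints it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Karma): builds the Maven command shown above
// for a file source, as a list suitable for java.lang.ProcessBuilder.
final class OfflineFileCommand {

    static List<String> build(String sourceType, String filePath, String modelFilePath,
                              String sourceName, String outputFile) {
        // Paths are wrapped in literal double quotes, mirroring the \" escapes
        // used in the shell example.
        String execArgs = "--sourcetype " + sourceType
                + " --filepath \"" + filePath + "\""
                + " --modelfilepath \"" + modelFilePath + "\""
                + " --sourcename " + sourceName
                + " --outputfile " + outputFile;

        List<String> command = new ArrayList<>();
        command.add("mvn");
        command.add("exec:java");
        command.add("-Dexec.mainClass=edu.isi.karma.rdf.OfflineRdfGenerator");
        command.add("-Dexec.args=" + execArgs);
        command.add("-Dexec.classpathScope=compile");
        return command;
    }

    public static void main(String[] args) {
        List<String> command = build("JSON", "/files/data/wikipedia.json",
                "/files/models/model-wikipedia.n3", "wikipedia", "wikipedia-rdf.n3");
        // Only prints the command; nothing is executed here.
        System.out.println(String.join(" ", command));
    }
}
```

To actually run the command, the list could be handed to new ProcessBuilder(command) with its working directory set to the karma-offline subdirectory. That step is left out so the sketch runs without a Karma checkout.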

To generate RDF from a database table, go to the karma-offline subdirectory of Karma and run the following command from a terminal:

mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="--sourcetype DB
--modelfilepath <modelfilepath> --outputfile <outputfile> --dbtype <dbtype> --hostname <hostname> 
--username <username> --password <password> --portnumber <portnumber> --dbname <dbname> --tablename <tablename>" -Dexec.classpathScope=compile

Valid values for the dbtype argument are Oracle, MySQL, SQLServer, PostGIS, and Sybase.

Example invocation:

mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="
--sourcetype DB --dbtype SQLServer 
--hostname example.com --username root --password secret 
--portnumber 1433 --dbname Employees --tablename Person 
--modelfilepath \"/files/models/db-r2rml-model.ttl\"
--outputfile db-rdf.n3" -Dexec.classpathScope=compile
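
Because the database invocation takes many flags, a small helper that checks dbtype before launching Maven can catch typos early. The following is a minimal sketch, not part of Karma: the class OfflineDbArgs and its build method are hypothetical, and the exact, case-sensitive match against the dbtype values listed above is an assumption of this sketch, not documented Karma behavior.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Karma): assembles the value passed to
// -Dexec.args for a database source.
final class OfflineDbArgs {

    // The dbtype values listed on this page.
    static final List<String> DB_TYPES =
            Arrays.asList("Oracle", "MySQL", "SQLServer", "PostGIS", "Sybase");

    // options maps flag names (without the leading "--") to values, in output order.
    static String build(Map<String, String> options) {
        String dbType = options.get("dbtype");
        // Assumption: exact, case-sensitive match against the documented values.
        if (!DB_TYPES.contains(dbType)) {
            throw new IllegalArgumentException("Unsupported dbtype: " + dbType);
        }
        StringJoiner args = new StringJoiner(" ");
        args.add("--sourcetype DB");
        for (Map.Entry<String, String> option : options.entrySet()) {
            args.add("--" + option.getKey() + " " + option.getValue());
        }
        return args.toString();
    }

    public static void main(String[] argv) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("dbtype", "SQLServer");
        options.put("hostname", "example.com");
        options.put("username", "root");
        options.put("password", "secret");
        options.put("portnumber", "1433");
        options.put("dbname", "Employees");
        options.put("tablename", "Person");
        options.put("modelfilepath", "\"/files/models/db-r2rml-model.ttl\"");
        options.put("outputfile", "db-rdf.n3");
        System.out.println(build(options));
    }
}
```

A LinkedHashMap keeps the flags in insertion order, so the printed string matches the order in which the options were added.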

JSONRDFGenerator

This API is meant for repeated RDF generation from the same model. The models are loaded once up front, and each subsequent user query reuses the already-loaded model to generate RDF. The API currently accepts only JSON as an input source.

edu.isi.karma.rdf.JSONRDFGenerator

API to add a model to the RDF Generator

// modelIdentifier : Provides a name and location of the model file
void addModel(R2RMLMappingIdentifier modelIdentifier); 

API to generate RDF given a model name and JSON data

// sourceName    : The name given to the model when it was added with addModel
// jsonData      : The input JSON data
// addProvenance : Flag indicating whether provenance information should be added to the RDF
// pw            : Writer for the RDF output
void generateRDF(String sourceName, String jsonData, boolean addProvenance, PrintWriter pw);
   

Example use:

JSONRDFGenerator rdfGenerator = JSONRDFGenerator.getInstance();

// Construct an R2RMLMappingIdentifier that provides a name for the model and
// the location of the model file, then add the model to the JSONRDFGenerator.
// Multiple models can be added this way.
R2RMLMappingIdentifier modelIdentifier = new R2RMLMappingIdentifier(
				"people-model", new File("/files/models/people-model.ttl").toURI().toURL());
rdfGenerator.addModel(modelIdentifier);

String filename = "files/data/people.json";
String jsonData = EncodingDetector.getString(new File(filename),
					"utf-8");
StringWriter sw = new StringWriter();
PrintWriter pw = new PrintWriter(sw);
rdfGenerator.generateRDF("people-model", jsonData, true, pw);
String rdf = sw.toString();
System.out.println("Generated RDF: " + rdf);