Batch Mode
Karma provides two APIs to generate RDF in batch mode. Batch mode is meant for bulk use and can handle very large datasets.
The first API is a command-line utility that loads a model and a source and then generates RDF. The source can be a JSON, XML, or CSV file, or a database. For a database source, the API loads 10,000 rows at a time.
To generate RDF when the source is a file, go to the root directory of Karma and execute the following command:
mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="--sourcetype
<sourcetype> --filepath <filepath> --modelfilepath <modelfilepath> --outputfile <outputfile>" -Dexec.classpathScope=compile
Example invocation for a JSON file:
mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="
--sourcetype JSON
--filepath \"/Users/shubhamgupta/Documents/wikipedia.json\"
--modelfilepath \"/Users/shubhamgupta/Documents/model-wikipedia.n3\"
--outputfile wikipedia-rdf.n3" -Dexec.classpathScope=compile
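For bulk runs over many files, the same command has to be assembled once per input file. The following is a minimal Java sketch of that assembly, assuming a hypothetical helper class `OfflineRdfFileCommand` that is not part of Karma. It reproduces only the options shown above and quotes the two input paths the way the example invocation does.

```java
import java.util.List;

/** Hypothetical helper, not part of Karma: builds the mvn command shown above. */
class OfflineRdfFileCommand {

    /** Returns the text that goes inside -Dexec.args for a file source. */
    static String execArgs(String sourceType, String filePath,
                           String modelFilePath, String outputFile) {
        // The input paths are wrapped in double quotes, as in the example invocation.
        return "--sourcetype " + sourceType
                + " --filepath \"" + filePath + "\""
                + " --modelfilepath \"" + modelFilePath + "\""
                + " --outputfile " + outputFile;
    }

    /** Returns the full command as an argument list, one element per argument. */
    static List<String> command(String sourceType, String filePath,
                                String modelFilePath, String outputFile) {
        return List.of(
                "mvn", "exec:java",
                "-Dexec.mainClass=edu.isi.karma.rdf.OfflineRdfGenerator",
                "-Dexec.args=" + execArgs(sourceType, filePath, modelFilePath, outputFile),
                "-Dexec.classpathScope=compile");
    }

    public static void main(String[] args) {
        // Placeholder paths for illustration only.
        List<String> cmd = command("JSON", "/data/wikipedia.json",
                "/data/model-wikipedia.n3", "wikipedia-rdf.n3");
        System.out.println(String.join(" ", cmd));
    }
}
```

The returned list could be handed to `java.lang.ProcessBuilder` with its working directory set to the Karma root. Since no shell is involved in that case, the double quotes should reach the Maven exec plugin literally, which is what the escaped `\"` achieves in the shell example.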
To generate RDF from a database table, go to the top-level Karma directory and run the following command from a terminal:
mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="--sourcetype DB
--modelfilepath <modelfilepath> --outputfile <outputfile> --dbtype <dbtype> --hostname <hostname>
--username <username> --password <password> --portnumber <portnumber> --dbname <dbname> --tablename <tablename>" -Dexec.classpathScope=compile
Valid argument values for dbtype are Oracle, MySQL, SQLServer, PostGIS, and Sybase.
Example invocation:
mvn exec:java -Dexec.mainClass="edu.isi.karma.rdf.OfflineRdfGenerator" -Dexec.args="
--sourcetype DB --dbtype SQLServer
--hostname example.com --username root --password secret
--portnumber 1433 --dbname Employees --tablename Person
--modelfilepath \"/Users/shubhamgupta/Documents/db-r2rml-model.ttl\"
--outputfile db-rdf.n3" -Dexec.classpathScope=compile
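The database form takes more options, and a misspelled dbtype is easy to miss before Maven starts. The sketch below, built around a hypothetical `OfflineRdfDbArgs` class that is not part of Karma, assembles the `-Dexec.args` text in the same order as the example invocation and checks dbtype against the values listed above. The exact-match, case-sensitive check is an assumption; Karma itself may be more lenient.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper, not part of Karma: assembles the --sourcetype DB arguments. */
class OfflineRdfDbArgs {

    // The dbtype values this page lists as valid.
    static final List<String> VALID_DB_TYPES =
            List.of("Oracle", "MySQL", "SQLServer", "PostGIS", "Sybase");

    /** Returns the text that goes inside -Dexec.args for a database source. */
    static String execArgs(String dbType, String hostname, String username,
                           String password, int portNumber, String dbName,
                           String tableName, String modelFilePath, String outputFile) {
        // Assumption: dbtype must match one of the listed spellings exactly.
        if (!VALID_DB_TYPES.contains(dbType)) {
            throw new IllegalArgumentException(
                    "dbtype must be one of " + VALID_DB_TYPES + " but was " + dbType);
        }
        // LinkedHashMap keeps the options in the order of the example invocation.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("--sourcetype", "DB");
        options.put("--dbtype", dbType);
        options.put("--hostname", hostname);
        options.put("--username", username);
        options.put("--password", password);
        options.put("--portnumber", Integer.toString(portNumber));
        options.put("--dbname", dbName);
        options.put("--tablename", tableName);
        // The model path is quoted, as in the example invocation.
        options.put("--modelfilepath", "\"" + modelFilePath + "\"");
        options.put("--outputfile", outputFile);

        StringJoiner joined = new StringJoiner(" ");
        options.forEach((flag, value) -> joined.add(flag).add(value));
        return joined.toString();
    }

    public static void main(String[] args) {
        // Same values as the example invocation, with a placeholder model path.
        System.out.println(execArgs("SQLServer", "example.com", "root", "secret",
                1433, "Employees", "Person", "/data/db-r2rml-model.ttl", "db-rdf.n3"));
    }
}
```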
The second API is meant for repeated RDF generation from the same model. The models are loaded once up front, and each later request reuses the loaded model to generate RDF. This API currently accepts only JSON as an input source.
edu.isi.karma.rdf.JSONRDFGenerator
API to add a model to the RDF Generator
// modelIdentifier : Provides a name and location of the model file
void addModel(R2RMLMappingIdentifier modelIdentifier);
API to generate RDF given a model name and JSON data
// sourceName : The name given to the model when it was added with the addModel API
// jsonData : The input JSON data
// addProvenance : Flag indicating whether provenance information should be added to the RDF
// pw : Writer for the RDF output
void generateRDF(String sourceName, String jsonData, boolean addProvenance, PrintWriter pw);
Example use:
JSONRDFGenerator rdfGenerator = JSONRDFGenerator.getInstance();
// Construct an R2RMLMappingIdentifier that provides a name for the model and
// the location of the model file, then add the model to the JSONRDFGenerator.
// You can add multiple models using this API.
R2RMLMappingIdentifier modelIdentifier = new R2RMLMappingIdentifier(
"people-model", new File("/files/models/people-model.ttl").toURI().toURL());
rdfGenerator.addModel(modelIdentifier);
String filename = "files/data/people.json";
String jsonData = EncodingDetector.getString(new File(filename),
"utf-8");
StringWriter sw = new StringWriter();
PrintWriter pw = new PrintWriter(sw);
rdfGenerator.generateRDF("people-model", jsonData, true, pw);
String rdf = sw.toString();
System.out.println("Generated RDF: " + rdf);
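The point of this API is that the model is added once and then reused for every incoming document. The sketch below illustrates that loop in isolation. `RdfGenerator` is a hypothetical stand-in interface that only mirrors the `generateRDF` signature documented above, and the lambda in `main` is a fake that echoes its input instead of producing RDF, so the example runs without Karma on the classpath.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.List;

/** Illustration only: reuses one already-added model for many JSON documents. */
class RepeatedGenerationSketch {

    /** Hypothetical stand-in that mirrors the generateRDF signature documented above. */
    interface RdfGenerator {
        void generateRDF(String sourceName, String jsonData,
                         boolean addProvenance, PrintWriter pw);
    }

    /** Runs every JSON document through the same model and collects the output. */
    static String generateAll(RdfGenerator generator, String modelName,
                              List<String> jsonDocuments) {
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        for (String jsonData : jsonDocuments) {
            // The model is looked up by name on each call; it is not reloaded here.
            generator.generateRDF(modelName, jsonData, false, pw);
        }
        pw.flush();
        return sw.toString();
    }

    public static void main(String[] args) {
        // Fake generator for demonstration: writes a comment line per document.
        RdfGenerator fake = (sourceName, jsonData, addProvenance, pw) ->
                pw.print("# " + sourceName + " <- " + jsonData + "\n");
        System.out.print(generateAll(fake, "people-model",
                List.of("{\"name\":\"Ada\"}", "{\"name\":\"Alan\"}")));
    }
}
```

With Karma available, the fake would be replaced by a call to the `rdfGenerator` instance from the example above, after `addModel` has been called once. If the real method declares checked exceptions, the stand-in interface would need a matching `throws` clause.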