JSONSki_NodeJs

JSONSki_Nodejs is the Node.js (JavaScript) binding for JSONSki.

JSONSki is a streaming JSONPath processor with fast-forward functionality. While streaming, it can automatically fast-forward over JSON substructures that are irrelevant to the query evaluation, without parsing them in detail. To make fast-forwarding efficient, JSONSki implements its fast-forward APIs with a highly bit-parallel design that relies heavily on the bitwise and SIMD operations prevalent on modern CPUs.
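To make the bit-parallel idea more concrete, here is a minimal sketch of one well-known building block: marking which characters of a block lie inside string literals with a prefix XOR over a quote bitmap, then using that mask to find the closing brace of an object without tokenizing its contents. This is not JSONSki's implementation. It is an illustration under simplifying assumptions: a JavaScript BigInt stands in for a 64-bit machine word, the block is at most 64 characters, escaped quotes are ignored, and all function names are hypothetical. On real hardware the prefix XOR can be computed with a single carry-less multiplication by an all-ones word.

```javascript
const MASK64 = (1n << 64n) - 1n;

// Bitmap with bit i set when block[i] === ch (SIMD compares do this per chunk).
function bitmapOf(block, ch) {
  let bits = 0n;
  for (let i = 0; i < block.length; i++) {
    if (block[i] === ch) bits |= 1n << BigInt(i);
  }
  return bits;
}

// Prefix XOR: bit i of the result is the XOR of input bits 0..i.
function prefixXor(bits) {
  for (let shift = 1n; shift < 64n; shift <<= 1n) {
    bits = (bits ^ (bits << shift)) & MASK64;
  }
  return bits;
}

// Count set bits.
function popcount(bits) {
  let n = 0;
  while (bits !== 0n) {
    bits &= bits - 1n; // clear the lowest set bit
    n++;
  }
  return n;
}

// Return the index of the '}' that closes the object starting at block[0],
// or -1 if it is not inside this block.
function findObjectEnd(block) {
  if (block.length > 64 || block[0] !== '{') {
    throw new RangeError('expected a block of at most 64 characters starting with "{"');
  }
  // Bits set from each opening quote up to (not including) its closing quote.
  const inString = prefixXor(bitmapOf(block, '"'));
  const opens = bitmapOf(block, '{') & ~inString;
  const closes = bitmapOf(block, '}') & ~inString;

  let remaining = closes;
  let closedSoFar = 0;
  while (remaining !== 0n) {
    const lowest = remaining & -remaining; // isolate the next closing brace
    const below = lowest - 1n;             // mask of all earlier positions
    closedSoFar++;
    // The object ends where closing braces catch up with opening braces.
    if (popcount(opens & below) === closedSoFar) return popcount(below);
    remaining ^= lowest;
  }
  return -1;
}

console.log(findObjectEnd('{"a":{"b":"}"},"c":1}'));
```

The brace inside the string value `"}"` is masked out, so only structural braces are counted. A production implementation would also handle escaped quotes and strings that span block boundaries.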

NPM Package

The npm package is available at https://www.npmjs.com/package/jsonski

Installation

npm i jsonski

Quick Start

const JSki = require('jsonski')
console.log(JSki.JSONSkiParser("$.features[150].actor.login", "datasets/test.json"));
  • The package exposes the following method:
JSki.JSONSkiParser(query, filePath)    // query: JSONPath query (String); filePath: location of the JSON file (String)
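Because the method takes a query string and a file location, it can be useful to validate both before calling into the native binding. The sketch below shows one way to do that. `runQuery` is a hypothetical helper, not part of the jsonski package, and it accepts the parser as an argument so that it runs even where the package is not installed. The return value is passed through unchanged, since the README does not specify its type.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of jsonski): checks the arguments, then
// delegates to any parser function with the (query, filePath) signature.
function runQuery(parser, query, filePath) {
  if (typeof query !== 'string' || !query.startsWith('$')) {
    throw new TypeError('query must be a JSONPath string starting with "$"');
  }
  if (typeof filePath !== 'string' || !fs.existsSync(filePath)) {
    throw new Error(`JSON file not found: ${filePath}`);
  }
  return parser(query, filePath);
}

// Usage with the real package (requires `npm i jsonski`):
// const JSki = require('jsonski');
// console.log(runQuery(JSki.JSONSkiParser, '$.features[150].actor.login', 'datasets/test.json'));
```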

Requirements

Hardware requirements

  • CPUs: 64-bit ALU instructions, 256-bit SIMD instruction set, and the carry-less multiplication instruction (pclmulqdq)
  • Operating System: Linux, macOS (Intel chips only)
  • C++ Compiler: g++ (7.4.0 or higher)
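On Linux, the CPU requirements above can be checked before installing. The following sketch reads the `flags` line of `/proc/cpuinfo`. It assumes that "256-bit SIMD instruction set" means AVX2 (the README does not name the extension) and that the flag names are the ones Linux reports (`avx2`, `pclmulqdq`). `missingCpuFlags` is a hypothetical helper, not something shipped with the package.

```javascript
const fs = require('fs');

// Assumption: the 256-bit SIMD requirement is interpreted as AVX2.
const REQUIRED_FLAGS = ['avx2', 'pclmulqdq'];

// Return the required flags that do not appear in the given /proc/cpuinfo text.
function missingCpuFlags(cpuinfoText) {
  const line = cpuinfoText.split('\n').find((l) => l.startsWith('flags'));
  const listed = line ? (line.split(':')[1] || '').trim().split(/\s+/) : [];
  const flags = new Set(listed);
  return REQUIRED_FLAGS.filter((f) => !flags.has(f));
}

// Only meaningful on Linux; macOS does not provide /proc/cpuinfo.
if (process.platform === 'linux' && fs.existsSync('/proc/cpuinfo')) {
  const missing = missingCpuFlags(fs.readFileSync('/proc/cpuinfo', 'utf8'));
  console.log(missing.length
    ? `Missing CPU flags: ${missing.join(', ')}`
    : 'Required CPU flags are present');
}
```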

Software requirements

Before using Node-API, make sure the following prerequisites are in place:

Getting Started with Querying using JSONSki

JSONPath

JSONPath is a basic query language for JSON data. It refers to substructures of JSON data in much the same way that XPath queries do for XML data. For details of the JSONPath syntax, please refer to Stefan Goessner's article.

JSONSki Query Operators

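| Operator      | Description                             |
|---------------|-----------------------------------------|
| `$`           | root object                             |
| `.`           | child object                            |
| `[]`          | child array                             |
| `*`           | wildcard, all objects or array members  |
| `[index]`     | array index                             |
| `[start:end]` | array slice operator                    |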

Path Examples

Consider a geo-referenced tweet in JSON:

{
    "coordinates": [
        40.74118764, -73.9998279
    ],
    "user": {
        "id": 6253282
    },
    "place": {
        "name": "Manhattan",
        "bounding_box": {
            "type": "Polygon",
            "pos": [
                [-74.026675, 40.683935],
                ......
            ]
        }
    }
}
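| JSONPath                        | Result                                             |
|---------------------------------|----------------------------------------------------|
| `$.coordinates[*]`              | all coordinates                                    |
| `$.place.name`                  | place name                                         |
| `$.place.bounding_box.pos[0]`   | first position of the bounding box in place        |
| `$.place.bounding_box.pos[0:2]` | first two positions of the bounding box in place   |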
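To illustrate what these paths select, the sketch below evaluates the operator subset listed above against an already-parsed object. It is only a reference for the query semantics: it parses the whole document with plain JavaScript and does none of the streaming or fast-forwarding that JSONSki performs. The `parseSteps` and `evaluate` functions are hypothetical names, the slice end is treated as exclusive (matching "first two positions" for `[0:2]`), and the array-of-matches return shape is an assumption rather than JSONSki's output format. The `pos` array contains only the one entry given in the example, since the rest is elided there.

```javascript
// Split a query such as "$.place.bounding_box.pos[0:2]" into steps.
function parseSteps(query) {
  if (!query.startsWith('$')) throw new SyntaxError('query must start with "$"');
  const stepPattern = /\.(\*|[A-Za-z_]\w*)|\[(\*|\d+(?::\d+)?)\]/y; // sticky: match at lastIndex only
  const steps = [];
  let pos = 1;
  while (pos < query.length) {
    stepPattern.lastIndex = pos;
    const m = stepPattern.exec(query);
    if (!m) throw new SyntaxError(`unsupported syntax at offset ${pos}`);
    const token = m[1] !== undefined ? m[1] : m[2];
    if (token === '*') {
      steps.push({ type: 'wildcard' });
    } else if (m[1] !== undefined) {
      steps.push({ type: 'child', name: token });
    } else if (token.includes(':')) {
      const [start, end] = token.split(':').map(Number);
      steps.push({ type: 'slice', start, end });
    } else {
      steps.push({ type: 'index', index: Number(token) });
    }
    pos = stepPattern.lastIndex;
  }
  return steps;
}

// Return every value selected by the query as an array of matches.
function evaluate(doc, query) {
  let current = [doc];
  for (const step of parseSteps(query)) {
    const next = [];
    for (const node of current) {
      if (Array.isArray(node)) {
        if (step.type === 'wildcard') next.push(...node);
        else if (step.type === 'index' && step.index < node.length) next.push(node[step.index]);
        else if (step.type === 'slice') next.push(...node.slice(step.start, step.end));
      } else if (node !== null && typeof node === 'object') {
        if (step.type === 'wildcard') next.push(...Object.values(node));
        else if (step.type === 'child' && Object.hasOwn(node, step.name)) next.push(node[step.name]);
      }
    }
    current = next;
  }
  return current;
}

const tweet = {
  coordinates: [40.74118764, -73.9998279],
  user: { id: 6253282 },
  place: {
    name: 'Manhattan',
    // Only the first position from the example; the remaining entries are elided there.
    bounding_box: { type: 'Polygon', pos: [[-74.026675, 40.683935]] },
  },
};

console.log(evaluate(tweet, '$.coordinates[*]'));
console.log(evaluate(tweet, '$.place.name'));
console.log(evaluate(tweet, '$.place.bounding_box.pos[0]'));
```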

Performance Comparison with Javascript Parsing

Below is an example that times a JSONSki query against native JSON.parse on the same file.

const JSki = require('jsonski');
const fs = require('fs');

const file = 'dataset/twitter_sample_large_record.json';

// Time a JSONSki query evaluated directly against the file.
console.time('JSONSki runtime');
console.log(JSki.JSONSkiParser("$[*].entities.urls[*].url", file));
console.timeEnd('JSONSki runtime');

// Time JSON.parse on the same file contents (the file read is not timed).
const str = fs.readFileSync(file).toString();
console.time('JavaScript runtime');
const json = JSON.parse(str);
console.timeEnd('JavaScript runtime');
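  • Note: The code snippet above compares JSONSki_nodejs query evaluation with native JavaScript parsing. The JSONSki timing includes reading the file and printing the result, whereas the JavaScript timing covers JSON.parse only.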
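A single timed run can be noisy, so repeating the measurement and taking the median tends to give a steadier number. The sketch below shows one way to do that with `process.hrtime.bigint()`. `medianMs` is a hypothetical helper, and the workload is a generated stand-in string so that the block runs without the dataset or the jsonski package.

```javascript
// Hypothetical helper: run fn several times and return the median
// wall-clock time in milliseconds (upper middle value for even run counts).
function medianMs(fn, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint();
    fn();
    samples.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Stand-in workload (generated data, not the Twitter dataset).
const sample = JSON.stringify(
  Array.from({ length: 1000 }, (_, i) => ({ id: i, url: `https://example.com/${i}` }))
);
console.log('JSON.parse median (ms):', medianMs(() => JSON.parse(sample)));

// With the package and dataset available, the same helper could time a query:
// const JSki = require('jsonski');
// medianMs(() => JSki.JSONSkiParser('$[*].entities.urls[*].url',
//                                   'dataset/twitter_sample_large_record.json'));
```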

Publication

[1] Lin Jiang and Zhijia Zhao. JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding. In Proceedings of the Twenty-Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022.

@inproceedings{jsonski,
  title={JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding},
  author={Lin Jiang and Zhijia Zhao},
  booktitle={Proceedings of the Twenty-Seventh International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)},
  year={2022}
}

Performance

image

Benchmarking

The performance of JSONSki_nodejs is compared with simdjson_nodejs and native JavaScript parsing in a separate benchmarking repository: https://github.com/AutomataLab/NPM-JSON-Parser-Benchmarking