Replies: 3 comments 4 replies
-
Hey, thanks for interest in Is this a request about the Rust library or are we talking about the |
Beta Was this translation helpful? Give feedback.
-
I cannot share the JSON I have. However any big JSON would do. You can use one of the available open datasets. For example 340Mb JSON |
Beta Was this translation helpful? Give feedback.
-
I wrote a quick-and-dirty example of trying to run multiple path queries on the same file but using multithreading. I run this on the Steam JSON you provided and the following four queries:
The runtime on my machine is 77ms (2.02GB/s) on average, which is almost exactly the same as just running the last (heaviest) query, so parallelism is near-perfect. You can check sequential execution by changing the Unless you wish to run hundreds of queries at the same time simple parallalisation will most likely do the trick. Because of how query compilation works, trying to compile two queries into one engine would most likely be slower than just parallelizing them, since it would miss out on our SIMD skipping opportunities. That is at least until we figure out a way to capture filtering in our internal automaton, which could enable better optimizations. But there's a lot of theoretical work we need to do before we can implement that. Code```rust use clap::Parser; use rayon::prelude::*; use rsonpath::engine::error::EngineError; use rsonpath::{ engine::{Compiler, Engine, RsonpathEngine}, input::MmapInput, }; use std::sync::Arc; use std::time::Instant; use std::{error::Error, fs, process::ExitCode}; #[derive(Parser)] fn main() -> Result<ExitCode, Box> {
}
|
Beta Was this translation helpful? Give feedback.
-
Given an example JSON I would like to be able to take arbitrary structure and extract multiple paths to return results as a flat dictionary. It can be done outside the rsonpath by calling it multiple times, but it would be really inefficient. It would be way better if we would be able to issue API call and collect the results in one go without traversing the data multiple times.
This is an example of the result
The specification of the query could look like (I use JSON for demonstration purposes)
Beta Was this translation helpful? Give feedback.
All reactions