Example doesnt work #465

Closed
vaguue opened this issue Feb 27, 2024 · 18 comments


@vaguue

vaguue commented Feb 27, 2024

I'm following the steps from README.md and getting this error

file:///Users/seva/seva/node_modules/parquet-wasm/esm/parquet_wasm.js:3695
            wasm.__wbindgen_add_to_stack_pointer(16);
                 ^

TypeError: Cannot read properties of undefined (reading '__wbindgen_add_to_stack_pointer')
    at Table.fromIPCStream (file:///Users/seva/seva/node_modules/parquet-wasm/esm/parquet_wasm.js:3695:18)
    at file:///Users/seva/seva/boosters/check.js:23:33
    at ModuleJob.run (node:internal/modules/esm/module_job:218:25)
    at async ModuleLoader.import (node:internal/modules/esm/loader:329:24)
    at async loadESM (node:internal/process/esm_loader:28:7)
    at async handleMainPromise (node:internal/modules/run_main:120:12)
@fspoettel
Contributor

fspoettel commented Feb 28, 2024

seems related to #412

@kylebarron
Owner

Can you say what you tried? Usually an error like

> Cannot read properties of undefined (reading '__wbindgen_add_to_stack_pointer')

means that you didn't initialize the Wasm bundle. If you're using the esm endpoint, you need to await the default export, otherwise the Wasm bundle will never get initialized.
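
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not parquet-wasm's actual glue code: `WASM_BYTES`, `init`, and `add` are hypothetical stand-ins, assuming the generated ESM glue keeps a module-level handle that stays undefined until the default export has been awaited.

```javascript
// Bytes of a tiny hand-assembled Wasm module exporting add(a, b) -> a + b.
// It stands in for the real Wasm binary shipped with the package.
const WASM_BYTES = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // one function using that type
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // function body
]);

// Module-level handle to the instantiated exports; undefined until init() runs.
let wasm;

// Hypothetical stand-in for the default export of the ESM glue module.
async function init() {
  if (wasm === undefined) {
    const { instance } = await WebAssembly.instantiate(WASM_BYTES);
    wasm = instance.exports;
  }
  return wasm;
}

// Hypothetical stand-in for a named export such as readParquet.
function add(a, b) {
  return wasm.add(a, b);
}

// Calling a named export before init() has resolved throws a TypeError,
// because `wasm` is still undefined at that point.
let earlyError;
try {
  add(1, 2);
} catch (err) {
  earlyError = err;
}
console.log(earlyError.name); // TypeError

// Once init() has resolved, the same call succeeds.
const demo = init().then(() => add(1, 2));
demo.then((sum) => console.log(sum)); // 3
```

In an ES module the last two lines would usually be written with top-level await (`await init();` followed by the call); the promise chain is used here only so the sketch also runs as a plain script.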

@vaguue
Author

vaguue commented Mar 5, 2024

Tried to run it with node v21.5.0 in esm mode (with "type": "module")

@kylebarron
Owner

In esm mode, you always have to await the default export, or you'll get errors like above where the wasm wasn't instantiated

@fspoettel
Contributor

@kylebarron Would you accept a PR that updates the documentation? I also ran into this exact issue when integrating parquet-wasm into an ESM web worker (next.js). I think this would be very helpful given that e.g. also vite defaults to esm modules with v5.

@vaguue
Author

vaguue commented Mar 5, 2024

Isn't esm async out-of-box? I always thought the whole point of esm was the ability to export something asynchronously, yet I still have to do await somethingImported? Kinda counterintuitive.

@kylebarron
Owner

> Would you accept a PR that updates the documentation?

Yes of course! PRs always welcome

> Isn't esm async out-of-box?

It is but wasm initialization is a separate async step from just loading the code itself.

Ideally we can fix #414 and then publish an 0.6 release sometime soon, but I haven't had time to test that.

@vaguue
Author

vaguue commented Mar 6, 2024

Well, I can't wait for this to happen. As of now I have had to use Apache Arrow + node-addon-api, so it would be nice to have a stable API for working with Parquet files. What are we going to do with this issue?

@kylebarron
Owner

In the documentation, it says

> Note that when using the esm bundles, the default export must be awaited. See here for an example.

It's not clear to me what your issue is. You need to await the default export and then it'll work.

@vaguue
Author

vaguue commented Mar 6, 2024

well, can we just create an export in which we await this default export and reexport the actual module?

@vaguue
Author

vaguue commented Mar 6, 2024

So we can just import the module and be ready to go. Because generally this await init(); thing is kinda dubious for me.

@kylebarron
Owner

> well, can we just create an export in which we await this default export and reexport the actual module?

No, as far as I can tell that's not possible. And even if it were, I'd have to somehow modify the default JS binding that wasm-bindgen emits, which sounds horrible.

> thing is kinda dubious for me

How is this dubious?

import initWasm, {readParquet} from 'parquet-wasm/esm/arrow1.js';
await initWasm();
readParquet(...);

FWIW sql.js has the same behavior, which they call initSqlJs, so I'm not alone.

A PR is welcome to improve the docs! But otherwise I'm going to close this because it's expected behavior.

@kylebarron closed this as not planned on Mar 6, 2024
@vaguue
Author

vaguue commented Mar 6, 2024

what about doing like
myexport.js:

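import initWasm from 'parquet-wasm/esm/arrow1.js';
// Top-level await: importers of myexport.js wait until the wasm is initialized
await initWasm();
// "export * from" takes a module specifier, not a namespace object
export * from 'parquet-wasm/esm/arrow1.js';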

@vaguue
Author

vaguue commented Mar 6, 2024

what I mean is why not just create a wrapper around the default wasm-bindgen intricacies to make the usage more simple :)
I don't know how wasm-bindgen guys see things, but in my opinion that's kinda against the ESM nature at all.
Not that I see a possible case when someone imports the module but doesn't await for this init thing.

@kylebarron
Owner

The wasm bundle is not fetched until the initWasm call. Therefore, separating it gives a lot more power to users. For example, you might only rarely fetch Parquet files from your app, and therefore wish to defer loading the wasm until the end user needs the functionality.

Additionally, you can pass a URL into initWasm to fetch the wasm from your own server, which can be necessary in some situations.
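
The deferred-loading idea above can be sketched as follows. This is only an illustration under stated assumptions: `initWasm` here is a hypothetical stub standing in for the real async default export, `ensureWasm` and `loadParquetFeature` are invented helper names, and the `/assets/...` path is a made-up example of a self-hosted location.

```javascript
// Hypothetical stub for the async default export (initWasm in this thread).
// It records its argument so the sketch can show when, and how often, it runs.
const initCalls = [];
async function initWasm(source) {
  initCalls.push(source);
}

// Memoize the in-flight promise so concurrent callers share one initialization.
let ready;
function ensureWasm(source) {
  if (ready === undefined) {
    ready = initWasm(source);
  }
  return ready;
}

// Hypothetical feature entry point: the Wasm cost is paid only on first use.
async function loadParquetFeature() {
  await ensureWasm("/assets/parquet_wasm_bg.wasm"); // assumed self-hosted path
  return "ready";
}

// Nothing has been initialized just because this code was loaded.
const callsBeforeUse = initCalls.length;
console.log(callsBeforeUse); // 0

// Two concurrent first uses still trigger a single initialization.
const firstUse = Promise.all([loadParquetFeature(), loadParquetFeature()]);
firstUse.then(() => console.log(initCalls.length)); // 1
```

Keeping initialization behind a helper like this leaves the choice of when and from where to load the binary with the application, which is the flexibility described above.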

@vaguue
Author

vaguue commented Mar 7, 2024

Correct me if I'm wrong, but in this case one can just import the whole module asynchronously, i.e. await import(...). So you have this "power" even without the init step. But this step overcomplicates Node.js usage.

@vaguue
Author

vaguue commented Mar 14, 2024

import * as arrow from "apache-arrow";
import init, * as parquet from "parquet-wasm";

await init();

// Create Arrow Table in JS
const LENGTH = 2000;
const rainAmounts = Float32Array.from({ length: LENGTH }, () =>
  Number((Math.random() * 20).toFixed(1))
);

const rainDates = Array.from(
  { length: LENGTH },
  (_, i) => new Date(Date.now() - 1000 * 60 * 60 * 24 * i)
);

const rainfall = arrow.tableFromArrays({
  precipitation: rainAmounts,
  date: rainDates,
});

// Write Arrow Table to Parquet

// wasmTable is an Arrow table in WebAssembly memory
const wasmTable = parquet.Table.fromIPCStream(arrow.tableToIPC(rainfall, "stream"));
const writerProperties = new parquet.WriterPropertiesBuilder()
  .setCompression(parquet.Compression.ZSTD)
  .build();
const parquetUint8Array = parquet.writeParquet(wasmTable, writerProperties);

// Read Parquet buffer back to Arrow Table
// arrowWasmTable is an Arrow table in WebAssembly memory
const arrowWasmTable = parquet.readParquet(parquetUint8Array);

// table is now an Arrow table in JS memory
const table = arrow.tableFromIPC(arrowWasmTable.intoIPCStream());
console.log(table.schema.toString());
// Schema<{ 0: precipitation: Float32, 1: date: Date64<MILLISECOND> }>

node:internal/deps/undici/undici:12442
    Error.captureStackTrace(err, this);
          ^

TypeError: fetch failed
    at node:internal/deps/undici/undici:12442:11
    at async __wbg_init (file:///Users/seva/seva/node_modules/parquet-wasm/esm/parquet_wasm.js:5238:51)
    at async file:///Users/seva/seva/boosters/check.js:4:1 {
  cause: Error: not implemented... yet...
      at makeNetworkError (node:internal/deps/undici/undici:5675:35)
      at schemeFetch (node:internal/deps/undici/undici:10563:34)
      at node:internal/deps/undici/undici:10440:26
      at mainFetch (node:internal/deps/undici/undici:10459:11)
      at fetching (node:internal/deps/undici/undici:10407:7)
      at fetch (node:internal/deps/undici/undici:10271:20)
      at Object.fetch (node:internal/deps/undici/undici:12441:10)
      at fetch (node:internal/process/pre_execution:336:27)
      at __wbg_init (file:///Users/seva/seva/node_modules/parquet-wasm/esm/parquet_wasm.js:5233:17)
      at file:///Users/seva/seva/boosters/check.js:4:7
}

Node.js v21.5.0

This is just terrible

@kylebarron
Owner

If you're in node, use the node export
