About the final result #105

Jdemon233 · 2024-02-22T04:34:47Z

Hi, I have finished the whole pipeline, but the final result is only a few .parquet files, and I can't find where my processed text data is.

mauriceweber · 2024-02-22T09:45:18Z

Hi @Jdemon233 ,

Can you provide some more context on the steps you have run and the params you used? The parquet files are produced by the deduplication scripts, which take as input the zipped jsonl files that contain the text documents.
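As a rough illustration of the point above (the text lives in the zipped jsonl files, not in the parquet output), here is a minimal sketch for peeking at the documents in one such file. It uses only the Python standard library. The helper names `iter_documents` and `preview_texts` are hypothetical, and the text field name `raw_content` is an assumption that may not match your files, so it is exposed as a parameter.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def iter_documents(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON record per non-empty line of a gzipped jsonl file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines instead of failing on them
                yield json.loads(line)


def preview_texts(path: Path, text_field: str = "raw_content", limit: int = 3) -> list[str]:
    """Return the text of the first `limit` documents.

    `text_field` is an assumed field name; records without it yield "".
    """
    texts = []
    for record in iter_documents(path):
        texts.append(record.get(text_field, ""))
        if len(texts) >= limit:
            break
    return texts
```

Calling `preview_texts(Path("some_file.json.gz"))` on one of the pipeline's input files (the path here is a placeholder) should show whether the text is where you expect it; if every entry comes back empty, the field name is probably different in your data.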

Jdemon233 · 2024-02-23T10:38:24Z

Thanks, I resolved my problem.
