So if you're reading a single CSV and then performing a `partitionBy`, the first part of the job will be a single partition.
You can either repartition using the hints in the link, or create a multi-config job that first repartitions the CSV to Parquet (using the repartition flag in the Parquet output) and then reads that Parquet and runs your business logic on top of it.
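For the first option, here is a rough PySpark sketch of what using the `REPARTITION` hint from SPARK-24940 before the partitioned write could look like. It assumes Spark 2.4 or later (where that hint is available). The function names, the `raw_csv` view name, the header option, and the default of 8 partitions are all made up for illustration, so tune them for your data.

```python
def repartition_hint_query(view: str, num_partitions: int) -> str:
    """Build a SQL query that applies the REPARTITION hint to a temp view."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be positive")
    return f"SELECT /*+ REPARTITION({num_partitions}) */ * FROM {view}"


def csv_to_partitioned_parquet(csv_path, out_path, partition_col, num_partitions=8):
    """Read a CSV, spread its rows over several tasks, then write Parquet."""
    # pyspark is imported here so the helper above works without it installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

    # Assumes the CSV has a header row; drop the option if yours does not.
    df = spark.read.option("header", "true").csv(csv_path)
    df.createOrReplaceTempView("raw_csv")  # hypothetical view name

    # Shuffle the rows into num_partitions tasks before the write, so the
    # partitionBy output is not produced by a single task.
    spread = spark.sql(repartition_hint_query("raw_csv", num_partitions))

    spread.write.mode("overwrite").partitionBy(partition_col).parquet(out_path)
```

Calling `csv_to_partitioned_parquet("input.csv", "out/", "date")` would run the whole flow (paths and column name are placeholders). The extra shuffle has a cost of its own, so whether it helps depends on how skewed the partition column is and how large the input is.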
I read a CSV input and then write Parquet with `partitionBy`, and it takes a long time. Are there any settings you recommend? Maybe something like https://issues.apache.org/jira/browse/SPARK-24940 ?