When converting DISPATCHLOAD files to parquet, the process often dies due to running out of memory. This can result in an empty, 0-byte parquet file, or other partially-written files. Even if the user reruns nemosis, the library sees that a file exists but doesn't check that it's valid, so the error isn't picked up until the file is read later.
This can be prevented by wrapping writes like:
import os

try:
    df.write_parquet(path)
except Exception:
    # Don't leave a partially-written file behind
    if os.path.exists(path):
        os.remove(path)
    raise
(The same applies to CSV, Feather, etc.)
This can also happen if the process is terminated for some other reason (e.g. the user clicks Stop in Jupyter).
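A more robust variant that also survives abrupt termination is to write to a temporary file and rename it into place; `os.replace` is atomic on POSIX, so a process killed mid-write leaves only the temp file behind, never a corrupt file at the final path. A minimal sketch, where `atomic_write_parquet` is a hypothetical helper and `df.write_parquet` stands in for whatever writer is in use:

```python
import os
import tempfile

def atomic_write_parquet(df, path):
    """Write to a temp file in the same directory, then rename into place.

    os.replace is atomic on POSIX, so a crash or kill mid-write leaves
    only the temp file behind, never a corrupt file at `path`.
    """
    fd, tmp_path = tempfile.mkstemp(
        dir=os.path.dirname(path) or ".", suffix=".parquet.tmp"
    )
    os.close(fd)
    try:
        df.write_parquet(tmp_path)
        os.replace(tmp_path, path)  # atomic rename into place
    finally:
        # Clean up the temp file if the write raised before the rename
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

The temp file must live in the same directory as the target, since `os.replace` can't rename atomically across filesystems.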
Hmm, actually for an out-of-memory error specifically, the process may be killed before Python can raise an exception gracefully. But I still think this is worth adding, because some halting conditions would result in corrupt files being cleaned up.
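Since an OOM kill can't be caught at write time, a complementary option is for the cache check to verify the file rather than just testing its existence. Well-formed parquet files start and end with the 4-byte magic `PAR1`, which catches 0-byte and truncated files cheaply without a parquet reader. A stdlib-only sketch (the function name is hypothetical, not nemosis API):

```python
import os

PARQUET_MAGIC = b"PAR1"

def looks_like_valid_parquet(path):
    """Cheap sanity check: a well-formed parquet file starts and ends
    with the 4-byte magic 'PAR1'. Catches 0-byte and truncated files,
    though not all forms of corruption."""
    try:
        if os.path.getsize(path) < 8:
            # Too small to hold both leading and trailing magics
            return False
        with open(path, "rb") as f:
            head = f.read(4)
            f.seek(-4, os.SEEK_END)
            tail = f.read(4)
        return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
    except OSError:
        # Missing or unreadable file counts as invalid
        return False
```

A check like this could run before trusting a cached file, falling back to regenerating it when the check fails.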