Description
Use case
Sometimes a model is trained iteratively on multiple workers (e.g., using multiprocessing). In that case, there is one tfevent_* file per worker. When the model is trained on worker_1, the events are persisted in the tfevent_1 file. Training can then move to a different worker (worker_2), and the events are persisted in the tfevent_2 file. In later iterations, the model can be trained on worker_1 again, which means that the tfevent_1 file is written to again.
The issue is that we end up with tfevent files that contain events in non-chronological order:
tfevent_1 file:
- event 1
- event 3
tfevent_2 file:
- event 2
And rustboard fails to read them properly.
This is not a bug since this is a documented behaviour:
Feature request
Allow rustboard to parse data in partially-chronological order.
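
To make the request concrete, here is a minimal Python sketch of what reading in "partially-chronological order" could mean. It assumes each file is ordered internally and only the interleaving across files is out of order, as in the example above. It is not how rustboard is implemented. The Event type, its order field, and merge_event_streams are hypothetical names used only for illustration, and real event files would have to be decoded first.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical stand-in for a decoded event record."""
    order: int   # assumed ordering key (e.g. a step or a timestamp)
    source: str  # name of the file the event came from


def merge_event_streams(streams: Iterable[Iterable[Event]]) -> Iterator[Event]:
    """Lazily merge several streams into one stream sorted by `order`.

    Assumes every input stream is already sorted on its own; only the
    interleaving across streams is out of order.
    """
    return heapq.merge(*streams, key=lambda event: event.order)


# The two files from the example above.
tfevent_1 = [Event(1, "tfevent_1"), Event(3, "tfevent_1")]
tfevent_2 = [Event(2, "tfevent_2")]

merged = list(merge_event_streams([tfevent_1, tfevent_2]))
print([event.order for event in merged])  # [1, 2, 3]
```

A k-way merge like this only works when all files of a run are read together, rather than one file after another, so it illustrates the desired result more than a drop-in design.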