validate-parlamint speedup #846
Comments
@TomazErjavec do we insist on this order? |
I actually don't think jing is the bottleneck; rather, it is the XSLT validation that is slow. Also, validate-parlamint.pl takes files one by one, so it would be difficult to just run jing in parallel. In short, I don't think it's worth trying to give jing multiple files. |
I have tried it, and |
ok, but I still think it is not worth it given the other problems with this approach. This might save 10% processing time, if that.
Huh? |
Ok, I have staged my changes. Another place for a speedup is the link-checker: Transform |
Yes, I think very small - the complete teiHeader (with everything XIncluded) fits into the memory of any computer powerful enough to process the corpus. |
I have been exploring why the validation is so slow.
jing
jing can validate multiple files against the same schema in parallel. These are the timings on a 64-thread CPU, in seconds:
We can speed up jing 5 times, but the output will no longer be ordered file by file.
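One possible way to keep the file-by-file output while still validating in parallel is to buffer each file's output and print it in input order. The following is only a minimal sketch, not how validate-parlamint.pl works: the helper names `run_jing` and `validate_in_order` are hypothetical, and it assumes a `jing` executable on the PATH that is called as `jing SCHEMA FILE` with a RELAX NG XML schema.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_jing(schema, xml_file):
    """Validate one file with jing and return its combined output as text."""
    # Assumption: `jing` is on PATH and accepts `jing SCHEMA FILE`.
    proc = subprocess.run(
        ["jing", schema, xml_file], capture_output=True, text=True
    )
    return proc.stdout + proc.stderr


def validate_in_order(files, validate, workers=8):
    """Run validate(file) on a thread pool; yield (file, output) in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order of `files`,
        # regardless of which validation finishes first.
        for xml_file, output in zip(files, pool.map(validate, files)):
            yield xml_file, output
```

It could be called as `validate_in_order(files, lambda f: run_jing("schema.rng", f), workers=64)`, where `schema.rng` is a placeholder. Threads are enough here because the actual work happens in the external jing processes. The trade-off is that output for a file is only printed once all earlier files have finished.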