Abstract fread/write_big into f_big #398

zlatanvasovic · 2020-06-18T15:47:46Z

No description provided.

weshinsley · 2020-06-19T15:59:05Z

Well... I see the point - reducing line numbers and sharing the code...

But conceptually, I'm not quite sure whether we want to merge reading and writing like this. Taking a step back, I can imagine there being ways of making some file reading faster that might not apply to file writing. (See Issue #223 which didn't quite make it into workable code, but shows the kind of read-specific sort of work I'm thinking of).

If we go with it, then I'd prefer the parameter to f_big to be a const containing READ | WRITE (rather than an arbitrary integer 0 or 1). But I'd like more thoughts from others about whether this is the right sort of refactor on this code, or whether a different sort of rewrite is really needed, which would be (slightly) harder to unpick if we merged them. I suspect there's a better more modern answer for both the reads and writes currently being done.

zlatanvasovic · 2020-06-19T16:15:47Z

@weshinsley Even if some change is to be implemented specifically in read or write, most of the code still can be easily abstracted.
It can be done much more idiomatically, but C++ lacks many metaprogramming features, such as code evaluation from a string. About READ | WRITE, any form of parameter you prefer will work. I always allow edits by maintainers, so you can edit it before merging.

DavidVernest · 2020-06-19T16:23:11Z

We are merging code into a live system - having the maintainers modify the logic is not good practice. It is the developer's responsibility to deliver a cohesive solution before asking for the merge.

zlatanvasovic · 2020-06-19T18:14:24Z

@mstevens-uk I've already implemented the cohesive solution which has all the required logic, as you could have noticed. It's also far more cohesive than the previous solution.

However, what I cannot do is try to change the preference of maintainers for specific variable names or forms. For me having a variable with possible values of 0 and 1 or "READ" and "WRITE" is essentially the same. And as @weshinsley has said, it's just their own preference, not a must.

matt-gretton-dann · 2020-06-20T13:26:58Z

The code as it stands needs reworking to cope with errors from fwrite() & fread() - including those that may not be fatal and just indicate a retry. As it is I think if the disk gets full this code will go into an infinite loop when writing to disk.

My question is why can't we just use plain fread/fwrite what do these functions gain us? (Is it simply some Operating Systems perform better if you make the elements as large as possible compared to the number you are outputting?).

So I don't like this PR. But the functions do need reworking - if only to have better error handling.

(And on the 0 or 1 or enum question: I'd prefer a lambda which did the read or write).

weshinsley · 2020-06-20T21:24:04Z

Yep, so our original code looks like it is forcing a chunk size of 2Gb for reads/writes, so I too wonder if this overides some defaults (either OS/compiler/some combination) for the chunk size that fread/fwrite use. I can imagine Neil testing this over infiniband with some much larger files than we are used to - full US, or perhaps all of Africa, which we have run at times.

So this PR is just factoring out the code that's common to both the fread_big and fwrite_big functions, since they only differ in two places - f[read] or f[write]. But, if/when error handling is added, will that motivate us back to wanting separate read/write functions again? Presumably there are different potential errors for fread vs fwrite...?

matt-gretton-dann · 2020-06-22T09:03:30Z

My ideal would be to replace the f*_big functions with xf* functions which wrap the f* functions handling errors and smoothing over platform issues (similar to the x*alloc I've suggested elsewhere: #406).

On UNIX platforms the errors in particular we should be handling are EAGAIN & EINTR which are indicators that the call got interrupted - not that things went wrong. On Windows whilst the above are valid codes I don't think they are used - instead we should consider an invalid parameter handler.

Abstract fread/write_big into f_big

3b4fec5

zlatanvasovic requested review from dlaydon, gnedjati, igfoo, matt-gretton-dann, NeilFerguson and Z80coder as code owners June 18, 2020 15:47

zlatanvasovic closed this Jul 17, 2020

zlatanvasovic deleted the freadwrite branch July 17, 2020 20:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Abstract fread/write_big into f_big #398

Abstract fread/write_big into f_big #398

zlatanvasovic commented Jun 18, 2020

weshinsley commented Jun 19, 2020

zlatanvasovic commented Jun 19, 2020

DavidVernest commented Jun 19, 2020

zlatanvasovic commented Jun 19, 2020

matt-gretton-dann commented Jun 20, 2020

weshinsley commented Jun 20, 2020

matt-gretton-dann commented Jun 22, 2020

Abstract fread/write_big into f_big #398

Abstract fread/write_big into f_big #398

Conversation

zlatanvasovic commented Jun 18, 2020

weshinsley commented Jun 19, 2020

zlatanvasovic commented Jun 19, 2020

DavidVernest commented Jun 19, 2020

zlatanvasovic commented Jun 19, 2020

matt-gretton-dann commented Jun 20, 2020

weshinsley commented Jun 20, 2020

matt-gretton-dann commented Jun 22, 2020