stdlib_experimental_io(open): support for unformatted sequential files #86
Possibilities for:
Now supported by
What about the |
Couldn't we cover all reading and writing with only stream access? Do we need sequential and direct access? I thought they are edge cases covered under the more general stream access. |
Personally I don't use "sequential + unformatted" and "direct + unformatted". But they seem to be used by some people: https://github.com/fortran-lang/stdlib/wiki/Usage-of-%22open%22 So I think it would be good to at least support "sequential+unformatted" (this PR). With these 3 options ( Note: a sequential unformatted file can be read as a stream unformatted file if the specifics of a sequential unformatted file (its record structure) are taken into account when it is read. Not sure about a direct unformatted file. |
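To make the note about reading a sequential unformatted file as a stream concrete, here is a minimal sketch. It assumes the layout gfortran uses by default, where each record is wrapped in a 4-byte length marker before and after the payload. The standard does not specify this layout, so other compilers or options may differ. The file name `seq_demo.bin` and the program name are made up for illustration.

```fortran
program seq_as_stream
    use, intrinsic :: iso_fortran_env, only: int32
    implicit none
    integer :: u
    integer(int32) :: a(3) = [1, 2, 3]
    integer(int32) :: b(3) = 0
    ! ASSUMPTION: 4-byte record markers (compiler-dependent, not standardized)
    integer(int32) :: head, tail

    ! Write a single record using sequential unformatted access.
    open(newunit=u, file='seq_demo.bin', status='replace', action='write', &
         access='sequential', form='unformatted')
    write(u) a
    close(u)

    ! Re-open the same file as an unformatted stream and read the
    ! leading marker, the payload, and the trailing marker by hand.
    open(newunit=u, file='seq_demo.bin', status='old', action='read', &
         access='stream', form='unformatted')
    read(u) head, b, tail
    close(u, status='delete')

    print *, 'record markers (payload size in bytes):', head, tail
    print *, 'data recovered:', all(a == b)
end program seq_as_stream
```

If the assumption holds, both markers should report the payload size in bytes and the data should compare equal. If it does not hold, the stream read would pick up misaligned bytes, which is the kind of detail a reader of such files has to know in advance.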
I use unformatted sometimes --- the advantage is that it makes it possible to quickly save large arrays from a simulation, which can later be post-processed by another Fortran code (compiled with the same compiler, of course). The stream might be similarly fast (I don't know if it's as fast as unformatted on all platforms). I agree we should support text ( I personally would not designate both |
I think it looks great. Thanks!
+1 to merge this PR.
There exists a non-standard |
Let's keep both for now, as we are in experimental. Let's get some experience using it and we can revisit later.
…On Sun, Jan 5, 2020, at 1:15 PM, Jeremie Vandenplas wrote:
>
> I personally would not designate both `b` and `s` for binary stream. I would only use `b`, as in Python. @jvdp1 <https://github.com/jvdp1> in your opinion, what is the advantage of allowing two characters `s` and `b` to do exactly the same?
There exists a non-standard `form=binary`. So, using `b` for `stream` may
be confusing. Mentioning both may clarify that `b` is used for
unformatted stream files.
If people disagree with that, I can remove the `s`. Or we can keep it,
and not advertise it. I will not be difficult about that.
|
@milancurcic Is it fine to keep (and merge) it as it is now implemented? |
I think this API is problematic. Will write in more detail tonight. |
Here's the problem in my view: This PR mixes up form (text/formatted or binary/unformatted) and access (
form is important for the user and should be part of the API. When you want text, you use
Read/write/readwrite is also important for the user and should be part of the API. When you want to read- or write-only, you use
access however merely specifies how you're reading or writing under the hood. Part of the reason why Fortran's I/O is so complicated is because the user has to choose access also. And the access that you choose changes how
Unformatted == binary. We don't need
My main point being, sequential or direct access modes, while useful, are specific ways of reading and writing that can more generally be done by
I understand and agree that these are useful and there are projects using them. However, I don't think anybody's gonna take somebody else's binary files written in sequential mode and try to read them using stdlib (and if they do, they can read them as long as they know what the records mean). We should aim to design a clean API with one recommended way of reading and writing.
I suggest that we take this to the drawing board in #14 and sketch out the API that we want, and beyond just |
Designing the API for |
@milancurcic if I understood what you wrote, you are proposing to keep the
To be honest, the various combinations are so complicated that I only use the
As long as we naturally capture 95% of all use cases, then I think that's good enough. However, when you use
In other words, this Python-like API does not directly map to the "form" / "access" fields in Fortran. Rather, the idea that I had was to pick such combinations of "form", "access" and other parameters, so that the result is pretty much what you would expect when coming from Python. So there would always be combinations of |
Let's discuss a particular example. Using the current master:
character(:), allocatable :: filename
integer :: u, a(3)
! Test mode "w"
u = open(filename, "w")
write(u, *) 1, 2, 3
close(u)
! Test mode "r"
u = open(filename, "r")
read(u, *) a
call assert(all(a == [1, 2, 3]))
close(u)
the second If you instead opened in formatted and stream, what would have to change in the above code to read the array "a" properly? What exactly is the difference between formatted/sequential and formatted/stream? |
@certik Exactly! In the API, expose what maps to Python's API, which is what we already have. Regarding the internal implementation, I suggest that we always open with |
I don't know the answer and I'll need to play with it. |
For text ( Sequential version:
integer :: u
integer :: a(3) = [1, 2, 3]
integer :: b(3) = 0
open(newunit=u, file='somefile.txt', status='unknown', &
action='write', access='sequential', form='formatted')
write(u, *) a
close(u)
open(newunit=u, file='somefile.txt', status='old', &
action='read', access='sequential', form='formatted')
read(u, *) b
close(u)
print *, all(a == b)
end
Stream version:
integer :: u
integer :: a(3) = [1, 2, 3]
integer :: b(3) = 0
open(newunit=u, file='somefile.txt', status='unknown', &
action='write', access='stream', form='formatted')
write(u, *) a
close(u)
open(newunit=u, file='somefile.txt', status='old', &
action='read', access='stream', form='formatted')
read(u, *) b
close(u)
print *, all(a == b)
end
|
I tried this patch:
diff --git a/src/stdlib_experimental_io.f90 b/src/stdlib_experimental_io.f90
index f6e4a50..b3a115c 100644
--- a/src/stdlib_experimental_io.f90
+++ b/src/stdlib_experimental_io.f90
@@ -332,7 +332,7 @@ end select
select case (mode_(3:3))
case('t')
- access_='sequential'
+ access_='stream'
form_='formatted'
case('b', 's')
access_='stream'
and I can't see any difference... Tests still pass, etc. So maybe we can just use Then for codes that use other "access", such as sequential, they will continue using the built-in |
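As a rough sketch of where the patch above leads, the third character of the mode string could select only the form, with access fixed to stream. The module name `mode_sketch`, the helper `form_and_access`, and the file name are hypothetical and not part of stdlib. The 3-character mode layout (`'w t'`, `'r b'`) is assumed from the `mode_(3:3)` lookup in the diff.

```fortran
module mode_sketch
    implicit none
contains
    ! Hypothetical helper (not the stdlib implementation): pick the form
    ! from the third mode character and always use stream access.
    subroutine form_and_access(mode, form, access)
        character(3), intent(in) :: mode
        character(:), allocatable, intent(out) :: form, access
        select case (mode(3:3))
        case ('t')
            form = 'formatted'      ! text
        case ('b')
            form = 'unformatted'    ! binary
        case default
            error stop "unsupported mode character"
        end select
        access = 'stream'
    end subroutine form_and_access
end module mode_sketch

program mode_demo
    use mode_sketch, only: form_and_access
    implicit none
    character(:), allocatable :: form, access
    integer :: u, a(3)

    ! Write and read back three integers through a formatted stream.
    call form_and_access('w t', form, access)
    open(newunit=u, file='mode_demo.txt', status='replace', action='write', &
         form=form, access=access)
    write(u, *) 1, 2, 3
    close(u)

    call form_and_access('r t', form, access)
    open(newunit=u, file='mode_demo.txt', status='old', action='read', &
         form=form, access=access)
    read(u, *) a
    close(u, status='delete')

    print *, all(a == [1, 2, 3])
end program mode_demo
```

The built-in `read` and `write` statements are used unchanged here. Only the `open` arguments differ from the sequential case.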
I'll experiment some more. This is a simple case. I'm curious if In |
Well, I was really hoping that our For the OO interface, there you don't have to be compatible with any built-ins. So one can indeed design it in any way you like. Then we can provide all the necessary functions in the low level API. This |
You're still compatible with built-in read and write statements. They'd just be reading/writing in stream mode rather than sequential. I don't argue for stream because I love it, but rather because I think sequential access makes for more awkward behavior and API. |
Let's gain some experience with this, I need to see the details. |
I see @milancurcic's point. Regarding unformatted files, I don't see what the advantages are of
Regarding formatted files, I will only consider
Here are some explanations by @zbeekman.
So my proposal is to close this PR, and to possibly change
Then we can open a new issue (or in #14) to discuss if we want to support
where
|
I agree. Let's use |
I will open a PR to modify
Should we discuss this API in #14? Or implement it and open a PR? What would be the best strategy so that many people can discuss it? |
I just did in #90, sorry about that.
I would send a PR, so that we can discuss the actual code and an API, and we can comment at #14 to discuss this at the PR. |
I will start on it, if that's OK with you |
Yes, thank you! |
I think this is okay. I'm still skeptical that |
See #91 for discussion and implementation of |
Addition of support for opening unformatted sequential files