Permit namelist character delimiters to default to quotes #67

Open
marshallward opened this issue Nov 3, 2019 · 27 comments
Labels
Clause 8 Standard Clause 8: Attribute declarations and specifications


@marshallward
Contributor

Currently there is a somewhat paradoxical issue related to namelist support, where a namelist produced by WRITE() and using the default DELIM value of None does not conform to the namelist specification when it contains a character array.

A namelist as described in F2018 requires that all character arrays be delimited with single or double quotes (13.11.3.3 p7):

When the next effective item is of type character, the input form consists of a sequence of zero or more rep-chars whose kind type parameter is implied by the kind of the corresponding list item, delimited by apostrophes or quotes.

However, the default value of DELIM=None will produce a namelist without delimiters (i.e. space-delimited). From 13.11.4.2 p1:

Values in namelist output records are edited as for list-directed output (13.10.4).

If we go back to 13.10.4, it highlights three forms of character array output based on DELIM, with NONE being the default. In other words, the default is incompatible with the namelist format.

Note 1 of 13.11.4.2 clarifies this point:

Namelist output records produced with a DELIM= specifier with a value of NONE and which contain a character sequence might not be acceptable as namelist input records.

In practice, this is often not a problem, and a robust parser can usually resolve the non-delimited strings if whitespace can act as a delimiter. But it becomes more problematic, if not impossible, when the string contains a lexical token, such as &, =, or /. For example, strings containing paths are almost guaranteed to cause problems.

To summarize, a namelist containing character arrays, when written with DELIM=NONE, does not conform to the namelist specification, and a compiler cannot read its own namelist if DELIM is not set to either QUOTE or APOSTROPHE.

Currently, gfortran ignores this requirement and defaults to using quote (") delimiters. Intel Fortran does not, and will produce a namelist that it cannot read.


While this particular problem can be avoided by requiring the programmer to use DELIM for all namelist output (e.g. DELIM='QUOTE'), it will inevitably cause problems for less experienced developers, and could lead to output with errors.
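As a minimal sketch of the workaround mentioned above, the program below opens a unit with DELIM='QUOTE', writes a namelist holding a path, and reads it back. The file name config_demo.nml, the group name config, and the variable input_path are hypothetical names chosen only for illustration.

```fortran
program nml_roundtrip
  implicit none
  character(len=64) :: input_path, expected
  integer :: u, ios
  namelist /config/ input_path

  ! A path contains '/', which is also the namelist terminator.
  input_path = '/data/run_01/input.nc'
  expected = input_path

  ! DELIM='QUOTE' on OPEN makes character values quote-delimited.
  open(newunit=u, file='config_demo.nml', status='replace', &
       action='readwrite', delim='quote')
  write(u, nml=config)

  ! Read the same file back into a cleared variable.
  rewind(u)
  input_path = ''
  read(u, nml=config, iostat=ios)
  close(u, status='delete')

  if (ios /= 0) error stop 'namelist read failed'
  if (input_path /= expected) error stop 'round trip changed the value'
  print '(a)', 'round trip ok: ' // trim(input_path)
end program nml_roundtrip
```

Removing delim='quote' from the OPEN statement is the case this issue is about: whether the file can then be read back depends on the compiler.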

I'd like to propose that this problem be addressed in some way. Two possible solutions:

  • Require that DELIM be set to either DELIM='QUOTE' or DELIM='APOSTROPHE' when writing a namelist. If WRITE is called with an NML argument and without DELIM, then it is an error.

  • Follow GFortran and silently use DELIM='QUOTE' (or APOSTROPHE) when the output is a namelist.

There may be other possible solutions.

This is not a problem that needs to be solved, since users could just be trained to use DELIM when writing namelists. But I believe that this change would be an improvement in usability and would help to avoid future errors.


This was prompted by the following discussion on the Intel forums:

https://software.intel.com/en-us/forums/intel-fortran-compiler/topic/831685#comment-1947391

@marshallward marshallward changed the title from "Permit namelist character delimiters to defaul to quotes" to "Permit namelist character delimiters to default to quotes" Nov 3, 2019
@FortranFan
Member

FortranFan commented Nov 3, 2019

@marshallward, I totally agree.

I personally find NAMELIST a very useful facility in Fortran even if it is generally underutilized and maligned at times. I also think the support toward NAMELIST in derived-type IO (so happy for that!) can also help with user-friendly serialization and deserialization of object instances of 'classes' (derived types) in Fortran, so anything that nourishes or otherwise keeps fresh the NAMELIST facility is a good thing in my opinion.

Your proposal looks valuable; it should be relatively easy to add to the standard if there is will for the same.

@certik
Member

certik commented Nov 3, 2019

@marshallward thanks a lot for reporting this. I agree, I think we should fix it. Let's try to write a proposal for this and let's get the committee to consider this at the next meeting.

@marshallward
Contributor Author

Thanks for the positive feedback, @FortranFan @certik . My impression from past interactions with J3 members is that there is little interest in doing anything related to namelists, so this is encouraging to hear.

I'll defer to more experienced members here on how to properly write and submit a proposal.

@milancurcic
Member

> I'll defer to more experienced members here on how to properly write and submit a proposal.

Please, please, at least start a draft (submit via PR). You already have substantial material in this original post of this thread. It will significantly lower the bar for others to add contributions to the proposal once there's something concrete there. See a current early proposal draft as an example:

https://github.com/j3-fortran/fortran_proposals/blob/master/proposals/namespace_modules/19-246.txt

I will soon put some effort into a proposal template, which will help streamline the bootstrapping of a proposal.

@marshallward
Contributor Author

Sure, in that case I'm happy to put something together. I'll look over some of the proposals and will ping for feedback when it's ready.

@certik
Member

certik commented Nov 4, 2019

Thanks @marshallward. I would like the wider community to create proposals, and then the J3 members can help refine them and provide feedback. Such a process scales well with people.

@marshallward
Contributor Author

I've drafted most of a proposal here:

https://github.com/marshallward/fortran_proposals/blob/namelist_delim/proposals/namelist_delimiters/namelist_proposal.rst

I am not entirely sure how it is supposed to look, and I did deviate a bit from the example. For example, I don't have a "use cases" section, since I was suggesting a different interpretation of existing rules, rather than a new feature. I also expect that the final section will be removed; I mostly left it in as my own personal notes.

I can either send it as a PR or suggestions can be made on my local branch, whatever is easier.

@certik
Member

certik commented Nov 18, 2019

@marshallward thanks! Send a PR please to this repository.

I would like to have a template here about what a proposal should look like so that the committee can consider it. In the meantime, create a PR and let's discuss it.

@marshallward
Contributor Author

I've created a PR here: #94

If you'd like to close this issue and move discussion there, that's fine with me.

@zjibben
Member

zjibben commented Feb 27, 2020

I shared @marshallward 's #154 proposal with the J3/JoR subgroup. Dan gave me very detailed feedback summarizing their conclusions:

  1. Changing the default behavior of namelist will change the behavior of existing programs. This may be done only for a very important reason, and a convenience issue, such as a default setting, likely does not rise to that level.

  2. The default will also affect list-directed transfers, which is likely unintended, and will be surprising to list-directed users. This also argues that a very good reason will be required to overcome the desire not to break programs.

Note that one may set the delim value via the OPEN statement or the specific transfer (read/write/print) statement in question. The ease of setting the desired behavior per connection or per transfer argues that we not proceed with a change like this one. As a changeable mode, delim may be set and reset as needed via the initial OPEN and (re)OPEN statements.
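The per-transfer form described above might look like the following sketch: the unit is opened without DELIM=, and the delimiter is chosen on the WRITE statement itself. The names meta_demo.nml, meta, and label are hypothetical, and the program only checks that an opening apostrophe appears before the value, since the exact record layout is compiler-dependent.

```fortran
program nml_delim_per_write
  implicit none
  character(len=16) :: label
  character(len=10) :: mode
  character(len=256) :: line
  integer :: u, ios
  logical :: found
  namelist /meta/ label

  ! The value contains '=', a namelist lexical token.
  label = 'a=b'

  ! No DELIM= here, so the connection keeps its default delimiter mode.
  open(newunit=u, file='meta_demo.nml', status='replace', action='readwrite')
  inquire(unit=u, delim=mode)
  print '(a)', 'connection delim mode: ' // trim(mode)

  ! DELIM= on the WRITE applies to this transfer only.
  write(u, nml=meta, delim='apostrophe')

  ! Scan the records for the apostrophe-delimited value.
  rewind(u)
  found = .false.
  do
    read(u, '(a)', iostat=ios) line
    if (ios /= 0) exit
    if (index(line, "'a=b") > 0) found = .true.
  end do
  close(u, status='delete')

  if (.not. found) error stop 'value was not apostrophe-delimited'
  print '(a)', 'value written with apostrophe delimiters'
end program nml_delim_per_write
```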

Also, we have added several features included in the last several revisions of the standard to help input/output of csv records. Specifically, the unlimited format item allows writing "item [separator item] ..." style records. Including a g data edit descriptor in the unlimited format item can treat different types (perhaps when a derived type has had a new component added, for example). The new f202x SPLIT intrinsic operates similarly to strtok(), and helps decode csv records read into a (possibly long) character entity.
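For readers unfamiliar with the unlimited format item, here is a small sketch of the "item [separator item] ..." idiom using the g0 edit descriptor. The item values are arbitrary; reals are left out because the number of digits g0 prints for them varies by compiler.

```fortran
program csv_unlimited_format
  implicit none
  character(len=64) :: line

  ! *( ... ) repeats the group for as many items as are present.
  ! g0 picks a minimal-width edit for each item's type.
  ! The colon stops output before a trailing separator is written.
  write(line, '(*(g0, :, ","))') 1, 'abc', .true., 42

  if (trim(line) /= '1,abc,T,42') error stop 'unexpected CSV record'
  print '(a)', trim(line)
end program csv_unlimited_format
```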

These are my interpretation of JoR's conclusions and not J3's, nor WG5's. I say 'likely' above because it is a committee decision, and I do not know with certainty what the outcome of votes would be.

@marshallward
Contributor Author

marshallward commented Feb 27, 2020

Thanks very much for providing the feedback.

I'm sympathetic to point 1; both options would change default behavior, and perhaps that is an automatic veto.

I think my only response would be to point 2. I deliberately did not intend for list-directed output to be affected here; rather, DELIM would default to different values depending on whether or not NML= has been set in a WRITE() statement. If the NML= is absent, and the output is list-directed, then the default DELIM from OPEN() (or perhaps WRITE()) would be respected.

So I feel that there ought to be a solution which does not affect list-directed input. In fact, given that GFortran is already doing this, it is proof that there is a way to implement this which does not affect list-directed output.

In other words, I am requesting that the non-compliant GFortran behavior become the standard behavior.

Not sure if there's any constructive way to relay this feedback, so perhaps I am just talking to the air. But thanks very much for presenting the proposal and sending the feedback, I really do appreciate it.

@certik
Member

certik commented Feb 27, 2020

@marshallward is your new proposal (make GFortran behavior the default) different to what you submitted above? If so, let's write a new proposal and we will present it at the next meeting.

@marshallward
Contributor Author

These were the two proposals:

A. If namelist-group-name appears, then a DELIM= specifier with the
value of either APOSTROPHE or QUOTE shall also appear.

B. If namelist-group-name appears and a DELIM= specifier has the value
of NONE, then this value is ignored and the data transfer uses a
value of APOSTROPHE.

Seems pretty explicit that the change would only apply if NML= appears.

@zjibben
Member

zjibben commented Feb 28, 2020

You are very welcome, I'm happy to help! Our response from the committee has overall been very positive, I think they are happy to have this venue to interact with the programmer community. As @certik said, if we work on a new proposal we can present it at the next meeting.

@marshallward
Contributor Author

Oh, I may not have addressed your question @certik . Yes, AFAIK option (B) does describe default behavior in GFortran (perhaps using QUOTE rather than APOSTROPHE).

@certik
Member

certik commented Feb 28, 2020

@marshallward ok, then let's write a proposal just for B, incorporating any feedback so far, and we can present it at plenary next time.

@klausler

> Seems pretty explicit that the change would only apply if NML= appears.

Well, a namelist-group-name would have to appear; the NML= specifier is (very unfortunately!) optional.

(If NML= were always required, NAMELIST group names could inhabit their own namespace.)

@marshallward
Contributor Author

Thank you @klausler , in that case I am mistaken about the NML= part.

But given that the interpreter is capable of identifying the argument as a namelist-group-name, should it also not be possible to identify the WRITE() operation as a namelist output and therefore not list-directed output? If so, then I think that the current wording might be OK to leave list-directed output unaffected?

@klausler

> Thank you @klausler , in that case I am mistaken about the NML= part.
>
> But given that the interpreter is capable of identifying the argument as a namelist-group-name, should it also not be possible to identify the WRITE() operation as a namelist output and therefore not list-directed output? If so, then I think that the current wording might be OK to leave list-directed output unaffected?

I haven't thought much about your conclusion, but the first part of your statement is correct; it is the case that the compiler can distinguish a namelist write ([NML=]group) from a list-directed write ([FMT=]*).
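To illustrate that the NML= keyword is optional, the sketch below writes the same group twice, once with the keyword and once positionally, then checks that the two halves of the file are identical. The names run_opts_demo.nml, run_opts, and steps are hypothetical.

```fortran
program nml_keyword_optional
  implicit none
  integer :: steps, u, ios, nlines, i, half
  character(len=80) :: lines(20)
  namelist /run_opts/ steps

  steps = 3
  open(newunit=u, file='run_opts_demo.nml', status='replace', action='readwrite')
  write(u, nml=run_opts)   ! keyword form
  write(u, run_opts)       ! positional form, still a namelist write
  rewind(u)

  ! Read every record back as plain text.
  nlines = 0
  do while (nlines < size(lines))
    read(u, '(a)', iostat=ios) lines(nlines + 1)
    if (ios /= 0) exit
    nlines = nlines + 1
  end do
  close(u, status='delete')

  ! Both writes should have produced the same records.
  if (nlines == 0 .or. mod(nlines, 2) /= 0) error stop 'unexpected record count'
  half = nlines / 2
  do i = 1, half
    if (lines(i) /= lines(i + half)) error stop 'the two forms differ'
  end do
  print '(a)', 'both forms produced identical namelist output'
end program nml_keyword_optional
```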

@marshallward
Contributor Author

I've updated the proposal here:

#159

It removes the first option and makes explicit that there is no intention to change list-directed output. I also (hopefully) cleaned up and clarified the problem in the introduction.

@marshallward
Contributor Author

I would also like to add a rebuttal to point 1. I don't consider this simply a matter of convenience. I see it as a remedy to a real problem, which is that the default namelist output is not a namelist and is potentially unparseable.

Perhaps it is considered a convenience simply because it can be avoided by a careful programmer who judiciously uses DELIM for every unit destined to produce a namelist. But I do consider it a concern when two very simple commands:

open(5, file=some_path)
write(5, nml=some_nml)

produce a thing which is almost - but not quite - a namelist. So in my mind, this is an important reason for changing the default behavior.

Again, I don't really know if there's any way to pass along these comments. I suppose I'm just adding it here for the sake of discussion.

@klausler

klausler commented Apr 7, 2021

Suggestion: add a new OPEN/INQUIRE argument with a distinct name, say NAMELIST_DELIM=, that would be independent of DELIM= and allow you to get what you want for a whole run with just one statement.

@marshallward
Contributor Author

marshallward commented May 26, 2021

@klausler Sorry for not replying to this at the time. I think the idea of a separate namelist delimiter argument is a very good compromise.

  • list-directed IO is left unchanged
  • In most cases, the argument would be unneeded
  • the default could be left unspecified, to the discretion of the compiler.

When I find some time, I'll revise the proposal to recommend this.

@certik certik added the Clause 8 Standard Clause 8: Attribute declarations and specifications label Apr 23, 2022
@vansnyder

A related problem is that namelist and list-directed input are less similar than they could be. In list-directed input, a character datum that does not contain a blank need not be surrounded by quotes or apostrophes. In namelist, it's always required.
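The list-directed half of that comparison can be sketched as follows: a character datum with no embedded blank is read without any quotes or apostrophes. The record contents and variable names here are hypothetical.

```fortran
program list_directed_undelimited
  implicit none
  character(len=32) :: record
  character(len=16) :: name
  integer :: count, ios

  ! No delimiters around 'alpha'; the blank separates the two values.
  record = 'alpha 42'
  read(record, *, iostat=ios) name, count

  if (ios /= 0) error stop 'list-directed read failed'
  if (name /= 'alpha' .or. count /= 42) error stop 'unexpected values'
  print '(a, i0)', trim(name) // ' ', count
end program list_directed_undelimited
```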

Here's a weird (but handy) use case.

Put what looks like namelist input on your command line (without the namelist name) -- name=value pairs.

Read the entire command line (not individual command arguments) into a character variable, after putting an ampersand and the namelist name at the beginning. Add a slash at the end. Read the variable using namelist input. Now you have all your command-line arguments in one gulp, and it's really easy to change them. But if you have a character variable in the namelist, you need to put it into quotes or apostrophes. But command shells treat quotes and apostrophes specially, so you need to escape them. Ugh.
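A rough sketch of that trick, under some assumptions: the string args stands in for the text that a real program would obtain from the command line (for example via GET_COMMAND, with the program name stripped off), and the group opts with members n, dt, and tag is hypothetical. Namelist input begins with an ampersand and the group name and ends with a slash, so the buffer is assembled that way before an internal namelist read.

```fortran
program nml_from_args
  implicit none
  integer :: n, ios
  real :: dt
  character(len=16) :: tag
  character(len=256) :: args, buffer
  namelist /opts/ n, dt, tag

  ! Defaults, used for any name not given on the command line.
  n = 0
  dt = 0.0
  tag = ''

  ! Stand-in for the command-line text; note the quoted character value.
  args = "n=4 dt=0.5 tag='run1'"

  ! Wrap the name=value pairs as a namelist record and read it in one go.
  buffer = '&opts ' // trim(args) // ' /'
  read(buffer, nml=opts, iostat=ios)

  if (ios /= 0) error stop 'namelist read of arguments failed'
  if (n /= 4 .or. abs(dt - 0.5) > 1.0e-6 .or. tag /= 'run1') &
    error stop 'unexpected argument values'
  print '(a, i0, a, f4.1, a)', 'n=', n, ' dt=', dt, ' tag=' // trim(tag)
end program nml_from_args
```

The apostrophes around run1 are the part a shell would require escaping, which is the inconvenience described above.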

@tclune
Member

tclune commented Apr 6, 2023

I'm looking at this issue belatedly. When I read the introductory text at the top of the thread (and Marshall's private email to me), I thought that there was actually an inconsistency in the standard. But now, reading more closely, it appears that this is more of a feature request, as the standard already acknowledges the issue in Note 1 of 13.11.4.2.

If anyone here thinks this is an INTERP instead of a feature request, please tell me.

@marshallward
Contributor Author

@tclune From my novice perspective:

The inconsistency is that the namelist must have either QUOTE or APOSTROPHE as its DELIM (cited above), but is permitted to be opened with DELIM=NONE, which is the default.

The output is therefore a thing which is not a namelist, since it has space-delimited strings.

I don't know where the line is between INTERP and feature request. I also agree that the standard acknowledges the situation and urges users to set DELIM. But it seems a bit off when the optional DELIM is more or less required for all practical purposes.

@tclune
Member

tclune commented Apr 6, 2023

OK - I've fired off a brief summary and a link to this thread to the interp subgroup. My guess is that they (he) will say this is not an interp.

At that point I can post to the J3 group that this issue exists and we can see if it resonates with anyone. But if the relevant subgroup (JoR) did not take it up before, I don't know what will have changed.

8 participants