Additional format specification languages #340

Open
awvwgk opened this issue Mar 13, 2021 · 10 comments
Labels
idea Proposition of an idea and opening an issue to discuss it topic: utilities containers, strings, files, OS/environment integration, unit testing, assertions, logging, ...

Comments

@awvwgk
Member

awvwgk commented Mar 13, 2021

Formatting and pretty-printing output is a common task in any programming language. Fortran's format specifiers are unusual compared with the C-like or Python-like format specifiers used in other programming languages. While Fortran's format specifiers are probably older than the C-like or Python-like ones, the latter have found wide adoption across several programming languages.

This raises the question of whether stdlib could offer a stdlib_format module to allow formatting of strings using C-like and/or Python-like format specifiers.

Original post by @ivan-pi in #337 (comment):

Should the user formatting re-use the Fortran formatting conventions or do we want to adopt something like the Format Specification Mini-Language in Python?

Personally I would be in favor of having a function called format (like the Python .format() or the C++ 20 std::format) for formatted string conversion, but I'm not sure we can really get there in standard Fortran. Probably we would need to limit the number of arguments and use the class(*) approach like in M_msg.

@awvwgk awvwgk added topic: utilities containers, strings, files, OS/environment integration, unit testing, assertions, logging, ... idea Proposition of an idea and opening an issue to discuss it labels Mar 13, 2021
@epagone

epagone commented Mar 13, 2021

I'm not sure about this. I think that format specification in Fortran is different but works quite well. What I felt Fortran really needed in the past was the g0 descriptor, which (luckily) we have now had for a few years.

I might be missing something, though. Is there a "feature" advantage in adopting C or Python like format specifiers (i.e. something that Fortran cannot do or does in an inconvenient way)?

@milancurcic
Member

Related: #19

I like and support this idea (whether C- or Python-style, or both). I agree that Fortran formatting works well, but alternative format specifications could be helpful to newcomers who are familiar with some other language.

@ivan-pi
Member

ivan-pi commented Mar 14, 2021

I think the easiest way to enjoy the best of both worlds is to have a converter function, e.g. given the following C format string

"Color %s, Number %d, Float %4.2f"

it should produce the Fortran equivalent

"('Color ',A,', Number ',I0,', Float ',F4.2)"

This way you could easily inline it:

character(len=5) :: str = "Red"
integer :: i = 3
real :: a = 42.0
write(*,cfmt("Color %s, Number %d, Float %4.2f")) str, i, a

Addendum: I imagine regex would be the way to do this, by searching for patterns of % followed by the format letters and replacing them with Fortran specifiers. Care needs to be taken with escape characters (e.g. %% for a literal percent sign).
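To make the regex idea a bit more concrete, here is a toy prototype of the translation step. It is written in Python purely for quick experimentation; a stdlib version would of course be Fortran. The name cfmt comes from the snippet above, while the supported subset (%s, %d, %f with optional width and precision, and %%) and the choice of I0 / F0.6 as defaults are assumptions of this sketch, not a proposed specification.

```python
import re

# One printf-style directive: "%%", or "%[width][.precision]letter".
# Only a small subset is recognised; this is a sketch, not a full printf parser.
_SPEC = re.compile(r"%(?:(%)|(\d+)?(?:\.(\d+))?([a-zA-Z]))")


def cfmt(c_format):
    """Translate a small subset of C format strings to a Fortran format string."""
    items = []    # Fortran edit descriptors and quoted literals, in order
    literal = []  # pieces of literal text not yet emitted

    def flush():
        # Emit pending literal text as a quoted Fortran character literal.
        text = "".join(literal)
        literal.clear()
        if text:
            items.append("'" + text.replace("'", "''") + "'")

    pos = 0
    for m in _SPEC.finditer(c_format):
        literal.append(c_format[pos:m.start()])
        pos = m.end()
        percent, width, prec, conv = m.groups()
        if percent:
            literal.append("%")  # "%%" is an escaped percent sign
            continue
        flush()
        if conv == "s":
            items.append("A" + (width or ""))
        elif conv == "d":
            items.append("I" + (width or "0"))
        elif conv == "f":
            # Assumed defaults: minimal width (0) and C's default precision (6).
            items.append("F" + (width or "0") + "." + (prec or "6"))
        else:
            raise ValueError("unsupported conversion: %" + conv)
    literal.append(c_format[pos:])
    flush()
    return "(" + ",".join(items) + ")"


print(cfmt("Color %s, Number %d, Float %4.2f"))
```

One caveat this glosses over: a C field width is a minimum, whereas a Fortran field width is fixed, so writing 42.0 with F4.2 fills the field with asterisks while C's %4.2f would print 42.00. A real converter would have to decide how to handle that difference.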

@epagone

epagone commented Mar 15, 2021

Thinking more about it, probably there is also the advantage that C/C++/Python format specification is more compact and avoids the (potentially annoying) typical combination of single and double quotes. +1 also for me.

@ivan-pi
Member

ivan-pi commented Mar 16, 2021

I see now that my previous idea of directly mimicking printf (C) or format (C++) is not well suited to Fortran due to the absence of variadic functions.

Instead we just need these format specifier "adaptors", e.g. the C++ code:

 std::cout << std::format("Hello {}!\n", "world");

would be in Fortran

use iso_fortran_env, only: stdout => output_unit
write(stdout, format("Hello {}!\n")) "world"       ! obviously, this is a contrived example

One downside is that, since this would not be a built-in function, syntax errors in the format specifier or a wrong number of arguments can't be caught at compile time. Instead, the program terminates at runtime with a potentially cryptic error message. To avoid this, the format function would need to validate its input first and terminate with a helpful message, making it an impure function. Still, there would be no way to protect against a mismatch in the number of arguments. One could perhaps overcome these issues by making format a preprocessor macro, but this would again make it a non-portable solution.

I'd still argue that having a C-style format function would be welcome and would make Fortran I/O easier in some situations. It would, however, be mostly the caller's responsibility to get the C formatting string right. For a programmer familiar with C or Python format specifiers, a few edit-compile-run cycles might be easier than learning the Fortran format specifications.

@certik, do you think this is worth prototyping in LFortran at some point? What I mean is the proposed format function would still be a (non-standard) stdlib thing, but LFortran would offer compile-time format syntax checking. This would imply the compiler gives stdlib some type of elevated status.

@arjenmarkus
Member

arjenmarkus commented Mar 16, 2021 via email

@ivan-pi
Member

ivan-pi commented Mar 16, 2021

That said, one of the major drawbacks IMHO of C-style formats is the impossibility to group them and to have repetition: write(*, '(10(a,i5))' ) ( string(i), value(i) , i=1,100) for instance would have to be done using an explicit do-loop in C and some logic to add a newline at the right moment. Well, just my pet peeve :).

That is a great feature of Fortran indeed. I have been using the "infinite list" specifiers, such as "(*(I0,:,2X))" a lot in my work lately. I see the benefit of C-/Python-like formatting mainly when I want to combine textual output (sentences) with numeric values.
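For comparison, this is roughly the explicit loop that a grouped descriptor such as (10(a,i5)) replaces. It is only a sketch in Python; the function name write_grouped and its parameters are made up for illustration. Note also that Python's 5d is a minimum width, whereas Fortran's I5 is a fixed-width field.

```python
def write_grouped(strings, values, per_line=10, width=5):
    """Mimic '(10(a,i5))': emit (string, integer) pairs, starting a new line
    after every `per_line` pairs, as Fortran format reversion would."""
    pairs = list(zip(strings, values))
    lines = []
    for start in range(0, len(pairs), per_line):
        chunk = pairs[start:start + per_line]
        # Each pair is the string as-is followed by a right-justified integer.
        lines.append("".join(f"{s}{v:{width}d}" for s, v in chunk))
    return lines


for line in write_grouped(["x="] * 25, range(25)):
    print(line)
```

The bookkeeping for chunking and line breaks is exactly what the Fortran format expresses in a single descriptor.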

@epagone

epagone commented Mar 16, 2021

Since this seems to require quite a bit of work regardless, why don't we aim at combining the best of both worlds, i.e. combining the "compactness" of C/Python style and the group repetitions of Fortran?

I have found an excellent comment by @14NGiestas that might be used as a starting point.

One could perhaps overcome these issues by making format a preprocessor macro, but this would again make it a non-portable solution.

If this can be achieved with fypp, I don't see a problem, since stdlib already depends on it.

PS: just realised that @ivan-pi already used even the same expression ("best of both worlds") in a previous comment above, sigh, sorry for the repetition.

@arjenmarkus
Member

arjenmarkus commented Mar 16, 2021 via email

@ivan-pi
Member

ivan-pi commented Mar 16, 2021

One could perhaps overcome these issues by making format a preprocessor macro, but this would again make it a non-portable solution.

If this can be achieved with fypp, I don't see a problem, since stdlib already depends on it.

The preprocessing would have to be done on the calling code. I don't think this is a viable option until fpm evolves to offer some default preprocessor. I think we should explore the format function first.

I found a few more comments in the proposal of @gronki: j3-fortran/fortran_proposals#69, the most relevant being:

Option 2: revive format as "function". The downside is that it's quite a long word and would not be very clear when used directly in print/write/read. Example:
character(len = *), parameter :: fmt = format("important parameter = ", f6.1, " for n = ", i3)

I guess we will need to adopt a grammar (and use it to automatically create a parser), but it might be good to create a toy implementation first (with limited support of strings, integers, and reals) just to experiment with syntax.

So the Python formatting syntax begins by defining "replacement fields", which are the text portions surrounded by curly braces {}. The grammar of these is:

replacement_field ::=  "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
arg_name          ::=  [identifier | digit+]
attribute_name    ::=  identifier
element_index     ::=  digit+ | index_string
index_string      ::=  <any source character except "]"> +
conversion        ::=  "r" | "s" | "a"
format_spec       ::=  <described in the next section>

The field_name and conversion fields allow referencing positional or named arguments, or performing a conversion, e.g. calling the repr(), str(), or ascii() methods prior to output. Some examples of how the field names are used are shown below:

"First, thou shalt count to {0}"  # References first positional argument
"Bring me a {}"                   # Implicitly references the first positional argument
"From {} to {}"                   # Same as "From {0} to {1}"
"My quest is {name}"              # References keyword argument 'name'
"Weight in tons {0.weight}"       # 'weight' attribute of first positional arg
"Units destroyed: {players[0]}"   # First element of keyword argument 'players'.

Since the result of the stdlib format function would be detached from the actual write or print statement, I don't think we can support such use cases. This leaves us with implicitly defined positional variables, and the definition of the format_spec part.
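As a quick way to experiment with that restricted subset, Python's own string.Formatter().parse() already splits a template into literal text, field name, format spec and conversion. The sketch below uses it to accept only bare {} or {:spec} fields and to reject everything else; the helper name implicit_fields and the decision to raise on named, indexed or converted fields are assumptions made for illustration.

```python
from string import Formatter


def implicit_fields(template):
    """Return the format_spec of every replacement field in `template`,
    accepting only implicit positional fields: '{}' or '{:spec}'."""
    specs = []
    for _literal, field_name, format_spec, conversion in Formatter().parse(template):
        if field_name is None:
            continue  # trailing literal text without a replacement field
        if field_name:
            # Covers {0}, {name}, {0.weight}, {players[0]}, ...
            raise ValueError("explicit field name not supported: " + repr(field_name))
        if conversion:
            raise ValueError("conversion not supported: !" + conversion)
        specs.append(format_spec)
    return specs


print(implicit_fields("From {} to {}"))
print(implicit_fields("x = {:8.3f}, n = {:d}"))
```

Each returned spec would then still need to be mapped onto a Fortran edit descriptor, which is the part where the format_spec grammar has to be pinned down.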

Addendum: positional arguments can be pursued with Arjen's workaround, but I guess this involves a custom write subroutine.

I learned that Intel Fortran actually supports some form of named "variable" interpolation, limited to format strings, as one of its extensions. With this extension you can do things such as:

write(*,'(3f15.3,<nvari>f9.2)') x,y,z,(var(i),i=1,nvari)

5 participants