From e4592868396efc408ac41d5d5b987726931adcc2 Mon Sep 17 00:00:00 2001
From: Peter Hill
Date: Fri, 13 May 2022 15:10:51 +0100
Subject: [PATCH] Add first draft of scoped enums

---
 .../scoped_enumerators/scoped-enumerators.txt | 287 ++++++++++++++++++
 1 file changed, 287 insertions(+)
 create mode 100644 proposals/scoped_enumerators/scoped-enumerators.txt

diff --git a/proposals/scoped_enumerators/scoped-enumerators.txt b/proposals/scoped_enumerators/scoped-enumerators.txt
new file mode 100644
index 0000000..864f57b
--- /dev/null
+++ b/proposals/scoped_enumerators/scoped-enumerators.txt
@@ -0,0 +1,287 @@
+To: J3                                                     J3/XX-XXX
+From: Peter Hill
+Subject: Scoped Enumerators
+Date: 2022-05-13
+#Reference:
+
+Overview
+========
+
+The authors _strongly_ believe that the current proposal for typed
+enumerators, as described in J3/21-120 and subsequent revisions, should
+be amended so that enumerator names are class two names. That is,
+enumerator names must be local to each `enumeration type` and not in
+the surrounding scope. Some suggestions for syntax are made.
+
+The rationale for making enumerator names class one (defined in the
+enclosing scope) is that this is "simpler". This is unfortunately too
+simple, and, unlike many of the other proposed and withdrawn features,
+if this becomes standardised it will be irrevocable. Extensible enums,
+arbitrary enumerator values, and underlying types could all be added
+to future standard versions, but unscoped enums are forever. Code
+written using unscoped typed enums would not be compatible with a
+future standard that introduced scoped typed enums (or worse, a third
+type of enum).
+
+Requiring enumerator names to be class two (defined local to the
+type), and their values to be accessed through some sort of namespace
+(for example, ` % `), will provide benefits consistent with
+typed enums while still keeping open the possibility of a mechanism to
+"hoist" names into the enclosing scope (for example, see the C++
+"Using enum" paper P1099R5, which introduced this mechanism into C++20:
+https://atomgalaxy.github.io/using-enum/using-enum.html).
+
+As an additional benefit, introducing a syntax for
+type-namespace-access (for example the familiar component access `%`)
+leaves open the future possibility of a syntax for accessing
+`parameter` components or `nopass` type-bound procedures without the
+need for an instance of the type (usually called "static access" or
+similar in other programming languages).
+
+The rest of this paper will briefly review enums in other languages,
+the history of the typed enum proposal in Fortran, and the
+justification for changing to class two names.
+
+A brief review of `enum` in other programming languages
+=======================================================
+
+Enumerated types appear in many high-level programming languages, from
+Ada to Python. Of the most popular languages with a built-in `enum`
+facility, the majority introduce enumerator names within a unique
+scope for each enumeration type, so that referring to a particular
+enumerator value requires a namespace or class prefix.
+
+Languages which use global or enclosing scope:
+
+- Ada
+- C
+- Go
+- Pascal (some implementations enable local scopes instead)
+- Raku
+
+
+Languages which use local or own scope:
+
+- C#
+- C++ (with `enum class` in C++11)
+- D
+- Dart
+- Java
+- PHP
+- Python
+- Rust
+- Swift
+- TypeScript
+
+
+Of the two lists above, it is notable that all of the languages that
+use a local scope are strongly typed, with the exception of PHP.
+While Ada, Go, and Pascal are also strongly typed, Ada provides a
+facility for resolving conflicting enumerator names, and the Free
+Pascal implementation supports scoped enums.
+
+Some of these languages go even further and make their enums
+fully-fledged sum or algebraic types, where each enumerator may have a
+different type.
+
+Other languages, such as JavaScript and Common Lisp, do not have a
+language-level enum but have other mechanisms for emulating one.
+
+Modern strongly typed languages that have enums have almost uniformly
+made them scoped enums, where enumerators must be accessed through the
+enum type. Some languages also provide mechanisms for abbreviated
+access: C++ has `using enum`, and Swift allows dropping the type name
+in certain contexts where it is unambiguous.
+
+
+History and current status of scoped enumerations in Fortran
+============================================================
+
+`enum` was introduced in Fortran 2003 to enable interoperability
+between constants defined in Fortran and C. Similar to `enum` in C,
+the enumerators in Fortran's `enum` are defined in the enclosing
+scope.
+
+The following proposals (and their revisions) discuss enhanced `enum`s
+in some fashion:
+
+- 18-114: Enumeration types
+- 18-256: enums
+- 19-216: Enumeration types [US21]
+- 19-230: Formal Requirements for True Enumeration types
+- 19-231: Formal Requirements for True Enumeration types
+- 19-232: Formal Syntax for True Enumeration types
+- 19-239: Consider extended scope for enumeration types?
+- 19-249: Syntax for True Enumeration types
+- 21-110: Enumeration types discussion specs and syntax
+- 21-111: US-21 Typed enumerators
+- 21-120: Simpler enumeration types specs and syntax
+- 21-121: Enum types edits
+- 21-132: Simpler enumeration types, edits
+- 21-179: Enumerator accessibility and constructor
+- 21-182: US 21 Enumeration type.
+- 21-183: US 21 Enum type name.
+- 21-186: is not connected to high-level
+  syntax
+- 21-189: Interoperable enum types, additional specs/syntax/edits
+- 21-190: Editorial/technical fixes, mostly for enum types
+- 22-114: First Enumerator in an enumeration type should allow a
+  value.
+- 22-122: Enumeration type constructor
+
+The first paper, 18-114, suggests that enumerator names are introduced
+into the enclosing scope but provides a mechanism for resolving conflicts:
+
+> Example:
+>
+>   ENUMERATION, ORDERED :: COLORS
+>     ENUMERATOR :: WHITE, BROWN, GREEN
+>   END ENUMERATION COLORS
+>
+>   ENUMERATION, ORDERED :: NAMES
+>     ENUMERATOR :: GREEN, BROWN, WHITE
+>   END ENUMERATION NAMES
+>
+>   NAMES(GREEN) is a value of type NAMES. COLORS(GREEN) is a value of
+>   type COLORS. Their numeric representations are different. GREEN
+>   alone is prohibited because its type is ambiguous.
+
+19-230r2 has a looser requirement:
+
+> F. The name of an enumerator must be unique within the enumeration type,
+>    but may be the same as the name of an enumerator of another type.
+>    Some mechanism shall allow access to both enumerators from a
+>    scope.
+
+19-231r2 goes further and recommends scoped enumerators:
+
+> 13. [F] An enumerator is a constant expression.
+>
+> 14. [F] Accessing the name of an enumeration type by use association does
+>         not make any of its enumerator names available (because
+>         enumerator names are not class (1) names).
+>
+> 15. [F] The syntax for denoting an enumerator shall involve the
+>         enumeration type name and no other class (1) name. If the
+>         enumeration type name is inaccessible, this syntax shall not
+>         be available.
+
+19-232 suggests some syntax:
+
+> (e) Syntax for "qualified denotation", i.e. specifying an enumerator within
+>     a specific named enumeration type. As the enumerators live inside the
+>     type, just use the type name followed by the type-access token %
+>     and the enumerator name, e.g.
+>         MY_ENUM % HIDDEN_ENUMERATOR_NAME
+>
+> ALTERNATIVE SYNTAX:
+> (e1) Use a new token instead of reusing %; e.g.
+>         MY_ENUM ` HIDDEN_ENUMERATOR_NAME
+>
+> (e2) Use the type constructor with the inaccessible enumerator name as its
+>      argument, e.g.
+>         MY_ENUM ( HIDDEN_ENUMERATOR_NAME )
+>
+> COMMENT: e2 syntax is ambiguous if MY_ENUM is to be treated as a generic
+>          function, as we do with derived types since Fortran 2003. Even if
+>          there is no ambiguity in a particular example, it promulgates
+>          confusion between local entity names and names inside a type.
+
+The withdrawn paper 19-249 notes, in regard to the
+`MY_ENUM % HIDDEN_ENUMERATOR_NAME` syntax:
+
+> {We do not ever use " %" for anything.}
+
+Then in 21-120r3, the scope requirement is removed:
+
+> (g) Enumerators come without any enhanced namespace management, i.e. their
+>     names are normal class one names.
+
+with the following comments:
+
+> (13) I note that enumerators as class one names is not only simpler
+>      but what WG5 asked us to do in the first place.
+> (13) Fortran already has basic namespace management (USE ONLY and
+>      renaming). Some ideas for extensions to that have already been
+>      floated (for a future revision, possibly F202y); we should not
+>      preempt that with a complicated feature here.
+
+21-179r2 has proposed specification wording for name conflicts:
+
+> [529:20 19.2.1p3 Classes of local identifiers] After (7.5.10) insert
+> "and enumerators of different enumeration types may have the same
+> name. If enumerators of different enumeration types that have the
+> same name are accessible in a scoping unit, they shall not appear
+> within that scoping unit except as the in an
+> ."
+
+This is the current state of the scoping of enumerator names.
+
+Why enumerator names must be class two in F202X
+===============================================
+
+Consider the following example:
+
+```
+module integration
+  enumeration type :: integration_result_t
+    enumerator :: ok, unconverged, too_many_iterations
+  end enumeration type integration_result_t
+
+  enumeration type :: io_result_t
+    enumerator :: ok, no_directory, wrong_permission
+  end enumeration type io_result_t
+
+contains
+  subroutine write_integration(array)
+    real, intent(inout), dimension(:) :: array
+    type(integration_result_t) :: integration_result
+    type(io_result_t) :: io_result
+
+    integration_result = integrate(array)
+    ! Not possible with current specification
+    if (integration_result /= ok) error stop "Bad integration"
+
+    io_result = write_array(array)
+    ! Not possible with current specification
+    if (io_result /= ok) error stop "Couldn't write file"
+  end subroutine write_integration
+
+  :
+end module integration
+```
+
+With the currently proposed specification, the shared enumerator name
+`ok` cannot be referenced directly within the module that defines
+these two enums. The second comment (13) on 21-120r3 says that
+`USE ONLY` and renaming are sufficient, which suggests a workaround
+would be to put the definitions of `integration_result_t` and
+`io_result_t` in separate modules.
+
+This, however, has at least two downsides: enums would have to be
+physically separated from where they logically belong, and renaming
+would have to be done in every scope where they are used. Both impose
+burdens on the code maintainers and readers, complicating code
+structure and allowing for inconsistent renaming in different
+locations.
+
+Another workaround would be for the enumerators to have globally
+unique names, and this is indeed the advice for C developers. This
+again imposes additional burdens on the maintainers to ensure names
+really are unique.
+
+A further downside of making enumerator names be class one is that
+using them in another module would require naming each one in a `USE
+ONLY` statement. Using the enums defined in the above example, we
+would have:
+
+```
+module time_solver
+  use integration, only : integration_result_t, ok, unconverged, &
+    too_many_iterations
+  :
+```
+
+With class two names, only the enum type itself would need to be
+imported, and its values would be accessible through the enum type
+name.
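
The contrast drawn in this section can be sketched in C++, one of the
scoped-enum languages listed in the review above. This is an illustration
only, not proposed Fortran syntax: the namespaces, the `legacy_*` enums, and
the stub functions are hypothetical names chosen to mirror the example, and a
C++ using-declaration is only a rough analogue of `USE ONLY`.

```cpp
#include <cassert>

// All names here are hypothetical and mirror the Fortran example above.
namespace integration {

// Unscoped (C-style) enums place their enumerators in the enclosing scope,
// so two enums in one namespace cannot both declare `ok`. The usual
// workaround is a hand-maintained prefix on every enumerator.
enum legacy_integration_result { integration_ok, integration_unconverged };
enum legacy_io_result { io_ok, io_no_directory };

// Scoped enums keep the enumerators inside the type, so both may use `ok`.
enum class integration_result_t { ok, unconverged, too_many_iterations };
enum class io_result_t { ok, no_directory, wrong_permission };

// Stand-ins for the elided `integrate` and `write_array` procedures.
inline integration_result_t integrate() { return integration_result_t::ok; }
inline io_result_t write_array() { return io_result_t::wrong_permission; }

}  // namespace integration

namespace time_solver {

// Rough analogue of `use integration, only : integration_result_t`:
// only the type name is imported, yet every enumerator stays reachable.
using integration::integration_result_t;

inline bool step_succeeded() {
  return integration::integrate() == integration_result_t::ok;
}

inline void check_example() {
  assert(step_succeeded());
  // Comparing integration_result_t::ok with io_result_t::ok would not
  // compile, because the two enumerators have distinct, unrelated types.
  assert(integration::write_array() != integration::io_result_t::ok);
}

}  // namespace time_solver
```

Calling `time_solver::check_example()` from a test driver exercises both
comparisons. The point of the sketch is that the importing scope names one
entity, the enum type, and that the two `ok` enumerators coexist without
prefixes or renaming.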