Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"of" word is not needed. #491

Closed
jordwalke opened this issue May 19, 2016 · 25 comments
Closed

"of" word is not needed. #491

jordwalke opened this issue May 19, 2016 · 25 comments

Comments

@jordwalke
Copy link
Member

type something = 
   | Constructor of int int
   | Another of (list int);

Multiple people have suggested removing the need for the of keyword. Sounds great to me, if possible and doesn't introduce grammar conflicts.

@kayceesrk

Tagging this as "good first task" for newcomers in the community to give it a shot. It should be relatively straightforward.

@SanderSpies
Copy link
Contributor

Should exception also change?

For example:

exception Key_not_found of string

would become

exception Key_not_found string

@yunxing
Copy link
Contributor

yunxing commented May 19, 2016

I think we should also change exception for consistency.

@kayceesrk
Copy link
Contributor

Exceptions should also change. exception Key_not_found of string is really syntactic sugar for extending the inbuilt extensible variant type exn and is equivalent to type exn += Key_no_found of string.

@SanderSpies
Copy link
Contributor

Alright, I want to take a go at this issue.

@SanderSpies
Copy link
Contributor

SanderSpies commented May 20, 2016

So I've been breaking my head on this issue - as this is really unfamiliar territory for me. I made a change to reason_parser.mly (removed OF for generalized_constructor_arguments) and I'm running into a reduce/reduce issue.

This is the reason_parser.conflicts that I'm getting:

** Conflict (reduce/reduce) in state 96.
** Token involved: LPAREN
** This state is reached from type_kind after reading:

EQUAL UIDENT 

** The derivations that appear below have the following common factor:
** (The question mark symbol (?) represents the spot where the derivations begin to differ.)

type_kind 
(?)

** In state 96, looking ahead at LPAREN, reducing production
** constr_ident -> UIDENT 
** is permitted because of the following sub-derivation:

EQUAL constructor_declarations 
      constructor_declaration_no_leading_bar 
      constr_ident generalized_constructor_arguments attributes // lookahead token appears because generalized_constructor_arguments can begin with LPAREN
      UIDENT . 

** In state 96, looking ahead at LPAREN, reducing production
** mod_ext_longident -> UIDENT 
** is permitted because of the following sub-derivation:

EQUAL core_type 
      mark_position_typ(_core_type) 
      _core_type 
      core_type2 
      mark_position_typ(_core_type2) 
      _core_type2 
      non_arrowed_core_type 
      non_arrowed_non_simple_core_type 
      mark_position_typ(_non_arrowed_non_simple_core_type) 
      _non_arrowed_non_simple_core_type 
      type_longident non_arrowed_simple_core_type_list 
      mod_ext_longident DOT LIDENT 
      mod_ext_longident LPAREN mod_ext_longident RPAREN // lookahead token appears
      UIDENT . 

Note that I added type_kind as a start.

I think it's related to tuples - but I'm not 100% sure.

Any hints on how to proceed with this would be very welcome.

@yunxing
Copy link
Contributor

yunxing commented May 21, 2016

@SanderSpies You are right, it is related to tuples.

Basically with the proposed syntax, the compiler cannot differentiate these two cases when the look ahead token is the highlighted ( :
type a = A (int, int);

type a = A (B).a;

When it gets to the highlighted (, it's confused by saying "I just read an A, should I be parsing it as a constructor name by reducing it to constr_ident (the first case) , or should I be parsing it as a module name by reducing it to mod_ext_longident (the second case) ? "

The debug log tells you how the compiler gets to each case.

@jordwalke
Copy link
Member Author

@yunxing it's not clear why the presence or removal of "of" would change that ambiguity.

@nmn
Copy link

nmn commented May 22, 2016

I'm a little torn on this. Reason has a great syntax to bring new people into the ML fold. Removing the 'of' would be shorter and cleaner, but it will also make it slightly more unreadable for people unfamiliar with ML-like languages.

I was surprised when I showed this syntax to a Java developer and he just got it because of the of keyword there. So while typing I'd rather not have to needlessly write of, as a reader of the code, it probably helps more than it hurts.

@jordwalke
Copy link
Member Author

@nmn I would not have predicted that. Doesn't it help that the variant tag is capitalized, and the rest are lowercase?

@nmn
Copy link

nmn commented May 24, 2016

@jordwalke I personally have no problem reading that myself, but I have experience reading functional ML-like languages. But when I was showing Reason of to not-so-functional developers, the of made things WAY more understandable for them. I'm sure once someone gets used to it, the of isn't that useful. But having it there helps people new to the language as it read more like English.

As I said, I'm torn!

@jordwalke
Copy link
Member Author

I wonder if those people were confused that expressions were written "Some x" instead of "Some of x"

@nmn
Copy link

nmn commented May 24, 2016

I'll do some more research on this.

@nmn
Copy link

nmn commented May 25, 2016

@jordwalke So most people agree that the of makes it read more like English, but they will also get used to it after using it for a bit that it should remain confusing after a few days.

Someone compared it to the function keyword in Javascript which a lot of us hated typing before ES6. But of is just two characters long and maybe it's not that much of a problem.

On the other hand, a lot of people find punning confusing ever since ES6, so I'm not sure what the right answer is.

@jordwalke
Copy link
Member Author

Okay, I think it's about tied. I will now go ask some tie breakers. I don't think it's a make or break decision but all else equal a kind of poll is a good signal.

@jordwalke
Copy link
Member Author

I asked an audience of experienced OCamlers and the unanimous feedback was that of should in fact be removed. This especially carries weight since people with experience in a language would tend to want to avoid change. I think this is a good change.

@yunxing
Copy link
Contributor

yunxing commented May 25, 2016

@jordwalke That sounds good.

@SanderSpies Do you want to take a try on this one as a puzzle? I actually have a working version on my local branch but it may be good for you to try to solve it so you can get a good understanding of our parser - printer - error message workflow.

We've discussed where to start on IRC yesterday. Feel free to ask any question either on IRC or here.

@SanderSpies
Copy link
Contributor

@yunxing Yeah, I'd like to work on this one to get a better feeling on how to solve reduce/reduce issues. I expect to be able to focus on this on Friday.

@ghost
Copy link

ghost commented May 28, 2016

One of the things I do not like about this style of type declaration (with or without the of) is that it is not consistent with how type signatures look in the rest of the language (e.g., for functions). While removing the of might be a nice change for experienced OCaml programmers, I would prefer something a bit more radical so long as syntax changes are still under consideration.

My suggestion would be to simply remove this style of type declaration from Reason altogether and replace it with the GADT style.

With the GADT style, the original example would look like this:

type something = 
  | Constructor: (int, int) -> something
  | Another: list int -> something
  ;

I think this is better for a few reasons.

First, it doesn’t require learning a different syntax for declaring constructors for algebraic types. It’s just the same syntax you use for functions. Even with the current style, you have to know that the of corresponds to the : and you have to understand that there is an implied return type (namely something).

Second, it would make the language simpler since there wouldn’t need to be a separate style for declaring GADT and non-GADTs. In general, I am in favor of simplifying the language and consolidating the syntax and this seems like a good opportunity to do that. In OCaml, removing the non-GADT syntax would be pretty much impossible now because of all the legacy code but Reason doesn’t have this problem yet.

The main downside to the GADT syntax is that it is a bit more verbose. Personally I think that is a worthwhile tradeoff.

@hcarty
Copy link
Contributor

hcarty commented May 28, 2016

@freebroccolo This syntax change makes sense if curried constructors are also added. Otherwise this would make the already initially confusing difference between constructor application and function application more confusing.

Making all variants GADTs is an interesting idea. Does removing the difference between ADTs and GADTs cause trouble anywhere for users?

@jordwalke
Copy link
Member Author

I would prefer something a bit more radical so long as syntax changes are still under consideration.

Of course. We can evolve the syntax fairly rapidly and upgrade all existing source trivially.

My suggestion would be to simply remove this style of type declaration from Reason altogether and replace it with the GADT style.

I have considered this and thought quite a bit about it actually. I wanted to do this, for exactly the reasons you mentioned (so that learning GADTs aren't such a huge step). The only reason I held off on it was because it's more verbose (I had the same proposal as you gave). I would like to find a syntax that not only unifies them but also is less verbose if possible. Any ideas? (That should probably be another github issue because it's independent of the choice of including of or not).

For anyone out there watching, the current syntax for GADTs is as follows (which is close to @freebroccolo's suggestion):

type something = 
  | Constructor of (int, int) : something
  | Another of (list int) : something;

The : was more intuitive to people (and we've gotten good feedback from OCaml devs), but that may be because the constructors are not functions. I wish they were, but for as long as they are not, perhaps the use of : makes more sense (which is why the OCaml devs had seen it as a favorable change).

#555

@jberdine
Copy link
Contributor

jberdine commented May 28, 2016 via email

@jordwalke
Copy link
Member Author

But if the parser mapped GADT syntax to ADT abstract syntax where possible, I guess that should not be an issue.

That was the thinking - that the syntax merely help people make the connection between ADTs and GADTs, where one is pretty much a special case of the other.

Type-based disambiguation of constructors works for ADTs but not
(nearly so well?) for GADTs.

Can you elaborate? Is this true for the subset of GADTs which do not exercise any GADT-ness such as:

type c = C of int : c

@jberdine
Copy link
Contributor

Type-based disambiguation of constructors works for ADTs but not
(nearly so well?) for GADTs.

Can you elaborate? Is this true for the subset of GADTs which do not exercise any GADT-ness such as:

type c = C of int : c

I'm not positive, but my impression is that the type-checker just
punts and doesn't try to disambiguate GADTs at all.

yunxing pushed a commit that referenced this issue Jun 8, 2016
Changes:

remove of from lexer
remove of from messages file
remove of from tests
remove of from printer
remove of from parser generator
@vramana
Copy link
Contributor

vramana commented Sep 4, 2016

I think this issue can closed @jordwalke

@jordwalke
Copy link
Member Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants