Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change constDecDef implementation in Python #2179

Closed
bmaclach opened this issue Jun 12, 2020 · 20 comments · Fixed by #3858
Closed

Change constDecDef implementation in Python #2179

bmaclach opened this issue Jun 12, 2020 · 20 comments · Fixed by #3858
Assignees
Labels
artifacts newcomers Good first issue to work on!

Comments

@bmaclach
Copy link
Collaborator

Currently constDecDef, GOOL's function for generating code for declaring and defining a constant, is implemented in Python by rendering just a plain variable declaration, since the concept of "constants" isn't actually supported by Python.

Since constDecDef is not actually doing what it is seems like it should in Python, we may want to change the implementation. We could:

  1. Have the Python instance of constDecDef throw an error.
  2. Follow the Python convention of naming variables that are supposed to be constant with all capital letters. In this case, the Python renderer would still generate regular variable definitions for constDecDef, but would rename the variable to be all caps.

I think I prefer option 2, because with option 1 we would not be able to generate Python code for any example where we have chosen Const as the ConstantRepresentation. Thoughts, @JacquesCarette and @smiths?

@JacquesCarette
Copy link
Owner

I agree with option 2. Including the renaming to follow Python conventions.

@smiths
Copy link
Collaborator

smiths commented Jun 13, 2020

I like option 2. In fact, I think we should think of option 2 for all programming languages. The convention that constants are in ALL_CAPS seems to be a universal convention.

@balacij balacij added newcomers Good first issue to work on! artifacts labels Apr 26, 2023
@hrzhuang hrzhuang self-assigned this May 3, 2023
@hrzhuang
Copy link
Collaborator

hrzhuang commented May 3, 2023

I think there is a distinction to be made here between "constant" and "variable that is not to be mutated."

Examples of the former include mathematical and scientific constants such as $\pi$ and the Planck's constant, in addition to program-specific constants usually declared at the module level such as the length of a fixed-length array. It is generally idiomatic to name these in ALL_CAPS.

However, in C++ in particular, it is also idiomatic to declare as const any local variable that the programmer does not intend to mutate. Consider discriminant in the C++ code below.

std::pair<double, double> solve_quadratic(double a, double b, double c) {
  const double discriminant = b*b - 4*a*c;
  return { (-b + std::sqrt(discriminant)) / (2*a), (-b - std::sqrt(discriminant)) / (2*a) };
}

It would not be idiomatic to name these in ALL_CAPS in C++ or Python. As far as I'm aware, the idiomatic thing to do in Python is to simply declare an ordinary variable, and just not mutate it.

As I understand it, constDecDef relates to the second kind of "immutable variable," since it is a statement that is local to a function body. The first kind of "true constants" should in general occur at module level, and we would declare them using the constVar method of StateVarSym, as opposed to constDecDef.

So I think constDecDef simply generating an ordinary variable declaration in Python is actually completely idiomatic?

@balacij

This comment was marked as outdated.

@balacij
Copy link
Collaborator

balacij commented May 3, 2023

One more (aside) issue: I think we generate constants in Python using a carriage class, which actually breaks PEP8 (see the link above), it should be defined on the module-level instead.

@JacquesCarette
Copy link
Owner

I guess we might need to differentiate between "forever" constants (like $pi$) and things that are constants within the context of the generated program's run-time. (Phrasing it that way also makes it likely that other 'kinds' of constants will appear, with their own notion of when they exist and are constant.)

Because some languages have conventions for naming certain kinds of constants, we're going to have to teach Drasil about this too, and make sure we treat the various kinds appropriately.

[This is fun! I'd never thought about context-sensitive notions of constant before, but now that the issue has been brought up, it makes sense.]

@hrzhuang
Copy link
Collaborator

hrzhuang commented May 3, 2023

Thank for @balacij for the clarification! I understand that my notions of "true constant" and "immutable variable" are the same on a technical level, but the distinction I'm trying to make is a semantic one. const is a tool for making a variable immutable, but what is our reason for using it? $\pi$ is const because it has one universally agreed-upon value that never changes, and discriminant in my example above is const because we want to do our best to remove ways in which we can shoot ourselves in the foot (by mutating something that doesn't need to be mutated in this case). Note that nothing about the semantics of discriminant is immutable: we can calculate it using mutation as follows.

double discriminant = b*b;
discriminant -= 4*a*c;

But we apply const to it because the calculation could be done without mutation, and that's generally a good practice.

Another way I look at this is through the lens of "scope of immutability." $\pi$ has the same value everywhere it is used, but discriminant takes on a different value for each invocation of solve_quadratic. I think when PEP 8 refer to "constants," they mean constants like $\pi$ (I take their comment that constants "are usually declared on a module level" as a clue). Another reason I think this is that if you consider how the solve_quadratic function would be implemented in Python, I don't think you would capitalize discriminant based on all of the Python code I have seen, some of which comes from talks given by core Python developers.

@balacij
Copy link
Collaborator

balacij commented May 3, 2023

Constant QDefinitions are meant to be globally scoped constants, which is why they are expected to use the const keyword where possible, and all capital letters at least wherever else. The pure math drawn up by the SRSs don't really make a distinction. For the most part, all variables solved in the IM scheme are expected to be immutable (to my knowledge) once calculated/only assigned once. I have a ticket I have to write about the design of the solution we pull from the SRSs, I'll make sure to reference this later 😊 @JacquesCarette and @smiths can probably answer this better if they have time. Otherwise, I'm sure your questions will naturally reoccur soon.

Thank you, that's a good example. Regarding the issue of C++ and Python locally defined immutable variables, you're right that it's less common in C++.

Note that we're generating a program modelled from a pure, declarative solution schematic.

@hrzhuang hrzhuang removed their assignment May 3, 2023
@B-rando1
Copy link
Collaborator

B-rando1 commented May 9, 2024

I saw this issue has the newcomers category, but I couldn't quite follow where the discussion ended. Is there something in here that I can work on?

@B-rando1 B-rando1 self-assigned this May 9, 2024
@balacij
Copy link
Collaborator

balacij commented May 9, 2024

@B-rando1 and I discussed this in person, to sum up our discussion:

In the context of OO languages, there is a difference between a "constant" variable and an "immutable variable."

Conceptually, I think it's normal to assume they're the same concept (but that could just be me...) but, in OO, where references often exist, "constant" more so corresponds to (as @hrzhuang mentioned) "cannot be re-assigned," while immutable refers to both "cannot be re-assigned" and "its attributes [and their attributes and so on] cannot be altered."
For example, global database references should refer to unique databases for specific purposes (e.g., "only store long-term things here [cold data], and use another for caching information [hot data]") and so we wouldn't want people to re-assign them, but we would want them to have an internal state (e.g., open or closed database connection) and at times that state should be mutable (e.g., if the database connection is closed unexpectedly, we can try to reconnect). So, in the context of OO languages, constant =/= immutable. However, this idea of "constant" is not only found in variables...

For example, when developing an OO program (in a "statically-typed" language), ...

  • it does not make sense to discuss "constant" OO classes because we don't modify class definitions at runtime anyway (the JVM has instruments for editing a class definition while code is loaded, however),
  • it does make sense to discuss "constant" methods because it refers to the idea that subclasses cannot overwrite them (again, the "cannot be re-assigned" meaning),
  • it does make sense to discuss "immutable classes" because it really means "classes that define the shape of immutable data," however, it is not often found (to my knowledge) in OO languages, it is usually implicit knowledge referring to a class attribute chain involving strictly "constant" data -- that is, from a new class definition, we cannot define it to be immutable without first editing all of its attributes to also be immutable first, and we can't have that automatically done for us from the class we define itself,
  • to discuss an immutable method, we should rather refer to the definition itself (which would mean the same as 'constant') or the body of the function (always immutable), and
  • we often need to discuss when a class hierarchy should stop permitting extension -- this is done with the "final" keyword in both Java and C++.

In Java, "final" is (normally) the keyword that means "write once, and now (eager)" but can also mean "no more extension," while in C++ and others, "const" is used to mean "write once, and now (eager)." In both, "write once, and now (eager)" are good for declaring "unique and important" pieces of data that can still be mutated themselves. Personally, I think "final" (or even "immutable") is the more appropriate wording for "write once, and now (eager)" (or, like Rust, where variables are default immutable, using "mut" to mark one as mutable), and "inextensible" or "terminal" should probably be used to indicate that a class definition cannot be extended further. Furthermore, I think "constant" should be reserved for things that are constant in the mathematical sense (known and unchanging), and "mutability" should be reserved for contexts where we permit mutation, which I haven't experienced in mathematics.

Ok, now, returning to the conversation:

  1. As @smiths mentioned, it seems like a universal convention in programming languages -- global constants should all be named in an uppercase format. Furthermore, I think that global variables should refer to immutable data too (global mutatable, but "constant" variables complicate testing, readability, and maintenance a lot). So, in the previous sentence "global constants" means "global, immutable data."
  2. I'm not sure what the names should be, but in GOOL, we should probably be able to refer to each of these concepts (discussed above) individually without overloading "constant."
  3. As @hrzhuang mentioned, local "write once" variables should use the normal lowercase naming schemes (even if the variable refers to immutable data too).

What do we think?

On "next steps," I think we first need to survey what features GOOL has related to this and understand its meanings across supported languages against the above discussion (which I think is very doable now), and also survey where constDecDef and other definitions appear in Drasil.

... we might also consider splitting this ticket into a few 😄

@B-rando1
Copy link
Collaborator

@balacij you mentioned the possibility of "immutable classes", which would define the shape of immutable data, but you aren't aware of any languages that do this. I've been learning Julia, and by default its structs are immutable (and it sounds like Rust is similar in that all variables are immutable by default). That is, any primitive fields in the struct can only be set once, unless the mutable keyword is added when declaring the struct. The tricky part is that for structs holding other structs, only the reference is immutable, so you can have an immutable struct carrying a mutable struct, as below.

julia> mutable struct Unsafe
       num::Real
       end

julia> struct Safe
       unsafe::Unsafe
       end

julia> a = Safe(Unsafe(5))
Safe(Unsafe(5))

julia> a.unsafe
Unsafe(5)

julia> a.unsafe.num
5

julia> a.unsafe.num = 7
7

The last line would have thrown an error if Unsafe was considered immutable in that context.

Anyway, I guess we might want to consider if we should add an "immutable class" constructor to GOOL (or do it some other way) in order to take advantage of this feature of Julia.

@JacquesCarette
Copy link
Owner

Anyway, I guess we might want to consider if we should add an "immutable class" constructor to GOOL (or do it some other way) in order to take advantage of this feature of Julia.

We might be forced to. This is how a number of features appeared in GOOL: we added a new language that enforced a PL concept that was, at best, previously optional. The various concepts already mentioned by @balacij (immutable, write-once, etc) end up being made explicit in GOOL because the various target languages have their own rules about these things.

The tricky part is that most languages deal with these concepts to various degree of implicitness (but in incompatible ways), so that in GOOL, we have no choice but to be very explicit.

@samm82
Copy link
Collaborator

samm82 commented May 27, 2024

Going back to the "original" issue of constants in Python, it seems that there may actually be a way to implement constants ("read-only" values) in Python: https://realpython.com/python-constants/#defining-strict-constants-in-python

@balacij
Copy link
Collaborator

balacij commented May 27, 2024

I will certainly remember the @property decorator for the future, thanks @samm82! That would be handy if/when we decide to build immutable objects in Python. I don't think it's a very common pattern for Python namespaces/global constants, but it could certainly work for them too.

@B-rando1
Copy link
Collaborator

I did some digging around to see what it would take to auto-capitalize constants in Python. On the surface it seems pretty easy: just change the capitalization for constDecDef and constant, and in constDecDef add an additional check to make sure that the capitalized name doesn't conflict with anything. I got that working in #3858, and it actually works perfect in the GOOL tests.

I ran into a couple larger issues in drasil-code, though.

  • It seems inconsistent with how it refers to constants. I couldn't quite track everything down, but it seems like the main problem is when constants are stored in a class. When they are created they are created using stateVarDef (constant ...), which means they are capitalized when they are created. When they are read from, however, they are referred to using staticVar, which has no sense whether it is constant; so they are not capitalized. I think in order to fix this, we would need to either add a parameter to staticVar or create a staticConstVar.
  • In drasil-code, in classVariable, it gets the name from a variable and then checks that name against its list of variable names. Since the Python renderer changes that name, it throws an error. I can think of a few ways we might be able to deal with this, but they all feel a little hacky.
    • We could modify the type signatures involved to get the original name to the classVariable function so that it can look it up directly. Because of how many functions call it though, this would not be easy, and would look a bit messy.
    • I might be able to change how I implement constants so that while it renders in all uppercase, it still says its name is the lowercase. I should look into that some more.

Is this still a change we want to make? It'll take a little bit of work, but it seems doable.

@B-rando1
Copy link
Collaborator

I thought about it some more and decided to try a bit more. It actually was quite a bit easier to make the necessary changes (see them in #3858) than I expected. I think on the whole this change is for the better.

Some things to note:

  • While the Python renderer does check to see if making a constant's name uppercase would conflict with any existing variable names, it does not check against variable names declared after the constant. So it's still not perfectly safe. With that said, I think we do have all the tools we need to put in more rigorous checking for name conflicts, which would enable greater safety.
  • I'm not sure how I feel about having staticVar hold the original variableName while making the Doc that it uses to render itself all uppercase. It's probably mostly harmless, but there's always a chance of things going wrong down the road when we start having parts of the program lie to other parts. Again, it might be harmless, but it can be hard to tell.

@JacquesCarette
Copy link
Owner

So how close are we to being able to close this?

@samm82
Copy link
Collaborator

samm82 commented Jul 21, 2024

While the Python renderer does check to see if making a constant's name uppercase would conflict with any existing variable names, it does not check against variable names declared after the constant. So it's still not perfectly safe. With that said, I think we do have all the tools we need to put in more rigorous checking for name conflicts, which would enable greater safety.

@balacij's comment on #806 suggested that a Stateful renderer would help in generating references in the order used in the document. I wonder if a Stateful renderer could help here as well to be able to deal with potential variable conflicts across ALL variables 🤔

@B-rando1
Copy link
Collaborator

@JacquesCarette the original purpose of this issue (capitalizing constants in Python) will be fulfilled once #3858 is passed. There may be a few things we want to change with the fix (i.e. better name conflict detection, throwing warnings rather than errors, and enabling GOOL to know how it renamed a variable without making assumptions). The fix in its current form might be good enough for now, though - it works with everything we have so far.

As for the 'different kinds of constants' discussion we dug up (summarized well by @balacij's comment above), there is still a lot of work to be done. That discussion is enough for its own issue though, so I can open one in the next day or two.

Should I make #3858 close this issue? There are a few 'rough edges' to the fix that should be improved upon at some point. It might be better to make separate issue(s) for those though, as they seem to have applications beyond this issue.

@JacquesCarette
Copy link
Owner

Yes, please make #3858 close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
artifacts newcomers Good first issue to work on!
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants