
HPC GAP Plan for migration

Olexandr Konovalov edited this page Mar 7, 2023 · 10 revisions

Based on the discussion at Unifying GAP4 and HPC-GAP, we are now planning concrete actions to achieve the unification. This page was started during the First joint GAP-Sage Days and should be updated with progress and new tasks as the work continues.

People currently involved in this:

  • Olexandr Konovalov
  • Chris Jefferson
  • Max Horn
  • Markus Pfeiffer
  • Steve Linton
  • Reimer Behrends

If you would like to help, too, contact us (e.g. via the GAP mailing list).

Next meeting

There will be another Skype meeting, where we will discuss the progress made so far, and further actions. Tentative date:

Friday, March 11, 2016, at 13:30 CET (12:30 in St Andrews).

Relevant links

Tasks

  • Done: Move HPC-GAP specific files in src and lib to src/hpc respectively lib/hpc subdirectories

    • Who: Markus
    • Notes: A goal is for these directories to have identical content on hpcgap-default and master, with the help of the HPCGAP #define
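As a rough illustration of how a single source text could serve both branches, the sketch below guards the HPC-GAP specific parts with the HPCGAP macro. The helpers LockCounter, UnlockCounter and IncrementCounter are hypothetical and exist only for this example; they are not part of the GAP kernel.

```c
#include <assert.h>

static long counter = 0;

#ifdef HPCGAP
/* Hypothetical placeholders: a real build would acquire and release a
 * lock here. They are only compiled when HPCGAP is defined. */
static void LockCounter(void)   { /* acquire lock */ }
static void UnlockCounter(void) { /* release lock */ }
#endif

/* Shared code: the source text is identical on both branches, only the
 * preprocessor decides which lines are compiled. */
static long IncrementCounter(void)
{
    long result;
#ifdef HPCGAP
    LockCounter();
#endif
    result = ++counter;
    assert(result > 0);   /* sanity check for this sketch */
#ifdef HPCGAP
    UnlockCounter();
#endif
    return result;
}
```

Compiled without -DHPCGAP, the locking calls vanish entirely, so the classic build pays no cost for them.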
  • Done: Refactor garbage collection code (GASMAN, weak pointers, etc.)

    • Who: Chris
    • Notes: The idea is to undo some of the already done "merging" in e.g. gasman.c, as that made the already difficult code even harder to read. Instead, there probably will be two (or more) files for the different GC implementations
  • In progress: Port the struct containing all globals from hpcgap-default to master

    • Who: Markus
    • Notes:
      • This could lead to refactoring the interpreter, so that its state is kept in a separate global struct, which then could be passed between the interpreter functions as an argument. That would be an important step towards the ability to have multiple interpreters running.
      • Constraint: If all globals are in a single struct, then all types they use need to be known before the struct is defined. This means we may end up having additional #include interdependencies... But well, so what.
      • See pull request #579
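The idea of passing the interpreter state as an argument can be sketched as follows. This is a toy model under stated assumptions: the struct InterpreterState, its fields and the functions PushValue, PopValue and InterpSum are all hypothetical and far simpler than anything in the real kernel.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SIZE 16   /* arbitrary limit chosen for this sketch */

/* Hypothetical, heavily simplified interpreter state. */
typedef struct InterpreterState {
    long   stack[STACK_SIZE];   /* value stack */
    size_t depth;               /* number of values on the stack */
} InterpreterState;

static void PushValue(InterpreterState *st, long v)
{
    assert(st->depth < STACK_SIZE);
    st->stack[st->depth++] = v;
}

static long PopValue(InterpreterState *st)
{
    assert(st->depth > 0);
    return st->stack[--st->depth];
}

/* Replaces the two topmost values by their sum. All state is reached
 * through the argument, never through a global variable. */
static void InterpSum(InterpreterState *st)
{
    long b = PopValue(st);
    long a = PopValue(st);
    PushValue(st, a + b);
}
```

Because no function touches a global, two InterpreterState values can be used side by side without interfering, which is the property needed for running multiple interpreters.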
  • Identify and document remaining differences

    • Who: everybody
    • Notes:
      • We need to know what differences remain, and how to deal with them.
      • Trivial ones can of course be unified immediately, but by now, in the kernel at least, mostly the "non-obvious" ones are left. Once we know what they are, we can discuss for each how to get rid of it, and who will do it.
    • I went over the src/ directory using the tool meld and kept some notes of differences that I spotted. I am quite optimistic that we'll be able to sync up the kernel sources soon, in particular once the build system is sorted out. I kept my notes here.
    • Subtasks:
      • Deal with gvars.c differences
        • Who: ???
        • Notes: The HPC-GAP code is quite different from the classic code. The easiest solution would be to always use the HPC-GAP code. For that, we need to figure out whether it is slower and/or has a memory overhead. Somebody could create a branch based on master and transplant the HPC-GAP gvars.c implementation there, then conduct some performance experiments to compare the two branches.
      • Deal with object traversal differences
        • Who: ???
        • Notes: It seems that this is already mostly restricted to files in src/hpc; the only file affected outside of it seems to be objects.c.
      • Reduce compiler differences
        • Who: ???
        • Notes: The compiler.c code is slightly different in HPC-GAP, as it deals with atomic objects. To unify that, we probably can simply copy the changes from HPC-GAP, but augment them to also emit #ifdef HPCGAP / #endif lines.
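To make the last idea concrete: the compiler would wrap each HPC-GAP specific line of generated code in a guard. The function below is a minimal sketch of such an emitter. EmitGuarded and its signature are hypothetical and do not reflect the actual interface of compiler.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical emitter: writes stmt into out, wrapped in an HPCGAP guard.
 * Returns the number of characters written (excluding the terminating
 * NUL), or -1 if the buffer is too small. */
static int EmitGuarded(char *out, size_t size, const char *stmt)
{
    assert(out != NULL && stmt != NULL);
    int n = snprintf(out, size, "#ifdef HPCGAP\n%s\n#endif\n", stmt);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The generated C file then compiles on both branches, and the guarded statements only take effect when HPCGAP is defined.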
  • In progress: Build system

    • Who: Max
    • Notes:
      • The first step will be to properly document all our requirements for this.
      • Details on this can be found here: GAP buildsystem.
      • The next step then will be to implement this on both branches (ideally, that would mean: implement it on one of the two branches, then just copy it to the other, with some final tweaks)
  • Design and implementation of a "caching API"

    • Who: ???
    • Notes: This is part of the goal to avoid atomic in the library as much as possible. For some ideas on the new API, see Unifying GAP4 and HPC-GAP. We'll need to do roughly this:
      1. implement a first draft of the caching API,
      2. write test cases for it,
      3. select a few places in the kernel to use it,
      4. write test cases for each of these,
      5. convert the places to actually use the caching API (naturally, the tests should still pass),
      6. determine how far this affects performance,
      7. based on all the experience from the previous points, revise the caching API as needed,
      8. convert more code to use it, repeat,
      9. possibly implement a C version of the API for performance.
    • We do not want to be fancy here (so, in particular, no attempts to convert things to use hash tables): the new code should in each instance use the same caching strategy as it used before, just encapsulated in calls to NewCache/LookupCache. This means we may end up with multiple variants of these, one for each existing caching strategy. (Of course we can try to partially hide that away behind method dispatch, but we need to keep performance in mind.)
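A first draft of such an API might look like the sketch below. Only the names NewCache and LookupCache come from this page; the signatures, the Cache struct, the fixed size and the round-robin replacement are assumptions made for illustration. The strategy shown is a plain linear scan over a small array, standing in for one existing caching strategy. It is not thread-safe; an HPC-GAP variant would have to hide the necessary synchronisation behind the same two calls.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 4   /* arbitrary size chosen for this sketch */

typedef struct Cache {
    long   keys[CACHE_SLOTS];
    long   values[CACHE_SLOTS];
    size_t used;   /* number of occupied slots */
    size_t next;   /* slot to overwrite once the cache is full */
} Cache;

/* Assumed signature: returns an empty cache. */
static Cache NewCache(void)
{
    Cache c = { {0}, {0}, 0, 0 };
    return c;
}

/* Assumed signature: returns the cached value for key. On a miss, the
 * value is computed via compute(key) and stored, overwriting the oldest
 * slot in round-robin order when the cache is full. */
static long LookupCache(Cache *c, long key, long (*compute)(long))
{
    assert(c != NULL && compute != NULL);
    for (size_t i = 0; i < c->used; i++) {
        if (c->keys[i] == key)
            return c->values[i];
    }
    long   value = compute(key);
    size_t slot  = (c->used < CACHE_SLOTS) ? c->used++ : c->next;
    c->keys[slot]   = key;
    c->values[slot] = value;
    c->next = (slot + 1) % CACHE_SLOTS;
    return value;
}

/* Example computation with a call counter, so cache hits are observable. */
static int squareCalls = 0;
static long Square(long x)
{
    squareCalls++;
    return x * x;
}
```

Calling LookupCache twice with the same key runs Square only once. Code converted to this style no longer needs to know how the cache is stored, which is what would allow the storage to differ between GAP and HPC-GAP.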
  • Backquotes operator

    • Who: ???
    • Notes:
      • Some people dislike the backquote (backtick) operator.
      • It also has the problem that it actually modifies its argument, which some people feel is very surprising and can be misused.
      • Originally, it was mainly intended for literals. So, one idea would be to restrict it to string literals, perhaps taking an idea from the Python / C++ / ... playbook, which allow string literal prefixes. So, we could e.g. allow writing i"blah" to create an immutable string literal. This would then also open the door for future prefixes, e.g. u"blah" to designate a Unicode string (whatever that is supposed to be).
      • However, we also use this on lists and records. So the next idea is to restrict it to all kinds of literals, defined recursively. But that opens up a can of worms. Should e.g. [ 1+2 ] be backquoteable?
      • So, another variation is to restrict backquote for list and record literals to those whose contents are already immutable. This can only be determined at run time (not during parsing), but is much more flexible. On the downside, for nested structures, one would need to use multiple backquotes, e.g.:
        mat1 := `[ [1,2], [3,4]];    # runtime error
        mat2 := `[ `[1,2], `[3,4]];  # OK
        list := `[ mat2 ];           # also OK
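The run-time rule in the last variation can be modelled in a few lines of C. This is a toy object model under stated assumptions: the Obj struct and the Backquote function are hypothetical and unrelated to the kernel's actual object representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy object: a list with a mutability flag. An object
 * without entries stands for an atom such as an integer. */
typedef struct Obj {
    bool         isMutable;
    struct Obj **elems;
    size_t       len;
} Obj;

/* Models the restricted backquote: makes obj immutable only if all of
 * its direct entries are already immutable. Returns false (the "runtime
 * error" case) and leaves obj unchanged otherwise. */
static bool Backquote(Obj *obj)
{
    assert(obj != NULL);
    for (size_t i = 0; i < obj->len; i++) {
        if (obj->elems[i]->isMutable)
            return false;
    }
    obj->isMutable = false;
    return true;
}
```

Applied to a matrix whose rows are still mutable, Backquote fails, mirroring the mat1 case; once each row has been backquoted, the outer call succeeds, mirroring mat2.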
        
  • Investigate places in HPC-GAP where objects were turned immutable compared to classic GAP

    • Who: ???
    • Notes:
      • While it is helpful for HPC-GAP when things are immutable, and this in fact makes sense in many places (e.g. prevents users from accidentally shooting themselves in any number of feet), one needs to be careful not to inconvenience people by making otherwise legitimate changes harder.
        • of course nobody disagrees with this...
        • ... but also nobody knows about examples where this hypothetical problem occurs. That is not to say it doesn't happen.
        • If anybody finds problems here, they should be treated like any other issue.
  • CopyVec... vs ConvertVec..., etc.

    • Who: ???
    • Notes:
      • There is the whole in-place conversion business, with ConvertToVectorRep etc., which needs to be rethought in HPC-GAP, and which causes considerable diffs...
      • In some cases, this may be the right solution, but in others, it may end up wasting memory.
      • Also, we may need additional CopyFOO() functions
  • Remove diffs in Conway data

    • Who: Olexandr
    • Notes: We can probably get rid of most of these diffs by refactoring the code (possibly getting rid of PREPARE_CONWAY_DATA)
  • Convert demo directory in hpcgap-default branch to proper tests

    • Who: Olexandr