Version 19.12.0 (December 31, 2019)

@streichler streichler released this 01 Jan 00:32
  • Build
    • Both builds (Make and CMake) now generate legion_defines.h and
      realm_defines.h. By default these headers are generated in
      the source directory (Make) or build directory (CMake). This
      means that languages such as Regent and Python no longer
      require MAX_DIM to be specified explicitly.
  • Regent
    • Support for CUDA 10
    • Support for field polymorphic tasks
    • Substantially improved the generality of the index launch
      optimization. Task arguments of the form p[i+k] may now be
      used, where k is a variable defined outside of the loop
    • Added the flag -foverride-demand-index-launch, which can be used
      to force loops to be index launched in cases where the compiler
      cannot prove the disjointness of read-write region
      arguments
    • Added reductions for complex64
    • The scripts install.py and setup_env.py now use CMake to
      build Terra by default, which should improve portability on
      most machines
    • The behavior of -fcuda 1 has changed: this flag will now issue
      an error if CUDA cannot be enabled (e.g. because the build
      does not support CUDA, or because the machine has no
      GPUs). Omitting this flag will now enable CUDA if it is
      available (and will not error if it is not available).
      The behavior of -fopenmp 1 has changed similarly.
    • The behavior of __demand(__cuda) has changed. This will now
      issue an error if a loop is not eligible for the CUDA
      transformation, regardless of whether CUDA is actually
      available on the current machine or not. The behavior of
      __demand(__openmp) has changed similarly.
    • The annotation __allow(__cuda) is now accepted. It permits
      (but does not require) tasks to be optimized with CUDA.
    • Experimental support for 2D kernel launch in the CUDA code generation
  • Python
    • Added support for copies
    • Copies and fills now support multiple fields
    • Tasks (including index launches) now support setting the mapper
      ID and tag
  • Legion
    • A major overhaul of the Legion physical analysis to use an
      approach based on bounding volume hierarchies. The change is
      not visible to users, but will likely impact performance. Most
      programs will get faster; programs that create many partitions
      frequently on the fly may get slower. The latter case will be fixed
      in an upcoming release.
    • Added support for indirect copy operations, such as gather and
      scatter, to the existing copy launchers
  • Realm
    • Event::subscribe allows polling via Event::has_triggered to
      (eventually) succeed
    • Addition of CompletionQueue objects that allow multiple unordered
      Event triggers to be efficiently handled by a single consumer
    • Support for omp_get_level, omp_in_parallel, and
      omp_set_num_threads in tasks running on OpenMP processors
    • Support for unstructured scatter and/or gather in copies. (Handling
      structured cases as well as fills/reductions remains a work in
      progress.)
    • Removed all calls to Event::wait from inside other Realm API calls.
      Applications now must make sure that index spaces and instance
      metadata are valid before use. For details, see: #465
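
The new -fcuda semantics described under Regent can be read as a small decision table. The Python sketch below only models that table for illustration; it is not Regent's implementation. The names resolve_cuda and CudaUnavailableError are hypothetical, and the treatment of -fcuda 0 as "disable CUDA" is an assumption, since these notes only describe the omitted and 1 cases. Per the notes, -fopenmp follows the same pattern.

```python
from typing import Optional


class CudaUnavailableError(RuntimeError):
    """Hypothetical error type, used only in this sketch."""


def resolve_cuda(flag: Optional[int], cuda_available: bool) -> bool:
    """Return whether CUDA code generation would be enabled.

    flag is None when -fcuda is omitted, otherwise the integer passed
    to it. cuda_available stands in for "the build supports CUDA and
    the machine has GPUs".
    """
    if flag is None:
        # Flag omitted: enable CUDA when available, never error.
        return cuda_available
    if flag == 1:
        # Explicit request: an unavailable CUDA is now an error.
        if not cuda_available:
            raise CudaUnavailableError("-fcuda 1 given but CUDA cannot be enabled")
        return True
    if flag == 0:
        # Assumption: 0 disables CUDA (not stated in these notes).
        return False
    raise ValueError(f"unexpected -fcuda value: {flag!r}")
```

Under this model, omitting the flag on a machine without GPUs quietly yields a CPU-only build, while passing -fcuda 1 on the same machine fails early instead of silently dropping GPU support.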