Releases: diku-dk/futhark

0.18.3

12 Nov 08:24

Fixed

  • Python backend now disables spurious NumPy overflow warnings for
    both library and binary code (#1180).

  • Undid deadlocking over-synchronisation for freeing opaque objects.

  • futhark datacmp now handles bad input files better (#1181).

0.18.2

07 Nov 22:40

Added

  • The GPU loop tiler can now handle loops where only a subset of the
    input arrays are tiled. Matrix-vector multiplication is one
    important program where this helps (#1145).

  • The number of threads used by the multicore backend is now
    configurable (--num-threads and
    futhark_context_config_set_num_threads()). (#1162)
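
To illustrate why matrix-vector multiplication benefits from tiling only a subset of the inputs, here is a minimal sequential sketch in Python. It is not the code the compiler generates; the function name `matvec_tiled` and the `tile` parameter are hypothetical and chosen only for illustration. The point is that the vector is reused by every row, whereas each matrix element is read exactly once.

```python
def matvec_tiled(matrix, vector, tile=4):
    """Matrix-vector product that walks the shared vector in tiles.

    Illustrative only: on a GPU the vector tile would be staged in fast
    local memory, while matrix elements are streamed without tiling.
    """
    result = [0] * len(matrix)
    for start in range(0, len(vector), tile):
        # This slice of the vector is reused by every row below,
        # so it is the input worth tiling.
        vec_tile = vector[start:start + tile]
        for r, row in enumerate(matrix):
            # Each matrix element is touched exactly once overall,
            # so tiling the matrix would gain nothing.
            for k, v in enumerate(vec_tile):
                result[r] += row[start + k] * v
    return result


if __name__ == "__main__":
    print(matvec_tiled([[1, 2, 3], [4, 5, 6]], [1, 0, 2], tile=2))
```

The tile size does not affect the result, only the order in which the vector is consumed.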

Fixed

  • PyOpenCL backend would mistakenly still treat entry point
    argument sizes as 32 bit.

  • Warnings are now reported even for programs with type errors.

  • Multicore backend now works properly for very large iteration
    spaces.

  • A few internal generated functions (init_constants(),
    free_constants()) were mistakenly declared non-static.

  • Process exit code is now nonzero when compiler bugs and
    limitations are encountered.

  • Multicore backend crashed on reduce_by_index with nonempty target
    and empty input.

  • Fixed a flattening issue for certain complex map nestings
    (#1168).

  • Made API function futhark_context_clear_caches() thread safe
    (#1169).

  • API functions for freeing opaque objects are now thread-safe
    (#1169).

  • Tools such as futhark dataset no longer crash with an internal
    error if writing to a broken pipe (but they will return a nonzero
    exit code).

  • Defunctionalisation had a name shadowing issue that would crop up
    for programs making very advanced use of functional
    representations (#1174).

  • Type checker erroneously permitted pattern-matching on string
    literals (this would fail later in the compiler).

  • New coverage checker for pattern matching, which is more correct.
    However, it may not provide quite as nice counter-examples
    (#1134).

  • Fix rare internalisation error (#1177).
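
The reduce_by_index crash listed above concerned a nonempty target combined with an empty input. As a rough sequential reference for what the result should be in that case, here is a sketch in Python. It mirrors the argument order of the Futhark function, but it is an illustration of the semantics as commonly described (out-of-bounds indices are ignored), not the multicore backend's implementation.

```python
import operator


def reduce_by_index(dest, op, ne, indices, values):
    """Sequential reference sketch of reduce_by_index.

    `ne` (the neutral element) is unused here; a parallel implementation
    needs it to initialise partial results, a sequential one does not.
    """
    result = list(dest)  # do not mutate the caller's target
    for i, v in zip(indices, values):
        # Indices outside the target are skipped.
        if 0 <= i < len(result):
            result[i] = op(result[i], v)
    return result


if __name__ == "__main__":
    # Nonempty target, empty input: the target comes back unchanged.
    print(reduce_by_index([1, 2, 3], operator.add, 0, [], []))
    print(reduce_by_index([0, 0, 0], operator.add, 0, [0, 2, 2, 5], [1, 2, 3, 4]))
```

With empty `indices` and `values` the loop body never runs, so the result is simply a copy of the target.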

0.16.5

04 Nov 21:25

Fixed

  • Made API function futhark_context_clear_caches() thread safe
    (#1169).

  • API functions for freeing opaque objects are now thread-safe
    (#1169).

0.18.1

08 Oct 14:32

Added

  • Experimental multi-threaded CPU backend, multicore.

Changed

  • All sizes are now of type i64. This has wide-ranging
    implications and most programs will need to be updated (#134).

0.17.3

06 Oct 11:59

Added

  • Improved parallelisation of futhark bench compilation.

Fixed

  • Dataset generation for test programs now uses the right futhark
    executable (#1133).

  • Really fix NaN comparisons in interpreter (#1070, again).

  • Fix entry points with a parameter that is a sum type where
    multiple constructors contain arrays of the same statically known
    size.

  • Fix in monomorphisation of types with constant sizes.

  • Fix in in-place lowering (#1142).

  • Fix tiling inside multiple nested loops (#1143).

0.17.2

19 Sep 10:14

Added

  • Obscure loop optimisation (#1110).

  • Faster matrix transposition in C backend.

  • Library code generated with CUDA backend can now be called from
    multiple threads.

  • Better optimisation of concatenations of array literals and
    replicates.

  • Array creation C API functions now accept const pointers.

  • Arrays can now be indexed (but not sliced) with any signed integer
    type (#1122).

  • Added --list-devices command to OpenCL binaries (#1131).

  • Added --help command to C, CUDA and OpenCL binaries (#1131).

Removed

  • The integer modules no longer contain iota and replicate
    functions. The top-level ones still exist.

  • The size module type has been removed from the prelude.

Changed

  • Range literals may no longer be produced from unsigned integers.

Fixed

  • Entry points with names that are not valid C (or Python)
    identifiers are now pointed out as problematic, rather than
    generating invalid C code.

  • Exotic tiling bug (#1112).

  • Missing synchronisation for in-place updates at group level.

  • Fixed (in a hacky way) an issue where reduce_by_index would use
    too much local memory on AMD GPUs when using the OpenCL backend.

0.16.4

27 Aug 14:35

Added

  • #[unroll] attribute.

  • Better error message when writing a[i][j] (#1095).

  • Better error message when missing "in" (#1091).

Fixed

  • Fixed compiler crash on certain patterns of nested parallelism
    (#1068, #1069).

  • NaN comparisons are now done properly in interpreter (#1070).

  • Fix incorrect movement of array indexing into the branches of
    ifs (#1073).

  • Fix defunctorisation bug (#1088).

  • Fix issue where loop tiling might generate out-of-bounds reads
    (#1094).

  • Scans of empty arrays no longer result in out-of-bounds memory
    reads.

  • Fix yet another defunctionalisation bug due to missing
    eta-expansion (#1100).
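
For the NaN comparison fix listed above, the following Python sketch shows the IEEE 754 behaviour, assuming that "done properly" means following those rules: every ordered comparison and equality test involving NaN is false, and only inequality is true. The helper name `compare_all` is hypothetical.

```python
def compare_all(a, b):
    """Return the result of every comparison operator applied to a and b."""
    return {
        "<": a < b,
        "<=": a <= b,
        ">": a > b,
        ">=": a >= b,
        "==": a == b,
        "!=": a != b,
    }


if __name__ == "__main__":
    nan = float("nan")
    for op, outcome in compare_all(nan, 1.0).items():
        print(f"nan {op} 1.0 is {outcome}")
```

Note that NaN is not even equal to itself, which is why `x != x` is a common NaN test.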

0.16.3

30 Jul 09:44

Added

  • random input blocks for futhark test and futhark bench now
    support floating-point literals, which must always have either an
    f32 or f64 suffix.

  • The cuda backend now supports the -d option for executables.

  • The integer modules now contain a ctz function for counting
    trailing zeroes.
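
As a rough illustration of what counting trailing zeroes means, here is a small Python sketch. The `bits` parameter is hypothetical and stands in for the width of the integer type; returning the bit width for an input of zero is an assumption made here, not something the release note states.

```python
def ctz(x, bits=32):
    """Count trailing zero bits of x, viewed as a `bits`-wide integer."""
    x &= (1 << bits) - 1  # two's complement view of negative inputs
    if x == 0:
        return bits  # assumption: all bits are zero, so report the width
    # x & -x isolates the lowest set bit; its position is the answer.
    return (x & -x).bit_length() - 1


if __name__ == "__main__":
    for value in (1, 8, 12, 0):
        print(value, ctz(value))
```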

Fixed

  • The pyopencl backend now works with OpenCL devices that have
    multiple types (most importantly, oclgrind).

  • Fix barrier divergence when generating code for group-level
    collective copies in GPU backend.

  • Intra-group flattening now looks properly inside of branches.

  • Intra-group flattened code versions are no longer used when the
    resulting workgroups would have fewer than 32 threads (with
    default thresholds anyway) (#1064).

0.16.2

15 Jul 12:54

Added

  • futhark autotune: added --pass-option.

Fixed

  • futhark bench: progress bar now correct when number of runs is
    less than 10 (#1050).

  • Aliases of arguments passed for consuming parameters are now
    properly checked (#1053).

  • When using a GPU backend, errors are now properly cleared.
    Previously, once e.g. an out-of-bounds error had occurred, all
    future operations would fail with the same error.

  • Size-coercing a transposed array no longer leads to invalid code
    generation (#1054).

0.16.1

07 Jul 09:53

Added

  • Incremental flattening is now performed by default. Use
    attributes to constrain and direct the flattening if you have
    exotic needs. This will likely need further iteration and
    refinement.

  • Better code generation for reverse (and the equivalent explicit
    slice).

  • futhark bench now prints progress bars.

  • The cuda backend now supports profiling similar to that of the
    opencl option, although it is likely slightly less accurate in
    the presence of concurrent operations.

  • A preprocessor macro FUTHARK_BACKEND_foo is now defined in
    generated header files, where foo is the name of the backend
    used.

  • Non-inlined functions (via #[noinline]) are now supported in GPU
    code, but only for functions that exclusively operate on
    scalars.

  • futhark repl now accepts a command line argument to load a
    program initially.

  • Attributes are now also permitted on declarations and specs.

  • futhark repl now has a :nanbreak command (#839).

Removed

  • The C# backend has been removed (#984).

  • The unsafe keyword has been removed. Use #[unsafe] instead.

Changed

  • Out-of-bounds literals are now an error rather than a warning.

  • Type ascriptions on entry points now always result in opaque types
    when the underlying concrete type is a tuple (#1048).

Fixed

  • Fix bug in slice simplification (#992).

  • Fixed a type checker bug for tracking the aliases of closures
    (#995).

  • Fixed handling of dumb terminals in futhark test (#1000).

  • Fixed exotic monomorphisation case involving lifted type
    parameters instantiated with functions that take named parameters
    (#1026).

  • Further tightening of the causality restriction (#1042).

  • Fixed alias tracking for right-operand operator sections (#1043).