Added
- Using definitions from the intrinsic module outside the prelude now results in a warning.
- reduce_by_index with vectorised operators (e.g. map2 (+)) is orders of magnitude faster than before.
- Executables generated with the pyopencl backend now support the options --default-tile-size, --default-group-size, --default-num-groups, --default-threshold, and --size.
- Executables generated with the c and opencl backends now print a help text if run with invalid options. The py and pyopencl backends already did this.
- Generated executables now support a --tuning flag for passing many tuned sizes in a file.
- Executables generated with the cuda backend now take an --nvrtc-option option.
- Executables generated with the opencl backend now take a --build-option option.
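
As a rough illustration of what the reduce_by_index entry above refers to, the following is a sequential Python model of the operation with a vectorised operator. It is only a sketch of the semantics, not Futhark code and not how the compiler implements it: the names vec_add and buckets are made up for the example, the neutral-element argument is left out because a sequential loop does not need it, and the destination is copied rather than consumed in place.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def reduce_by_index(dest: List[T], op: Callable[[T, T], T],
                    indices: Sequence[int], values: Sequence[T]) -> List[T]:
    """Sequential model: combine each value into the bucket named by its index."""
    out = list(dest)  # this sketch copies; Futhark consumes dest instead
    for i, v in zip(indices, values):
        if 0 <= i < len(out):  # out-of-bounds indices are ignored
            out[i] = op(out[i], v)
    return out


def vec_add(xs: Sequence[int], ys: Sequence[int]) -> List[int]:
    """Hypothetical counterpart of map2 (+): element-wise sum of two rows."""
    return [x + y for x, y in zip(xs, ys)]


buckets = reduce_by_index(
    dest=[[0, 0], [0, 0], [0, 0]],
    op=vec_add,
    indices=[0, 2, 0, 5],  # index 5 is out of range and is skipped
    values=[[1, 2], [3, 4], [10, 20], [7, 7]],
)
print(buckets)  # [[11, 22], [0, 0], [3, 4]]
```

Each bucket here is itself an array, so the operator has to combine whole rows at a time. That is the "vectorised operator" case this release speeds up.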
Removed
- The old futhark-* executables have been removed.
Changed
- If an array is passed for a function parameter of a polymorphic type, all arrays passed for parameters of that type must have the same shape. For example, given a function

      let pair 't (x: t) (y: t) = (x, y)

  the application pair [1] [2,3] will now fail at run-time.
- futhark test now numbers un-named data sets from 1 rather than 0. This only affects the text output and the generated JSON files, and fits the tuple element ordering in Futhark.
- String literals are now of type []u8 and contain UTF-8 encoded bytes.
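
Two of the changes above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not Futhark code: the helper shape is hypothetical and assumes rectangular nested lists, pair only imitates the run-time shape check described for polymorphic parameters, and the error message is invented for the example.

```python
from typing import Any, Tuple


def shape(value: Any) -> Tuple[int, ...]:
    """Shape of a nested list (assumed rectangular); scalars have shape ()."""
    if isinstance(value, list):
        inner = shape(value[0]) if value else ()
        return (len(value),) + inner
    return ()


def pair(x: Any, y: Any) -> Tuple[Any, Any]:
    """Model of `let pair 't (x: t) (y: t) = (x, y)` with the shape check."""
    if shape(x) != shape(y):
        raise ValueError(f"shape mismatch: {shape(x)} vs {shape(y)}")
    return (x, y)


print(pair([1], [2]))  # ([1], [2])
try:
    pair([1], [2, 3])
except ValueError as err:
    print("rejected:", err)  # rejected: shape mismatch: (1,) vs (2,)

# String literals as UTF-8 bytes: the length counts bytes, not characters.
literal = list("n\u00e9".encode("utf-8"))
print(literal)  # [110, 195, 169]
```

The second part shows why the string change matters for array sizes: a two-character literal containing one non-ASCII character occupies three elements when stored as UTF-8 bytes.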
Fixed
- A significant problematic interaction between empty arrays and inner size declarations has been fixed (#714). This follows a range of lesser empty-array fixes from 0.9.1.
- futhark datacmp now prints to stdout, not stderr.
- Fixed a major potential out-of-bounds access when sequentialising reduce_by_index (in most cases the bug was hidden by subsequent C compiler optimisations).
- The result of an anonymous function is now also forbidden from aliasing a global variable, just as with named functions.
- Parallel scans now work correctly when using a CPU OpenCL implementation.
- reduce_by_index was broken on newer NVIDIA GPUs when using fancy operators. This has been fixed.