CHANGES IN R 4.1.0
FUTURE DIRECTIONS:
- It is planned that the 4.1.x series will be the last to support
32-bit Windows, with production of binary packages for that
series continuing until early 2023.
SIGNIFICANT USER-VISIBLE CHANGES:
- Data set esoph in package datasets now provides the correct
numbers of controls; previously it had the numbers of cases added
to these. (Reported by Alexander Fowler in PR#17964.)
NEW FEATURES:
- www.omegahat.net is no longer one of the repositories known by
default to setRepositories(). (Nowadays it only provides source
packages and is often unavailable.)
- Function package_dependencies() (in package tools) can now use
different dependency types for direct and recursive dependencies.
- The checking of the size of tarball in R CMD check --as-cran <pkg>
may be tweaked via the new environment variable
_R_CHECK_CRAN_INCOMING_TARBALL_THRESHOLD_, as suggested in
PR#17777 by Jan Gorecki.
- Using c() to combine a factor with other factors now gives a
factor, an ordered factor when combining ordered factors with
identical levels.
- apply() gains a simplify argument to allow disabling of
simplification of results.
- The format() method for class "ftable" gets a new option justify.
(Suggested by Thomas Soeiro.)
- New ...names() utility. (Proposed by Neal Fultz in PR#17705.)
- type.convert() now warns when its as.is argument is not
specified, as the help file always said it should. In that
case, the default is changed to TRUE in line with its change in
read.table() (related to stringsAsFactors) in R 4.0.0.
- When printing list arrays, classed objects are now shown via
their format() value if this is a short enough character string,
or by giving the first elements of their class vector and their
length.
- capabilities() gets new entry "Rprof" which is TRUE when R has
been configured with the equivalent of --enable-R-profiling (as
it is by default). (Related to Michael Orlitzky's report
PR#17836.)
- str(xS4) now also shows extraneous attributes of an S4 object
xS4.
- Rudimentary support for vi-style tags in rtags() and R CMD rtags
has been added. (Based on a patch from Neal Fultz in PR#17214.)
- checkRdContents() is now exported from tools; it and also
checkDocFiles() have a new option chkInternal allowing Rd files
marked with keyword "internal" to be checked as well. The latter
can be activated for R CMD check via environment variable
_R_CHECK_RD_INTERNAL_TOO_.
- New functions numToBits() and numToInts() extend the raw
conversion utilities to (double precision) numeric.
- Functions URLencode() and URLdecode() in package utils now work
on vectors of URIs. (Based on patch from Bob Rudis submitted
with PR#17873.)
- path.expand() can expand ~user on most Unix-alikes even when
readline is not in use. It tries harder to expand ~, for example
should environment variable HOME be unset.
- For HTML help (both dynamic and static), Rd file links to help
pages in external packages are now treated as references to
topics rather than file names, and fall back to a file link only
if the topic is not found in the target package. The earlier rule
which prioritized file names over topics can be restored by
setting the environment variable _R_HELP_LINKS_TO_TOPICS_ to a
false value.
- c() now removes NULL arguments before dispatching to methods,
thus simplifying the implementation of c() methods, but for
back compatibility keeps NULL when it is the first argument.
(From a report and patch proposal by Lionel Henry in PR#17900.)
- Vectorize()'s result function's environment no longer keeps
unneeded objects.
- Function ...elt() now propagates visibility consistently with
..n. (Thanks to Lionel Henry's PR#17905.)
- capture.output() no longer uses non-standard evaluation to
evaluate its arguments. This makes evaluation of functions like
parent.frame() more consistent. (Thanks to Lionel Henry's
PR#17907.)
- packBits(bits, type="double") now works as inverse of
numToBits(). (Thanks to Bill Dunlap's proposal in PR#17914.)
- curlGetHeaders() has two new arguments, timeout to specify the
timeout for that call (overriding getOption("timeout")) and TLS
to specify the minimum TLS protocol version to be used for
https:// URIs (inter alia providing a means to check for sites
using deprecated TLS versions 1.0 and 1.1).
- For nls(), an optional constant scaleOffset may be added to the
denominator of the relative offset convergence test for cases
where the fit of a model is expected to be exact, thanks to a
proposal by John Nash. nls(*, trace=TRUE) now also shows the
convergence criterion.
- Numeric differentiation via numericDeriv() gets new optional
arguments eps and central, the latter for taking central divided
differences. The latter can be activated for nls() via
nls.control(nDcentral = TRUE).
- nls() now passes the trace and control arguments to getInitial(),
notably for all self-starting models, so these can also be fit in
zero-noise situations via a scaleOffset. For this reason, the
initial function of a selfStart model must now have ... in its
argument list.
- bquote(splice = TRUE) can now splice expression vectors with
attributes: this makes it possible to splice the result of
parse(keep.source = TRUE). (Report and patch provided by Lionel
Henry in PR#17869.)
- textConnection() gets an optional name argument.
- get(), exists(), and get0() now signal an error if the first
argument has length greater than 1. Previously additional
elements were silently ignored. (Suggested by Antoine Fabri on
R-devel.)
- R now provides a shorthand notation for creating functions, e.g.
\(x) x + 1 is parsed as function(x) x + 1.
- R now provides a simple native forward pipe syntax |>. The
simple form of the forward pipe inserts the left-hand side as the
first argument in the right-hand side call. The pipe
implementation as a syntax transformation was motivated by
suggestions from Jim Hester and Lionel Henry.
- all.equal(f, g) for functions now by default also compares their
environment(.)s, notably via new all.equal method for class
function. Comparison of nls() fits, e.g., may now need
all.equal(m1, m2, check.environment = FALSE).
- .libPaths() gets a new option include.site, which allows the site
library to be excluded. (Thanks to Dario Strbenac's suggestion
and Gabe Becker's PR#18016.)
- Lithuanian translations are now available. (Thanks to Rimantas
Zakauskas.)
- names() now works for DOTSXP objects. On the other hand, in
R-lang, the R language manual, we now warn against relying on the
structure or even existence of such dot-dot-dot objects.
- all.equal() no longer gives an error on DOTSXP objects.
- capabilities("cairo") now applies only to the file-based devices
as it is now possible (if very unusual) to build R with Cairo
support for those but not for X11().
- There is optional support for tracing the progress of
loadNamespace() - see its help.
- (Not Windows.) l10n_info() reports an additional element, the
name of the encoding as reported by the OS (which may differ from
the encoding part (if any) of the result from
Sys.getlocale("LC_CTYPE")).
- New function gregexec() which generalizes regexec() to find all
disjoint matches and all substrings corresponding to
parenthesized subexpressions of the given regular expression.
(Contributed by Brodie Gaslam.)
- New function charClass() in package utils to query the
wide-character classification functions in use (such as
iswprint).
- The names of quantile()'s result no longer depend on the global
getOption("digits"), but quantile() gets a new optional argument
digits = 7 instead.
- grep(), sub(), regexp and variants work considerably faster for
long factors with few levels. (Thanks to Michael Chirico's
PR#18063.)
- Provide grouping of x11() graphics windows within a window
manager such as Gnome or Unity; thanks to a patch by Ivan Krylov
posted to R-devel.
- The split() method for class data.frame now allows the f argument
to be specified as a formula.
- sprintf now warns on arguments unused by the format string.
- New palettes "Rocket" and "Mako" for hcl.colors() (approximating
palettes of the same name from the viridisLite package).
Contributed by Achim Zeileis.
- The base environment and its namespace are now locked (so one can
no longer add bindings to these or remove from these).
- Rterm handling of multi-byte characters has been improved,
allowing use of such characters when supported by the current
locale.
- Rterm now accepts ALT+ +xxxxxxxx sequences to enter Unicode
characters as hex digits.
- Environment variable LC_ALL on Windows now takes precedence over
LC_CTYPE and variables for other supported categories, matching
the POSIX behaviour.
- duplicated() and anyDuplicated() are now optimized for integer
and real vectors that are known to be sorted via the ALTREP
framework. Contributed by Gabriel Becker via PR#17993.
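A few of the entries above can be tried directly at the prompt. The following is a minimal sketch, assuming R >= 4.1.0; all variable and helper names (square, dots_names, and so on) are illustrative and not part of R.

```r
# Sketch only: assumes R >= 4.1.0; every name defined here is illustrative.

# Shorthand function notation and the native forward pipe.
square <- \(x) x^2
piped <- c(1, 2, 3) |> square() |> sum()   # parsed as sum(square(c(1, 2, 3)))

# c() applied to factors now gives a factor.
f <- c(factor("a"), factor("b"))

# apply() with simplify = FALSE returns a list, one element per row here.
m <- matrix(1:4, nrow = 2)
rows <- apply(m, 1, function(r) r * 2, simplify = FALSE)

# ...names() gives the names of the arguments matched by '...'.
dots_names <- function(...) ...names()
nm <- dots_names(a = 1, b = 2)

# numToBits() and packBits(type = "double") round-trip a double.
x <- 3.14
roundtrip <- packBits(numToBits(x), type = "double")

# gregexec() finds all matches together with their capture groups;
# regmatches() turns the result into one character matrix per input string.
s <- "a1 b22"
hits <- regmatches(s, gregexec("([a-z])([0-9]+)", s))[[1]]
```

In this sketch, hits has one column per match, with the full match in the first row and the parenthesized subexpressions in the rows below it.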
GRAPHICS:
- The graphics engine version, R_GE_version, has been bumped to 14
and so packages that provide graphics devices should be
reinstalled.
- Graphics devices should now specify deviceVersion to indicate
what version of the graphics engine they support.
- Graphics devices can now specify deviceClip. If TRUE, the
graphics engine will never perform any clipping of output itself.
The clipping that the graphics engine does perform (for both
canClip = TRUE and canClip = FALSE) has been improved to avoid
producing unnecessary artifacts in clipped output.
- The grid package now allows gpar(fill) to be a linearGradient(),
a radialGradient(), or a pattern(). The viewport(clip) can now
also be a grob, which defines a clipping path, and there is a new
viewport(mask) that can also be a grob, which defines a mask.
These new features are only supported so far on the Cairo-based
graphics devices and on the pdf() device.
- (Not Windows.) A warning is given when a Cairo-based type is
specified for a png(), jpeg(), tiff() or bmp() device but Cairo
is unsupported (so type = "Xlib" is tried instead).
- grSoftVersion() now reports the versions of FreeType and
FontConfig if they are used directly (not via Pango), as is
most commonly done on macOS.
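As an illustration of the new grid fills and clipping paths, here is a minimal sketch, assuming R >= 4.1.0 and the pdf() device (one of the devices stated above to support these features). The output file name, colours and circle radius are arbitrary choices for the example.

```r
# Sketch only: assumes R >= 4.1.0; file name, colours and sizes are arbitrary.
library(grid)

out <- tempfile(fileext = ".pdf")
pdf(out)

# A rectangle filled with a linear gradient instead of a single colour.
grid.rect(gp = gpar(fill = linearGradient(c("white", "steelblue"))))

# A viewport whose clipping region is defined by a grob (a circle here).
pushViewport(viewport(clip = circleGrob(r = 0.4)))
grid.rect(gp = gpar(fill = "tomato"))
popViewport()

invisible(dev.off())
```

On devices that do not support these features, the gradient and clipping path may be ignored rather than drawn.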
C-LEVEL FACILITIES:
- The standalone libRmath math library and R's C API now provide
log1pexp() again as documented, and gain log1mexp().
INSTALLATION on a UNIX-ALIKE:
- configure checks for a program pkgconf if program pkg-config is
not found. These are now only looked for on the path (like
almost all other programs) so if needed specify a full path to
the command in PKG_CONFIG, for example in file config.site.
- C99 function iswblank is required - it was last seen missing ca
2003 so the workaround has been removed.
- There are new configure options --with-internal-iswxxxxx,
--with-internal-towlower and --with-internal-wcwidth which allow
the system functions for wide-character classification,
case-switching and width (wcwidth and wcswidth) to be replaced by
internal ones. The first has long been used on macOS, AIX (and
Windows) but this enables it to be unselected there and selected
for other platforms (it is the new default on Solaris). The
second is new in this version of R and is selected by default on
macOS and Solaris. The third has long been the default and
remains so as it contains customizations for East Asian
languages.
System versions of these functions are often minimally
implemented (sometimes only for ASCII characters) and may not
cover the full range of Unicode points: for example Solaris (and
Windows) only cover the Basic Multilingual Plane.
- Cairo installations without X11 are more likely to be detected by
configure, when the file-based Cairo graphics devices will be
available but not X11(type = "cairo").
- There is a new configure option --with-static-cairo which is the
default on macOS. This should be used when only static cairo
(and where relevant, Pango) libraries are available.
- Cairo-based graphics devices on platforms without Pango but with
FreeType/FontConfig will make use of the latter for font
selection.
LINK-TIME OPTIMIZATION on a UNIX-ALIKE:
- Configuring with flag --enable-lto=R now also uses LTO when
installing the recommended packages.
- R CMD INSTALL and R CMD SHLIB have a new flag --use-LTO to use
LTO when compiling code, for use with R configured with
--enable-lto=R. For R configured with --enable-lto, they have
the new flag --no-use-LTO.
Packages can opt in or out of LTO compilation via a UseLTO
field in the DESCRIPTION file. (As usual this can be overridden
by the command-line flags.)
BUILDING R on Windows:
- For GCC >= 8, FC_LEN_T is defined in config.h and hence character
lengths are passed from C to Fortran in inter alia BLAS and
LAPACK calls.
- There is a new text file src/gnuwin32/README.compilation, which
outlines how C/Fortran code compilation is organized and
documents new features:
  - R can be built with Link-Time Optimization with a suitable
    compiler - doing so with GCC 9.2 showed several
    inconsistencies which have been corrected.
  - There is support for cross-compiling the C and Fortran code
    in R and standard packages on suitable (Linux) platforms.
    This is mainly intended to allow developers to test later
    versions of compilers - for example using GCC 9.2 or 10.x has
    detected issues that GCC 8.3 in Rtools40 does not.
  - There is experimental support for cross-building R packages
    with C, C++ and/or Fortran code.
- The R installer can now be optionally built to support a single
architecture (only 64-bit or only 32-bit).
PACKAGE INSTALLATION:
- The default C++ standard has been changed to C++14 where
available (which it is on all currently checked platforms): if
not, then (as before) C++11 is used if available, otherwise C++
is not supported.
Packages which specify C++11 will still be installed using C++11.
C++14 compilers may give deprecation warnings, most often for
std::random_shuffle (deprecated in C++14 and removed in C++17).
Either specify C++11 (see 'Writing R Extensions') or modernize
the code and if needed specify C++14. The latter has been
supported since R 3.4.0 so the package's DESCRIPTION would need
to include something like
    Depends: R (>= 3.4)
PACKAGE INSTALLATION on Windows:
- R CMD INSTALL and R CMD SHLIB make use of their flag --use-LTO
when the LTO_OPT make macro is set in file etc/${R_ARCH}/Makeconf
or in a personal/site Makevars file. (For details see 'Writing R
Extensions' §4.5.)
This provides a valuable check on code consistency. It does work
with GCC 8.3 as in Rtools40, but that does not detect everything
the CRAN checks with current GCC do.
PACKAGE INSTALLATION on macOS:
- The default personal library directory on builds with
--enable-aqua (including CRAN builds) now differs by CPU type,
one of
    ~/Library/R/x86_64/x.y/library
    ~/Library/R/arm64/x.y/library
This uses the CPU type R (and hence the packages) were built for,
so when a x86_64 build of R is run under Rosetta emulation on an
arm64 Mac, the first is used.
UTILITIES:
- R CMD check can now scan package functions for bogus return
statements, which were possibly intended as return() calls (wish
of PR#17180, patch by Sebastian Meyer). This check can be
activated via the new environment variable
_R_CHECK_BOGUS_RETURN_, true for --as-cran.
- R CMD build omits tarballs and binaries of previous builds from
the top-level package directory. (PR#17828, patch by Sebastian
Meyer.)
- R CMD check now runs sanity checks on the use of LazyData, for
example that a data directory is present and that
LazyDataCompression is not specified without LazyData and has a
documented value. For packages with large LazyData databases
without specifying LazyDataCompression, there is a reference to
the code given in 'Writing R Extensions' §1.1.6 to test the
choice of compression (as in all the CRAN packages tested a
non-default method was preferred).
- R CMD build removes LazyData and LazyDataCompression fields from
the DESCRIPTION file of packages without a data directory.
ENCODING-RELATED CHANGES:
- The parser now treats \Unnnnnnnn escapes larger than the upper
limit for Unicode points (\U10FFFF) as an error as they cannot be
represented by valid UTF-8.
Where such escapes are used for outputting non-printable
(including unassigned) characters, 6 hex digits are used (rather
than 8 with leading zeros). For clarity, braces are used, for
example \U{0effff}.
- The parser now looks for non-ASCII spaces on Solaris (as
previously on most other OSes).
- There are warnings (including from the parser) on the use of
unpaired surrogate Unicode points such as \uD834. (These cannot
be converted to valid UTF-8.)
- Functions nchar(), tolower(), toupper() and chartr() and those
using regular expressions have more support for inputs with a
marked Latin-1 encoding.
- The character-classification functions used (by default) to
replace the system iswxxxxx functions on Windows, macOS and AIX
have been updated to Unicode 13.0.0.
The character-width tables have been updated to include new
assignments in Unicode 13.0.0. This included treating all
control characters as having zero width.
- The code for evaluating default (extended) regular expressions
now uses the same character-classification functions as the rest
of R (previously they differed on Windows, macOS and AIX).
- There is a build-time option to replace the system's
wide-character wctrans C function by tables shipped with R: use
configure option --with-internal-towlower or (on Windows)
-DUSE_RI18N_CASE in CFLAGS when building R. This may be needed
to allow tolower() and toupper() to work with Unicode characters
beyond the Basic Multilingual Plane where not supported by system
functions (e.g. on Solaris where it is the new default).
- R is more careful when truncating UTF-8 and other multi-byte
strings that are too long to be printed, passed to the system or
libraries or placed into an internal buffer. Truncation will no
longer produce incomplete multibyte characters.
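The parser change for out-of-range \U escapes can be observed with parse(). This is a minimal sketch, assuming R >= 4.1.0; the variable names are illustrative, and the wording of the error message is not relied upon.

```r
# Sketch only: assumes R >= 4.1.0; variable names are illustrative.

# U+110000 is one past the upper limit for Unicode points (U+10FFFF),
# so parsing this string literal is expected to signal an error.
too_big <- tryCatch(
  parse(text = '"\\U00110000"'),
  error = function(e) conditionMessage(e)
)

# A valid point beyond the Basic Multilingual Plane still parses.
clef <- eval(parse(text = '"\\U0001D11E"'))
clef_point <- utf8ToInt(clef)   # integer code point of U+1D11E
```

If the first parse() call succeeded, too_big would be an expression rather than a character string, which is a simple way to tell the two outcomes apart.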
DEPRECATED AND DEFUNCT:
- Function plclust() from the package stats and
package.dependencies(), pkgDepends(), getDepList(),
installFoundDepends(), and vignetteDepends() from package tools
are defunct.
- Defunct functions checkNEWS() and readNEWS() from package tools
and CRAN.packages() from utils have been removed.
- R CMD config CXXCPP is defunct (it was deprecated in R 3.6.2).
- parallel::detectCores() drops support for Irix (retired in 2013).
- The LINPACK argument to chol.default(), chol2inv(),
solve.default() and svd() has been defunct since R 3.1.0. It was
silently ignored up to R 4.0.3 but now gives an error.
- Subsetting/indexing, such as ddd[*] or ddd$x on a DOTSXP
(dot-dot-dot) object ddd has been disabled; it worked by accident
only and was undocumented.
BUG FIXES:
- Many more C-level allocations (mainly by malloc and strdup) are
checked for success with suitable alternative actions.
- Bug fix for replayPlot(); this was turning off graphics engine
display list recording if a recorded plot was replayed in the
same session. The impact of the bug became visible if the device
was resized after replay or if another savePlot() was attempted
after replay (empty display list means empty screen on resize or
empty saved plot).
- R CMD check etc now warn when a package exports non-existing S4
classes or methods, also in case of no "methods" presence.
(Reported by Alex Bertram; reproducible example and patch by
Sebastian Meyer in PR#16662.)
- boxplot() now also accepts calls for labels such as ylab, the
same as plot(). (Reported by Marius Hofert.)
- The help page for xtabs() now correctly states that addNA is
setting na.action = na.pass among others. (Reported as PR#17770
by Thomas Soeiro.)
- R CMD check <pkg> gives a longer and more comprehensible
message when DESCRIPTION misses dependencies, e.g., in Imports:.
(Thanks to the contributors of PR#17179.)
- update.default() now calls the generic update() on the formula to
work correctly for models with extended formulas. (As reported
and suggested by Neal Fultz in PR#17865.)
- The horizontal position of leaves in a dendrogram is now correct
also with center = FALSE. (PR#14938, patch from Sebastian
Meyer.)
- all.equal.POSIXt() no longer warns about and subsequently ignores
inconsistent "tzone" attributes, but describes the difference in
its return value (PR#17277). This check can be disabled via
the new argument check.tzone = FALSE as suggested by Sebastian
Meyer.
- as.POSIXct() now populates the "tzone" attribute from its tz
argument when x is a logical vector consisting entirely of NA
values.
- x[[2^31]] <- v now works. (Thanks to the report and patch by
Suharto Anggono in PR#17330.)
- In log-scale graphics, axis() ticks and label positions are now
computed more carefully and symmetrically in their range,
typically providing more ticks, fulfilling wishes in PR#17936.
The change really corresponds to an improved axisTicks() (package
grDevices), potentially influencing grid and lattice, for
example.
- qnorm(<very large negative>, log.p=TRUE) is now correct to at
least five digits where it was catastrophically wrong,
previously.
- sum(df) and similar "Summary"- and "Math"-group member functions
now work for data frames df with logical columns, notably also of
zero rows. (Reported to R-devel by Martin "b706".)
- unsplit() had trouble with tibbles due to unsound use of
rep(NA, len)-indexing, which should use NA_integer_. (Reported to
R-devel by Mario Annau.)
- pnorm(x, log.p = TRUE) underflows to -Inf slightly later.
- show(<hidden S4 generic>) prints better and without quotes for
non-hidden S4 generics.
- read.table() and relatives treated an "NA" column name as missing
when check.names = FALSE (PR#18007).
- Parsing strings containing UTF-16 surrogate pairs such as
"\uD834\uDD1E" works better on some (uncommon) platforms.
sprintf("%X", utf8ToInt("\uD834\uDD1E")) should now give "1D11E"
on all platforms.
- identical(x,y) is no longer true for differing DOTSXP objects,
fixing PR#18032.
- str() now works correctly for DOTSXP and related exotics, even
when these are doomed.
Additionally, it no longer fails for lists with a class and
"irregular" method definitions such that e.g. lapply(*) will
necessarily fail, as currently for different igraph objects.
- Message translation domains, e.g., for errors and warnings, are
now correctly determined also when e.g., a base function is
called from a "top-level" function (i.e., defined in globalenv()),
thanks to a patch from Joris Goosen fixing PR#17998.
- Too long lines in environment files (e.g., Renviron) no longer
crash R. This limit has been increased to 100,000 bytes.
(PR#18001.)
- There is a further workaround for FreeType giving incorrect
italic font faces with cairo-based graphics devices on macOS.
- add_datalist(*, force = TRUE) (from package tools) now actually
updates an existing data/datalist file for new content. (Thanks
to a report and patch by Sebastian Meyer in PR#18048.)
- cut.Date() and cut.POSIXt() could produce an empty last interval
for breaks = "months" or breaks = "years". (Reported as PR#18053
by Christopher Carbone.)
- Detection of the encoding of 'regular' macOS locales such as
en_US (which is UTF-8) had been broken by a macOS change:
fortunately these are now rarely used with en_US.UTF-8 being
preferred.
- sub() and gsub(pattern, repl, x, *) now keep attributes of x such
as names() also when pattern is NA (PR#18079).
- Time differences ("difftime" objects) get a replacement and a
rep() method to keep "units" consistent. (Thanks to a report and
patch by Nicolas Bennett in PR#18066.)
- The \RdOpts macro, setting defaults for \Sexpr options in an Rd
file, had been ineffective since R 2.12.0: it now works again.
(Thanks to a report and patch by Sebastian Meyer in PR#18073.)
- mclapply and pvec no longer accidentally terminate parallel
processes started before by mcparallel or related calls in
package parallel (PR#18078).
- grep and other functions for evaluating (extended) regular
expressions also handle in Unicode strings that are not
explicitly flagged UTF-8 but are flagged native when running in a
UTF-8 locale.
- Fixed a crash in fifo implementation on Windows (PR#18031).
- Binary mode in fifo on Windows is now properly detected from
argument open (PR#15600, PR#18031).
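Some of the fixes above are easy to check interactively. The following is a minimal sketch, assuming R >= 4.1.0; the data frame and variable names are made up for the example.

```r
# Sketch only: assumes R >= 4.1.0; all object names are illustrative.

# A UTF-16 surrogate pair in a string literal is parsed as one code point.
clef_hex <- sprintf("%X", utf8ToInt("\uD834\uDD1E"))

# "Summary" group functions such as sum() accept logical columns.
df <- data.frame(a = c(TRUE, FALSE), b = c(TRUE, TRUE))
total <- sum(df)

# rep() on a "difftime" object keeps its class and units.
d <- as.difftime(c(1, 2), units = "hours")
d_rep <- rep(d, times = 2)
```

Per the entry above, clef_hex is expected to be "1D11E" on all platforms.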