Announcing "rcli" - a command line tool to install and switch between R versions

Note: To avoid (questionable) third-party discussion tools, please post your thoughts and comments on this GitHub discussion thread.

GitHub repo: pat-s/rcli

Introduction

Ever since I started doing R development, I have been missing a generic way to install any R version from the command line. “Why does one need multiple R versions?” - one might ask.

Since the arrival of packrat and the growing interest in reproducibility, using one R version for a project and sticking to it until the project ends has become more common. With the release of renv (the successor of packrat), using version-controlled packages and a fixed R version became even more popular. In addition, more and more people have access to an RStudio Workbench installation, which lets you switch between R versions very easily.

In fact, installing multiple versions and running a non-release version is quite standard in other languages, such as Python. Yet, doing so in R has never been easy - and we have not even touched the point of switching between different versions yet.

In the following I’ll outline the current state of installing R across operating systems and show how rcli is able to help here. Spoiler: Windows is not (yet) supported. It is quite a long (and somewhat technical) read, so if you’re short on time, just skip it.

The different ways to install R across operating systems

Linux

While creating rcli I came across the many differences that exist when trying to install R across operating systems. On Linux, R is included in the repositories of the distribution’s package manager. The version depends on the distribution, and on an LTS release the R version might be quite outdated at some point. There are custom repositories one can add to stay on the most recent release. Yet the user has little control over when an upgrade happens: if a system update is applied, R is upgraded as well.

Alternatively, one can install R from source into a custom location. This takes some minutes (around 10 - 20 depending on the CPU performance of the local machine) and requires all of R’s build-time dependencies to be installed. If this path is chosen, one can run different versions of R by calling R directly from the installed path. A cheap workaround is to define shell aliases, e.g., something like

alias r410="/some/path/to/R"

Semi-advanced Linux users will most likely have no problem doing the above, yet it takes time and quite some manual steps (which, of course, could also be automated).
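As a minimal sketch of what such automation could look like, the function below picks an R binary by version instead of requiring one alias per version. It assumes (this is not something the tools above guarantee) that every R version was installed into its own directory of the form $R_ROOT/<version>/bin/R; the names R_ROOT, r_bin and rver are made up for this illustration.

```shell
# Assumption: each R version lives under $R_ROOT/<version>/bin/R.
# /opt/R is only a default; point R_ROOT at wherever you installed R.
R_ROOT="${R_ROOT:-/opt/R}"

# Print the path to the R binary of the requested version,
# or fail with a message if that version is not installed.
r_bin() {
  bin="$R_ROOT/$1/bin/R"
  if [ ! -x "$bin" ]; then
    echo "no R $1 found under $R_ROOT" >&2
    return 1
  fi
  printf '%s\n' "$bin"
}

# Run a specific R version and pass all remaining arguments to it.
rver() {
  version="$1"
  shift
  bin="$(r_bin "$version")" || return 1
  "$bin" "$@"
}
```

With this in a shell profile, `rver 4.1.0 --version` would start the 4.1.0 build, provided it exists under the assumed layout.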

A third option is to use the r-builds from RStudio. These are binaries which can be downloaded and extracted into the desired path. Scripting this will get one started relatively quickly and avoids source installations.

While the above might sound logical to you, knowing all of these options requires some experience in the ecosystem and cannot be expected from the average user.

macOS

On macOS, most users probably install R manually, i.e., they go to the CRAN website, download the macOS installer and run it. This approach has the downside that it won’t update the R installation at all: unless users repeat this task for newer releases, they will be stuck on the R version they installed first.

The second way is more programmatic and makes use of homebrew, the most popular package manager on macOS. Here, unfortunately, two very similar ways to install R exist which do very different things.

If a user runs brew install r, the homebrew R binary will be installed. While this version looks like a normal R installation at first glance, there is a major difference compared to the CRAN version: it does not support R binary packages for macOS. In addition, it makes use of the gcc compiler instead of clang, which is the native C compiler on macOS and the one used by the CRAN installer. Now one can also install the CRAN one with homebrew - by calling brew install --cask r. This will download the installer from the CRAN website and install it in the same way as the manual GUI installation mentioned at the beginning would do - just silently. You might already see the unfortunate similarity I mentioned - users only get the CRAN version when passing the --cask flag. This is, of course, not obvious at all and most people are not even aware that this option exists.
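Because the two homebrew variants are so easy to confuse, it can help to check which one is actually in use. The sketch below classifies an R home directory by its location. The helper name r_flavor is made up, and the path prefixes are assumptions based on the default locations (the CRAN framework path mentioned in this post, and homebrew’s usual prefixes /opt/homebrew and /usr/local); a custom homebrew prefix would end up as “unknown”.

```shell
# Classify an R home directory by where it lives on disk.
# This only looks at the path string; it does not inspect the installation.
r_flavor() {
  case "$1" in
    /Library/Frameworks/R.framework/*)
      echo "cran" ;;
    /opt/homebrew/*|/usr/local/Cellar/*|/usr/local/opt/*)
      echo "homebrew" ;;
    *)
      echo "unknown" ;;
  esac
}
```

On a machine with R on the PATH, `r_flavor "$(R RHOME)"` would report the variant, since `R RHOME` prints the home directory of the R installation being run.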

Next, let’s talk about the CRAN installer and its library paths. R cannot be installed into a location other than the one the installer suggests, /Library/Frameworks/R.framework - otherwise R will not start. In addition, installing a different version of R (no matter if older or newer) will overwrite the previously installed one. There is a way to prevent this, which is to use the --forget flag of pkgutil when installing the .pkg via the command line (source).
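To make the pkgutil approach a bit more concrete, here is a rough sketch of how it could be scripted. My understanding is that the receipts of the installed R framework have to be forgotten before the new .pkg is installed. The function names are made up, and the receipt ID pattern (IDs starting with org.r-project and ending in fw.pkg, matched case-insensitively) is an assumption; check the output of pkgutil --pkgs on your machine before relying on it.

```shell
# Filter a list of package receipt IDs (one per line on stdin) down to
# those that look like R framework receipts. The pattern is an assumption.
r_framework_receipts() {
  grep -i -E '^org\.r-project\..*fw\.pkg$' || true
}

# macOS only: forget the receipts of the installed R framework, then
# install the given .pkg so the existing version is not removed.
install_r_side_by_side() {
  pkg="$1"
  pkgutil --pkgs | r_framework_receipts | while read -r id; do
    sudo pkgutil --forget "$id"
  done
  sudo installer -pkg "$pkg" -target /
}
```

Note that this only helps for versions that end up in different directories; installing another patch release of the same minor version would still replace the existing one.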

With respect to libraries, R on macOS behaves differently than on other operating systems and, arguably, not in a positive way. If one uses the CRAN installer, admin privileges are required. While this is nothing unusual, the user also has “write” permission to the R base packages after the installation - more specifically, to the R system library - if the user who is running R is also an “Administrator” of the machine (details: the path is owned by root:admin, “Administrators” on macOS belong to the “admin” group, and the directory permissions grant “write” permission to that group). The installing user is quite often also the user running R, as most Macs are probably used by a single person. As a result, R packages which the user installs after the initial installation of R will be installed into the system library. The user also potentially has permission to remove packages from the R core installation, e.g., the base package. Doing so would result in a corrupted state and the user would not be able to start R anymore.

“How is this done on other operating systems?” - you may ask. Good question! On other operating systems (both Windows and Linux), the system library is not writable by the user and users are forced to install into a “user library,” creating a clear distinction between “user” packages and essential R-core packages.

The peculiar part is that R, during startup, also looks for the “user library” and a “site library” via the environment variables R_LIBS_USER and R_LIBS_SITE. Their defaults are
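A quick way to find out whether a given setup is affected is to test whether the current user can write to the system library. The sketch below is a plain permission check with a made-up function name; it says nothing about who owns the directory or why it is writable.

```shell
# Succeed if the given path is an existing directory that the
# current user is allowed to write to.
library_is_writable() {
  [ -d "$1" ] && [ -w "$1" ]
}
```

For the CRAN installation on macOS, one would call it as `library_is_writable /Library/Frameworks/R.framework/Resources/library && echo "system library is writable"`.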

# user
~/Library/R/arm64/4.1/library
# site
/Library/Frameworks/R.framework/Resources/site-library

Yet they are silently ignored if the paths do not exist, i.e., if the user did not create them on purpose, R won’t use them but won’t complain either. Thus, most people using R on macOS will most likely use a single library which contains both the R-core packages and custom user packages. Interestingly (and fortunately), this assumption does not match the results of a Twitter poll I created.

Besides the fact that this opens the door for a possibly destructive action by the user against R itself, it also causes further inconvenience: if users upgrade their R version - even just from one patch release to the next (e.g., from 4.1.1 to 4.1.2) - they need to re-install all their packages. The R installer will overwrite the contents of the system library (e.g. /Library/Frameworks/R.framework/Resources/library) and if no user library exists (which could have been reused just fine in this case) all packages need to be installed again. Besides being a tedious task, this also consumes bandwidth and CPU resources for no reason and could be avoided easily.

Last, the fact that the installation always uses the exact same root directory and does not differentiate on the “patch” version level makes it impossible to install different patch versions of the same minor version side-by-side when following the official installation instructions. But don’t worry - rcli has got you covered.
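Since R silently skips a user library that does not exist, creating the directory once is enough to get one. The sketch below builds the path following the pattern of the default shown in this post (~/Library/R/<arch>/<minor>/library) and creates it. The function name is made up, and the path layout is an assumption that may differ for other R versions, so compare it with what Sys.getenv("R_LIBS_USER") reports in your R session.

```shell
# Create the macOS user library for a given architecture and R minor
# version, following the assumed layout ~/Library/R/<arch>/<minor>/library,
# and print the resulting path.
ensure_user_library() {
  arch="$1"
  minor="$2"
  lib="$HOME/Library/R/$arch/$minor/library"
  mkdir -p "$lib" && printf '%s\n' "$lib"
}
```

For example, `ensure_user_library arm64 4.1` creates the directory for the arm64 build of R 4.1. After restarting R, newly installed packages should go there instead of into the system library, and they survive a patch-level upgrade.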

In a nutshell: the R installation procedure on macOS has room for improvement.

Windows

R on Windows does not suffer from the “administration” and “multiple installers” issues seen on macOS, and it can even be installed into any directory. While most users will probably use the .exe installer from the CRAN website, command line installers like scoop exist which simplify the installation. Yet this approach also suffers from the same “auto-update” issues as outlined in the previous sections and, to my knowledge, installing multiple versions side-by-side is not easily possible. Installing from source is very cumbersome as by default Windows does not provide all the build-time dependencies R needs. With respect to library handling, the situation is way better than on macOS because users are forced to create a user library in all cases and previously installed versions are not overwritten.

Installing R versions with rcli

rcli aims to simplify all of the above by offering a generic way to install any R version on any Unix-based system (yes, Windows is not yet covered, but support might be added in the future). Here are some examples:

rcli install 3.6.3
rcli install devel
rcli install release
rcli install 4.1.1 --arch x86_64

Besides the ability to use aliases to install R versions, rcli also makes it possible to choose whether to install the new arm64 version of R (on supported machines) or the “old” x86_64 version. Both builds have existed in parallel since R 4.1.0, and some packages still have known problems with the arm64 version, so having the option to easily install both architecture builds side-by-side is neat. In case you are wondering why rcli “allows” installing “patch” versions despite this being claimed to be impossible on macOS - read on!

Switching between R versions with rcli

Let’s assume one managed to install multiple versions on some OS. Great! But wait - how does one make use of these now? There has been no straightforward solution so far (with the exception of rswitch on macOS). Disclaimer: I have been an rswitch user for a long time but always thought how great it would be if something like this existed in a universal way and would also allow installing R (patch) versions.

Hence, it was about time for a convenient and universal solution. rcli allows you to switch between any installed R version on any Unix-based system and even accounts for different architectures.

Let’s first have a look at which R versions I currently have installed

rcli list
Installed R versions:
- 3.6.3
- 4.1.0
- 4.1.0-arm64

To switch to any of these, one would do

rcli switch 3.6.3

To switch back to the current release version, one would do

rcli switch release

NB: rcli is also able to understand abbreviated versions, i.e.

rcli switch rel

would also work.

For users on macOS who only make use of a system library and do not want to change this: rcli accounts for that and backs up the system library of any installed R version when switching to or installing a different version.

In case you’re wondering why switching on macOS takes a few seconds: it is somewhat costly because, due to the restrictions outlined in the section above that describes the CRAN macOS installer, no symlinks can be used and the different R versions need to be copied around in their entirety. Maybe this will change at some point but as of now, I do not see a different solution. If you do, please tell me and I am happy to make things faster. This “copying things around” approach is also the reason why rcli is able to support “patch” versions in the first place, working around the “replacing the previous minor version” approach of the CRAN installer.
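To illustrate the idea behind copying instead of symlinking, here is a deliberately simplified sketch. It is not rcli’s actual code: the function name, the “store” directory holding one full copy per version, and the .version marker file are all made up for this example.

```shell
# Hypothetical copy-based switch, NOT rcli's implementation.
#   store:  directory with one full copy per version, e.g. $store/4.1.0
#   active: the single fixed location R has to live in
#   target: the version to activate
# A file named .version inside "active" records what is currently there.
switch_by_copy() {
  store="$1"
  active="$2"
  target="$3"
  if [ ! -d "$store/$target" ]; then
    echo "version $target is not installed" >&2
    return 1
  fi
  if [ -f "$active/.version" ]; then
    # Back up the active version first, including any packages that
    # were installed into its system library in the meantime.
    current="$(cat "$active/.version")"
    rm -rf "$store/$current"
    cp -R "$active" "$store/$current"
  fi
  # Replace the active location with a full copy of the target version.
  rm -rf "$active"
  cp -R "$store/$target" "$active"
  printf '%s\n' "$target" > "$active/.version"
}
```

Every switch copies two complete installations, which is where the few seconds go. In exchange, each stored copy can be an arbitrary patch version, since the fixed location only ever holds one of them at a time.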

Installation

See the instructions in the GitHub repo.