R
From "What is R?" in the R FAQ:
- R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files.
Installation
Install the r package. The installation of external packages within the R environment may require gcc-fortran.
Usage
To start an R session, open your terminal and type this command:
$ R
- Make sure to use a capital "R" for the command. Note that some shells use the lowercase r command to repeat the last entered command. Once in your R session, the prompt will change to ">".
- In the R documentation, "site" refers to system-wide configuration.
Run ?Startup
to read the documentation about system file configuration, help()
for the on-line help, help.start()
for the HTML browser interface to help, demo()
for some demos and q()
to close the session and quit.
When closing the session, you will be prompted as follows:
Save workspace image? [y/n/c]:
The workspace is your current working environment and includes any user-defined objects and functions. The saved image is stored in a file named .RData in the working directory and will be automatically reloaded the next time R is started from that directory. You can manually save the workspace at any time during the session with the save.image("image.RData") command, and save as many images as you want (e.g. image1.RData, image2.RData). You can load an image with the load("image.RData") command at any time during your session.
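As a minimal sketch of saving and restoring a workspace image by hand (the object x and the temporary file are placeholders chosen for illustration, and the code assumes it is run at the top level of a session):

```r
# Use a temporary file so the example does not touch any real .RData file
image_file <- tempfile(fileext = ".RData")

x <- c(1, 2, 3)                 # a user-defined object in the workspace
save.image(file = image_file)   # write every object in the global environment

rm(x)                           # simulate a fresh session
load(image_file)                # restore the saved objects
print(x)
```
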
- The --quiet option can be used to start R without a verbose startup message. Add alias R="R --quiet" to a shell startup file to use this behaviour by default.
- Running R from the command line will set R's working directory to the current directory. Opening the R GUI will set R's working directory to $HOME, unless explicitly defined in your configuration files (.Renviron or .Rprofile).
Configuration
Whenever R starts, its configuration is controlled by several files. Refer to Initialization at Start of an R Session for a detailed description of the startup process.
Environment
R first loads the site and user environment variable files. The name of the site file is taken from the environment variable R_ENVIRON if it is set, and defaults to /etc/R/Renviron. The name of the user file is taken from R_ENVIRON_USER. If that is unset, it defaults to .Renviron in the current working directory if that file exists, or ~/.Renviron otherwise.
The most important variables are listed on the Environment Variables page of the R documentation.
You may disable loading environment files with the --no-environ option.
Lines in the Renviron
file should be either comment lines starting with #
or lines of the form name=value
. Here is a very basic .Renviron
:
.Renviron
R_HOME_USER = /path/to/your/r/directory
R_PROFILE_USER = ${HOME}/.config/r/.Rprofile
R_LIBS_USER = /path/to/your/r/library
# Do not forget to append the .Rhistory
R_HISTFILE = /path/to/your/filename.Rhistory
MYSQL_HOME = /var/lib/mysql
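To check how R parses such a file without restarting, the readRenviron() function can be pointed at it from within a session. The following is a sketch only: DEMO_R_DATA_DIR and its value are made-up names for illustration, and a temporary file stands in for a real .Renviron.

```r
# Write a small environment file; DEMO_R_DATA_DIR is a hypothetical variable
env_file <- tempfile(fileext = ".Renviron")
writeLines(c(
  "# comment lines start with a hash",
  "DEMO_R_DATA_DIR=/tmp/demo-data"
), env_file)

ok <- readRenviron(env_file)      # invisibly returns TRUE if the file was read
Sys.getenv("DEMO_R_DATA_DIR")     # the variable is now set for this session
```
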
Alternatively, environmental variables may be set from within your R session via the Sys.setenv()
function. For instance, to set the time zone (TZ
) environmental variable to "Europe/London"
:
> Sys.setenv(TZ="Europe/London")
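A slightly fuller sketch of the same idea reads the value back with Sys.getenv() and restores the previous state afterwards. The variable names old_tz and current_tz are illustrative only.

```r
old_tz <- Sys.getenv("TZ", unset = NA)   # remember the previous value, NA if unset

Sys.setenv(TZ = "Europe/London")         # applies to the current session only
current_tz <- Sys.getenv("TZ")
print(current_tz)

# Restore the previous state
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```
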
Set the R_MAX_VSIZE environmental variable in your .Renviron file to enable R to use more of your system's total (physical + virtual) available memory. The value may be either a number of bytes, e.g. 16000000000, or a number followed by a unit suffix, e.g. 16GB.
Profile
R then loads a .Rprofile
file, which contains R code that is executed. These files are read in the following order of preference (only one file is loaded):
- A file specified by the environment variable R_PROFILE_USER.
- A .Rprofile file in the current working directory.
- $HOME/.Rprofile.
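The lookup order above can be sketched as a small function. This is an illustration of the described precedence, not R's actual startup code; resolve_user_profile() and its arguments are hypothetical names.

```r
# Return the path of the user profile that would be chosen, or NA if none applies.
# The arguments default to the live session but can be overridden for testing.
resolve_user_profile <- function(env_value = Sys.getenv("R_PROFILE_USER"),
                                 cwd = getwd(),
                                 home = Sys.getenv("HOME")) {
  if (nzchar(env_value)) return(env_value)            # 1. R_PROFILE_USER
  candidates <- file.path(c(cwd, home), ".Rprofile")  # 2. working directory, 3. $HOME
  existing <- candidates[file.exists(candidates)]
  if (length(existing) > 0) existing[1] else NA_character_
}

resolve_user_profile()
```

Only the first match is returned, mirroring the rule that a single profile file is loaded.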
A .Rprofile
file can contain arbitrary R code, though best practice suggests that one should not load packages at startup, or execute any code that would hinder package upgrades and reproducibility.
~/.Rprofile
# The .First function is called after everything else in .Rprofile is executed
.First <- function() {
  # Print a welcome message
  message("Welcome back ", Sys.getenv("USER"), "!\n", "Working directory is: ", getwd())
}

options(digits = 12)                # Number of significant digits to print. Default is 7, valid range is 1 to 22
options(scipen = 2)                 # Penalty applied to inhibit the use of scientific notation
options(show.signif.stars = FALSE)  # Do not show stars indicating statistical significance in model outputs

local({
  n <- max(parallel::detectCores() - 2L, 1L)  # Detect the number of cores available for use in parallelisation
  options(Ncpus = n)                          # Parallel package installation in install.packages()
  options(mc.cores = n)                       # Parallel apply-type functions via 'parallel' package
})

# Post-mortem debugging facilities: on error, dump frames to a file under R_HOME_USER
options(error = quote(dump.frames(file.path(Sys.getenv("R_HOME_USER"), "testdump"), TRUE)))
You can add more global options to customize your R environment. See this post for more examples of user configurations.
Locale
Aspects of the locale can be queried with the functions Sys.getlocale and Sys.localeconv within the R session. R uses the locale defined on your system.
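For example, the two functions can be called as follows. This is a small sketch; the values returned depend on the system locale.

```r
Sys.getlocale()                 # all locale categories, as one string
Sys.getlocale("LC_COLLATE")     # a single category

conv <- Sys.localeconv()        # named character vector of formatting conventions
conv[["decimal_point"]]         # decimal separator of the current numeric locale
```
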
Managing R packages
There are many add-on R packages which can be browsed on The R Website.
With pacman
There are some packages available on the AUR with the prefix r-. You can mix and match installing R packages with pacman and through R (below), but if you do so, you should let pacman manage system packages (those that reside at /usr/lib/R/library
) and let R manage user-installed packages elsewhere (e.g. ~/R/library
).
The desolve repository also provides a set of pre-built R packages. For requests, please see the repository's Git repository.
With R
Packages can be installed from within R using the install.packages(c("pkgname"))
command. You should use a local library and let pacman manage files that reside under /usr/lib/R/library
.
-
install.packages()
requires tk to be installed for selecting mirrors. Try installing this package if you see:
Error: .onLoad failed in loadNamespace() for 'tcltk', details (...)
- Alternatively, you can disable graphical pop-ups like this by running:
> options(menu.graphics=FALSE)
- To make this change permanent, add the above command to your .Rprofile.
Within your R session, run this command to check that your user library exists and is set correctly:
> Sys.getenv("R_LIBS_USER")
[1] "/path/to/directory/R/packages"
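If the directory does not exist yet, it can be created and put first on the library search path from within R. The helper below is a sketch: use_user_library() is a hypothetical name, and the example call uses a throwaway directory rather than your real library.

```r
# Create a library directory if needed and put it first on the library search path.
# Returns the normalized path invisibly.
use_user_library <- function(path = Sys.getenv("R_LIBS_USER")) {
  stopifnot(nzchar(path))
  path <- path.expand(path)
  if (!dir.exists(path)) dir.create(path, recursive = TRUE)
  .libPaths(c(path, .libPaths()))
  invisible(normalizePath(path))
}

# Demonstration with a temporary directory; in practice call use_user_library()
demo_lib <- use_user_library(file.path(tempdir(), "demo-library"))
.libPaths()[1]
# install.packages("pkgname", lib = demo_lib)  # would install into that library
```
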
Alternatively, you may install from the command line like so:
$ R CMD INSTALL -l $R_LIBS_USER pkg1 pkg2 ...
Upgrading R packages
Within an R session
> update.packages(ask=FALSE)
Or when you also need to rebuild packages which were built for an older version:
> update.packages(ask=FALSE, checkBuilt=TRUE)
Or when you also need to select a specific mirror (https://cran.r-project.org/mirrors.html) to download the packages from (changing the URL as needed):
> update.packages(ask=FALSE, checkBuilt=TRUE, repos="https://cran.ma.imperial.ac.uk/")
Within a shell
You can use Rscript, which comes with r, to update packages from a shell:
$ Rscript -e "update.packages()"
Makevars
The Makevars file can be used to set the default make options when installing packages. An example of an optimized Makevars file is as follows:
~/.R/Makevars
CFLAGS=-O3 -Wall -pedantic -march=native -mtune=native -pipe
CXXFLAGS=-O3 -Wall -pedantic -march=native -mtune=native -pipe
Alternative shells
As an alternative to the default R program, the following shell is also available:
- radian: an alternative console for the R program with multiline editing and rich syntax highlighting.
Adding a graphical frontend to R
R does not include a point-and-click graphical user interface for statistics or data manipulation. However, third-party user interfaces for R are available, such as R Commander and Rattle.
R Commander frontend
R Commander is a popular user interface to R. There is no Arch Linux package available to install R Commander, but it is an R package so it can be installed easily from within R. R Commander requires tk to be installed.
To install R Commander, run R
from the command line. Then type:
> install.packages("Rcmdr", dependencies=TRUE)
This can take some time.
You can then start R Commander from within R using the library command:
> library("Rcmdr")
Rattle frontend
Rattle is a popular user interface to R with focus on data mining. There is no Arch Linux package available but it can be installed easily from within R. The GUI depends on atk, cairo, pango, and gtk2.
To install Rattle, run R
from the command line. Then type:
> install.packages("rattle", dependencies=TRUE)
This can take some time.
You can then start Rattle from within R using the library command:
> library("rattle")
> rattle()
JASP
jasp-desktop provides a menu-driven interface for common statistical analysis using R as the backend. A Flatpak package is also available.
jamovi
jamovi-gitAUR provides a menu-driven interface for common statistical analysis using R as the backend. A Flatpak package is also available.
Editors, IDEs and notebooks with R support
RKWard IDE
RKWard is an IDE developed by KDE, which allows for data import and browsing as well as running common statistical tests and plots. You can install rkward from the official repositories.
RStudio IDE
RStudio is an open-source R IDE. It includes many modern conveniences such as parentheses matching, tab-completion, tool-tip help popups, and a spreadsheet-like data viewer.
Install rstudio-desktop-binAUR or rstudio-desktop-gitAUR, also available in the rstudio repository.
The R library path is often configured with the R_LIBS
environment variable. RStudio ignores this, so the user must set R_LIBS_USER
in ~/.Renviron
, as documented above.
RStudio uses a four-pane layout by default. However, if only the taskbar and toolbar are visible at the top of an otherwise blank window, create the following file with elevated privileges and populate it with the contents shown below:
/usr/lib/qt/libexec/qt.conf
[Paths]
Prefix = /usr/lib/qt
Data = /usr/share/qt
Translations = /usr/share/qt/translations
Restart RStudio and observe the expected split-screen layout with four panes. See RStudio does not show any pane on Stack Overflow and https://github.com/rstudio/rstudio/issues/5961 for more information.
RStudio server
RStudio Server provides a browser-based interface to a version of R running on a remote Linux server.
Install rstudio-server-gitAUR. The two main configuration files are /etc/rstudio/rserver.conf
and /etc/rstudio/rsession.conf
. They are not created during the install, so you will need to create and edit them. For information about configuration options, please refer to RStudio getting started documentation.
To start the server, please enable and start the rstudio-server.service
unit file provided with the package.
Emacs Speaks Statistics
Emacs users can interact with R via the emacs-essAUR package.
Nvim-R
The nvim-rAUR package allows vim and neovim users to code in R, including editing and rendering of R markdown (.Rmd) files, execution of R code in a separate pane, inspection of variables, and integrated help panes.
Cantor
cantor is a notebook application developed by KDE that includes support for R.
Code
The code editor has plugin support for R.
Jupyter notebook
jupyter-notebook is a browser-based notebook with support for many programming languages. R support can be added by installing the IRkernel.
Tips and tricks
Optimized packages
The numerical libraries that come with the r package (lapack and consequently blas) do not have multithreading capabilities. Replacing the reference blas package with an optimized BLAS can produce dramatic speed increases for many common computations in R. See these threads for an overview of the potential speed increases:
- https://github.com/tmolteno/necpp/issues/18
- http://blog.nguyenvq.com/blog/2014/11/10/optimized-r-and-python-standard-blas-vs-atlas-vs-openblas-vs-mkl/
- https://freddie.witherden.org/pages/blas-gemm-bench/
- http://nghiaho.com/?p=1726
OpenBLAS
openblas can replace the reference blas. If you are using the regular r package from the extra repository, no further configuration is needed; R is configured to use the system BLAS and will use OpenBLAS once it is installed.
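To see which BLAS and LAPACK libraries a running session is actually linked against, recent R versions expose the library paths directly. The snippet below is a sketch; the matrix size is arbitrary and the timing is only a rough indicator to compare before and after switching BLAS.

```r
extSoftVersion()[["BLAS"]]   # path of the BLAS library in use (may be empty on some builds)
La_library()                 # path of the LAPACK library in use

# Rough timing of a BLAS-heavy operation
set.seed(1)
m <- matrix(rnorm(500 * 500), nrow = 500)
timing <- system.time(crossprod(m))
timing[["elapsed"]]
```
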
Intel MKL
If your processors are Intel, you can use the Intel Math Kernel Library. Beyond multithreading, the MKL also has specific optimizations for Intel processors. Keep in mind that it can potentially interfere with the standard R functionality for parallel processing.
First install the intel-mkl package, then the r-mklAUR package.
- If you install the r-mkl package with R already installed, you will be prompted to remove R. Once r-mkl is installed, run the following command in the R console:
> update.packages(checkBuilt=TRUE)
- Elapsed times for computing 15 tests: 274.93 seconds with the default GCC build and 21.01 seconds with the icc/MKL build. See this post for more information.
Intel Advisor
Intel Advisor delivers top application performance with C, C++ and Fortran compilers, libraries and analysis tools.
Install the intel-advisorAUR package.
Set CRAN mirror across R sessions
Instead of having R ask which CRAN mirror to use every time you install or update a package, you can set the mirror in the .Rprofile
file. https://cloud.r-project.org/ should be a good default for everywhere as it redirects to your closest CRAN mirror:
~/.Rprofile
## Set CRAN Mirror:
local({
  r <- getOption("repos")
  r["CRAN"] <- "https://cloud.r-project.org/"
  options(repos = r)
})
Inhibit "Save workspace image?" prompt
Upon executing q()
in R to exit, you will typically be greeted by the following prompt:
> q()
Save workspace image? [y/n/c]:
At face value, this may seem convenient, but relying on workspace images will render your code less portable. The "Save workspace image?" prompt may be disabled by creating a hidden environment (.env
), adding a new version of the q()
function to it in which the default value for the save
argument has been altered to "no"
, then attaching the hidden environment. This will mask the q()
function of R's base package, effectively switching off the prompt. To make this change permanent, add the following code to your .Rprofile
file:
~/.Rprofile
## Create hidden environment
.env <- new.env()

## Define new q() function
.env$q <- function(save = "no", ...) {
  quit(save = save, ...)
}

## Attach hidden environment
attach(.env, warn.conflicts = FALSE)
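The masking can be verified without quitting by comparing the default of the save argument before and after attaching. This is a sketch that assumes it is run at the top level of a session; the environment name hidden_q_demo is made up for the demonstration and is detached again at the end.

```r
hidden <- new.env()
hidden$q <- function(save = "no", ...) quit(save = save, ...)
attach(hidden, name = "hidden_q_demo", warn.conflicts = FALSE)

masked_default <- formals(q)$save         # default of the masking q()
base_default   <- formals(base::q)$save   # default of the original q()
print(c(masked = masked_default, base = base_default))

detach("hidden_q_demo")                   # undo the demonstration
```
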
Running R from a shell
Run the following command to execute R code from a command-line shell:
$ R CMD BATCH script.R
This command produces a .Rout file (script.Rout by default) with the results from script.R. By default, the .Rout file ends with the output of a proc.time() call, which serves as a benchmark. sessionInfo() can be added to the end of the R code to keep a record of packages and versions.
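A script.R suitable for this could look like the following sketch. The data and variable names are placeholders; only the final sessionInfo() call comes from the advice above.

```r
# script.R: a small batch job
values <- c(2, 4, 6, 8)
summary_stats <- c(mean = mean(values), sd = sd(values))
print(summary_stats)

# Record the R version and loaded packages in the .Rout file
sessionInfo()
```
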
Troubleshooting
Unable to load stringi.so
The following error may be encountered when running R code that depends on the stringi CRAN package:
unable to load shared object 'R_LIBS_USER/stringi/libs/stringi.so': libicui18n.so.MAJOR: cannot open shared object file: No such file or directory
This often occurs following a soname bump to the library (provided by icu). stringi
will need to be rebuilt in R by installing the package again.
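One way to script the check is sketched below. fails_to_load() is a hypothetical helper; the reinstall is guarded by interactive() because it needs network access and a configured CRAN mirror.

```r
# TRUE if the package cannot be loaded (missing, or its shared object fails to load)
fails_to_load <- function(pkg) {
  !suppressWarnings(requireNamespace(pkg, quietly = TRUE))
}

# Only reinstall in an interactive session; the download needs network access
if (interactive() && fails_to_load("stringi")) {
  install.packages("stringi")
}
```
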
See also
- Official website
- RSeek A Google Custom Search Engine for R related material.
- R for Data Science Online version of a CCA licensed book written by Garrett Grolemund and Hadley Wickham from RStudio, 2017.
- R-bloggers Aggregation site for (English) blogs related to R.
- /r/Rlanguage on Reddit There are several R related Subreddits, each one provides links to the others.