RStudio
Introduction
RStudio is an integrated development environment for the R programming language, with limited support for other programming languages (including Python, bash, and SQL). RStudio provides a powerful graphical environment for importing data in a number of formats (including CSV, Excel spreadsheets, SAS, and SPSS); manipulating, analyzing, and visualizing data; version control with git or SVN; a graphical R package manager that provides point-and-click search, installation, and removal of R packages from its substantial ecosystem (including the Bioconductor repository, which provides almost 1500 software tools “for the analysis and comprehension of high-throughput genomic data”); and many other features.
RStudio Server is a client/server version of RStudio that runs on a remote server and is accessed via the client’s web browser. A graphical file manager allows file upload/download from hpc-class via web browser.
IMPORTANT: This guide was written for the hpc-class cluster. If you are using a different cluster, replace hpc-class with the name of your cluster. For example, on the condo cluster, replace hpc-class.its.iastate.edu with condo2017.its.iastate.edu. The RStudio container is not currently available on the Nova cluster.
ISU Options for RStudio
ISU users can use RStudio in one of the following ways:
- Preferred: Access RStudio via Open OnDemand
- To run RStudio and access data on your local workstation, download the open source RStudio Desktop.
- To run RStudio Server on and access data in hpc-class, follow the directions in this guide.
RStudio Server on hpc-class
RStudio Server is currently available on hpc-class using a Docker image (imported into Singularity) provided by the Rocker project. The geospatial image used here includes not only geospatial libraries but also LaTeX / publishing libraries and Tidyverse data science libraries. Other R packages can be installed into your home directory from within RStudio.
Running RStudio Server on hpc-class allows ISU users to access any data on hpc-class that they can access from the command line (SSH). To use RStudio Server on hpc-class, a user submits a SLURM job script. This allows RStudio Server to run on any available hpc-class compute resources (including large-memory nodes). A default job script that should suffice for most users is provided.
After a user is done using RStudio Server, they should save their work in RStudio, and then stop RStudio Server by cancelling the job with the SLURM scancel command.
A few notes:
- RStudio terminal (bash command shell): since RStudio Server is running in a container with a Debian base image, you won’t be able to access software environment modules (e.g., that you would normally see when logging into hpc-class and issuing the module list command), as those are installed on the (CentOS) host.
- Data access: your home directory is mounted inside the RStudio Server container, and HPC Group has configured Singularity to mount the /project directory. $TMPDIR (which on a compute node is per-job local scratch on the compute node’s direct attached storage, and is deleted at the end of the SLURM job) is mounted inside the container at /tmp. If you have any questions, email hpc-help@iastate.edu.
- Software installation: The provided SLURM job script creates a ~/.Renviron file in your home directory that allows RStudio to install additional R packages into your home directory (the container image is immutable). Installing many R packages may push your home directory over its default 10G soft quota limit.
Starting RStudio Server
- Log into hpc-class via SSH (see the Quick Start Guide for instructions).
- Submit the RStudio SLURM job script with the following command:
> sbatch /shared/hpc/containers/Rstudio/4.0.0/rstudio.job
(Optional) By default, this SLURM job is limited to a 4-hour time limit, 1 processor core, and 6600 MB of memory. To customize, see the section Requesting Additional Compute Resources below.
- After the job has started, view the “$HOME/rstudio-JOBID.out” file for login information (where JOBID is the SLURM job ID reported by the sbatch command).
> module load singularity
> sbatch /shared/hpc/containers/Rstudio/4.0.0/rstudio.job
Submitted batch job 214664
> cat ~/rstudio-214664.out
...
- Point your web browser to the listed hostname / port, then enter your ISU user name and the temporary password (valid for this job only; in this example 9BtKF4VRuOO+BsvpDjav).
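The job ID reported by sbatch determines the name of the output file that holds the login information. The following is a minimal sketch of that mapping; the helper names job_id_from_sbatch and rstudio_out_file are hypothetical and not part of the provided job script. It only parses the sample sbatch output shown above and does not submit a job.

```shell
# Hypothetical helpers (not provided by the cluster or the job script).

# Extract the numeric job ID from sbatch's "Submitted batch job N" line.
job_id_from_sbatch() {
    printf '%s\n' "$1" | sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Build the path of the output file that holds the login information.
rstudio_out_file() {
    printf '%s/rstudio-%s.out\n' "$HOME" "$1"
}

# Example using the sample sbatch output from this guide (no job is submitted).
jobid=$(job_id_from_sbatch "Submitted batch job 214664")
echo "Login details will appear in: $(rstudio_out_file "$jobid")"
```

On the cluster, the same helper could be fed the real output, e.g. job_id_from_sbatch "$(sbatch /shared/hpc/containers/Rstudio/4.0.0/rstudio.job)", after which cat "$(rstudio_out_file "$jobid")" shows the login details once the job has started.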
Stopping RStudio Server
- Click the Quit Session (“power”) button in the top-right corner of the RStudio window, or select "File > Quit Session...".
- After the “R Session has Ended” window appears, cancel the SLURM job from the hpc-class command line. E.g., if the job ID is 214664:
> scancel -f 214664
Be sure to specify the scancel -f / --full option as demonstrated above.
- (If using SSH Port Forwarding instead of VPN) Close the terminal / PuTTY window in which the SSH tunnel was established.
macOS / Linux / Windows Users
- Open a new macOS/Linux terminal window or a new Windows PowerShell/Command Prompt window and enter the SSH command listed in the job script output file. In this example:
> ssh -N -L 8787:hpc-class14:51530 jane.user@hpc-class.its.iastate.edu
There will be no output after logging in. Keep the window / SSH tunnel open for the duration of the RStudio session.
- Point your browser to http://localhost:8787. Enter your ISU user name and the one-time password listed in the job script output file.
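The SSH command above forwards a local port to the compute node and port on which RStudio Server is listening. As a rough sketch of how its parts fit together, the hypothetical helper rstudio_tunnel_cmd below (not part of the job script) prints the command from the node name, remote port, and user name. The optional fourth argument for a different local port is an assumption added for illustration.

```shell
# Hypothetical helper: print the SSH port-forwarding command for an RStudio job.
#   $1 = compute node (from the job output file)
#   $2 = remote port  (from the job output file)
#   $3 = ISU user name
#   $4 = local port   (optional assumption; defaults to 8787 as in this guide)
rstudio_tunnel_cmd() {
    printf 'ssh -N -L %s:%s:%s %s@hpc-class.its.iastate.edu\n' \
        "${4:-8787}" "$1" "$2" "$3"
}

# Reproduces the example command from this guide.
rstudio_tunnel_cmd hpc-class14 51530 jane.user
```

If a different local port is passed as the fourth argument, the browser address changes accordingly (http://localhost:PORT).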
PuTTY Users
NOTE: Connecting from a terminal is recommended, but you can still use PuTTY if you prefer.
The following silent video is a media alternative for the text in steps 1-4 below: rstudio-from-putty-port-forward.mp4
- Open a new PuTTY window
- In Session > Host Name, enter: hpc-class.its.iastate.edu
- In the category Connection > SSH > Tunnels, enter 8787 in Source Port and the hostname:port listed in the job script output in Destination, click “Add”, then click “Open”.
- Point your browser to http://localhost:8787. Enter your ISU user name and the one-time password listed in the job script output file.
Chrome browser users
Video showing how to ssh to hpc-class using the Chrome Secure Shell App: chrome-ssh.mp4
Requesting Additional Compute Resources
The default job resources (4 hour time limit, 1 processor core, 6600 MB memory) may be customized by:
- sbatch command-line options, e.g., to specify an 8-hour wall time limit, 16 GB of memory, and 2 processor cores (= 4 hardware threads):
> sbatch --time=08:00:00 --mem=16G --cpus-per-task=4 /shared/hpc/containers/Rstudio/4.0.0/rstudio.job
- Copying the job script to a directory one has write access to and modifying the appropriate SLURM #SBATCH directives.
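The second option can be sketched as follows. This is an illustration only: the helper set_sbatch_option is hypothetical, and the #SBATCH lines in the stand-in file are assumptions based on the default resources stated above. The directives in the real rstudio.job may be written differently, so check the copied script before editing it.

```shell
# Hypothetical helper: replace the value of one "#SBATCH --<option>=<value>" line.
#   $1 = path to a writable copy of the job script
#   $2 = option name (e.g. time)
#   $3 = new value
set_sbatch_option() {
    sed "s|^#SBATCH --$2=.*|#SBATCH --$2=$3|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Stand-in for a copy of rstudio.job (ASSUMED contents; the real script may differ).
workdir=$(mktemp -d)
cat > "$workdir/rstudio.job" <<'EOF'
#!/bin/bash
#SBATCH --time=04:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=6600M
EOF

# Raise the time limit and memory request in the copy, then show the directives.
set_sbatch_option "$workdir/rstudio.job" time 08:00:00
set_sbatch_option "$workdir/rstudio.job" mem 16G
grep '^#SBATCH' "$workdir/rstudio.job"
```

On the cluster, the stand-in file would be replaced by an actual copy, e.g. cp /shared/hpc/containers/Rstudio/4.0.0/rstudio.job ~/rstudio.job, and the modified copy submitted with sbatch ~/rstudio.job.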