Quickstart to using the CRC

Author

Prof. Tiffany Tang

Published

May 8, 2026

README

When in doubt, check the CRC documentation: https://docs.crc.nd.edu/index.html

Logging in

Logging into the CRC

To log in to the CRC, you will need your ND NetID and password.

  1. Open your terminal.

  2. To connect to a CRC front-end machine (there are two machines), type one of the following commands in your terminal:

    • ssh <NetID>@crcfe01.crc.nd.edu
    • ssh <NetID>@crcfe02.crc.nd.edu
  3. When prompted, enter your ND NetID password.

When you first connect, you are logged onto a shared login node.

  • DO NOT DO ANY HEAVY COMPUTATION HERE. This is a shared resource and should only be used for light tasks like editing files, submitting jobs, etc.
  • From the CRC docs: “Small processing which is not disruptive/resource intensive can be done on the front ends. This is normally pre-processing or post-processing after completion of UGE jobs.”

You can only access the CRC servers (and many other resources/docs) if you are connected to the ND network.

Transferring files

There are several ways to transfer files to and from your local machine and the CRC. For our purposes, we will primarily use scp and GitHub.

  • It’s easiest to clone your GitHub repository on the CRC and then pull/push changes as needed.
  • For all other files that aren’t tracked by Git (e.g., big data/results files), you can use scp to transfer files.
    • scp (“secure copy”) is a command-line tool for copying files securely between computers.

Cloning a GitHub repository on the CRC

To clone a GitHub repository on the CRC, type the following command in your terminal:

cd /path/to/destination
git clone <repository URL>

Examples:

  • To clone our course GitHub repository:

    cd /path/to/destination
    git clone https://github.com/tiffanymtang/dsip-s26.git
  • To clone your own course GitHub repository:

    cd /path/to/destination
    git clone https://github.com/<Your GitHub Username>/dsip.git

    Fixing Password Authentication Error

    If you run into a password authentication error when trying to clone a private GitHub repository on the CRC, you can resolve the issue by creating a personal access token (PAT), following the steps below:

    1. Go to your GitHub home page (github.com).
    2. Click on your icon (top right) and then “Settings”.
    3. On the left sidebar, click on “Developer Settings” > “Personal access tokens” > “Tokens (classic)”.
    4. Click on “Generate new token” > “Generate new token (classic)”.
    5. In the “Note” field, give your token a descriptive name (e.g., “crc”).
    6. Select an expiration date. (If you want, you can set the expiration to “no expiration” and use this as your general token for any time you run into a password authentication error.)
    7. Select scopes (or what this token will allow you to access/do). I would recommend selecting “repo” (and all of its subitems), “workflow”, “gist”, and “user”.
    8. Click “Generate token”. This should generate a long string of letters and numbers. This is your personal access token (PAT). Save that PAT somewhere safe. If you lose this PAT, you will have to re-generate a new one.

    You should now be able to use that PAT in place of your GitHub password on the command line. That is, when prompted for your password, paste the PAT instead of your GitHub login password. More information about PATs can be found in the GitHub documentation.

Transferring files with scp

To transfer an individual file:

  • From your local machine to the CRC:

    scp /path/to/local/file <NetID>@crcfe01.crc.nd.edu:/path/to/destination

    e.g., to copy file.txt to your CRC home directory:

    scp file.txt ttang4@crcfe01.crc.nd.edu:/users/ttang4/
  • From the CRC to your local machine:

    scp <NetID>@crcfe01.crc.nd.edu:/path/to/file /path/to/destination

    e.g., to copy file.txt to your current working directory on your local machine:

    scp ttang4@crcfe01.crc.nd.edu:/users/ttang4/file.txt .

To transfer an entire directory, use the -r flag:

  • From your local machine to the CRC:

    scp -r /path/to/local/directory <NetID>@crcfe01.crc.nd.edu:/path/to/destination
  • From the CRC to your local machine:

    scp -r <NetID>@crcfe01.crc.nd.edu:/path/to/directory /path/to/destination
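
If you transfer files often, it can help to wrap the scp patterns above in a small shell function. The sketch below is a hypothetical helper: the function name crc_scp_cmd and the CRC_NETID variable are made up for illustration and are not part of any CRC tooling. It only prints the command so that you can inspect it before running it yourself.

```shell
#!/bin/bash
# Hypothetical helper: print the scp command for uploading a file or
# directory to the CRC. Nothing is transferred; the command is only echoed.
CRC_NETID="${CRC_NETID:-netid}"        # assumption: replace with your NetID
CRC_HOST="crcfe01.crc.nd.edu"          # front-end machine from this guide

crc_scp_cmd() {
  local src="$1" dest="$2"
  if [ -d "$src" ]; then
    # Directories need the -r (recursive) flag
    echo "scp -r $src ${CRC_NETID}@${CRC_HOST}:$dest"
  else
    echo "scp $src ${CRC_NETID}@${CRC_HOST}:$dest"
  fi
}

crc_scp_cmd file.txt /users/netid/
```

Because the function checks whether the source is a directory, you do not have to remember the -r flag yourself.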

Running jobs on the CRC

The CRC uses a job scheduler to manage resources and job queues. Think of a job as a script that you want to run on the CRC. What are the main steps to run a job on the CRC?

  1. Write a script (.R/.py file) that you want to run.
  2. Make sure the necessary dependencies (packages, conda, quarto, etc.) are installed.
  3. Write a job submission script.
  4. Submit the job to the CRC.
  5. Monitor the job’s progress.

We will walk through each of these steps in turn.

Write a script

You can write your script in any text editor on the CRC. Alternatively (and preferably), you can do all of your code development locally, push to your GitHub repository, and pull the changes on the CRC.

For this demonstration, we will be running a simple script that runs leave-one-out cross-validation using a random forest, applied to the TCGA breast cancer dataset. See either the scripts/parallel_example.R or scripts/parallel_example.py files in the course GitHub repository for the full script.

Install dependencies

Installing dependencies on the CRC works much as it does on your local machine (e.g., using install.packages(...) in R or pip/conda install ... in Python). However, if we have set up a reproducible environment tool like renv in R or conda in Python, the process is much easier.

Since installing these dependencies can be somewhat time-consuming, let’s do this on an interactive node (instead of the login node) to be considerate of others. To launch an interactive job, you can use the qlogin command:

qlogin

Installing dependencies in R (with renv)

  1. Load in the R module by typing the following in your terminal:

    module load R
  2. If you haven’t already installed renv for R on the CRC, you can do so as follows:

    1. Open R by typing the following in your terminal:

      R
    2. Install the renv package by typing the following in R:

      install.packages("renv")
    3. Exit R by typing q().

  3. If you have set up an renv environment, navigate to your project directory and restore the renv environment:

    1. Navigate to your desired project directory, e.g.,

      cd /path/to/parallelization
    2. Open R by typing the following in your terminal:

      R
    3. Restore the renv environment by typing the following in R:

      renv::restore()
    4. Exit R by typing q().

If the renv environment was restored successfully, we are all ready to go!

If you launched an interactive job, you can now stop the job by typing exit in your terminal.

Installing dependencies in Python (with conda)

  1. If you have used conda on the CRC before, skip to step 2. Otherwise, you first need to complete a one-time setup for conda.

    1. In your CRC terminal, run the following commands:

      module load conda
      conda init
      source ~/.bashrc
      module unload conda

      If conda is running properly, you should see (base) in your terminal prompt. You can also verify that conda was successfully installed by typing conda info in your terminal. It should print out information about the conda installation.

    2. Specify where you want to store your conda environments and packages (e.g., in your home directory) by running the following commands in your CRC terminal:

      conda config --add envs_dirs /users/<NetID>/.conda/envs
      conda config --add pkgs_dirs /users/<NetID>/.conda/pkgs

      Note: by default, conda may try to store your environments and packages in a location on the CRC where you do not have write permissions. This will cause errors when you try to create/restore conda environments. The above commands specify that you want to store your conda environments and packages in a location where you have write permissions.

  2. If you have already installed conda-lock on the CRC, skip to step 3. Otherwise, you first need to complete a one-time setup for conda-lock.

    1. First, install conda-lock by typing the following in your terminal:

      pip install conda-lock
    2. To make sure that the conda-lock command can be found when you try to run it, you may need to add the directory where conda-lock is installed to your PATH environment variable. You can do this by adding a line to your .bashrc file:

      1. Open your ~/.bashrc file in a text editor (e.g., vim ~/.bashrc).

      2. Add the following line to the end of the file:

        export PATH=$PATH:~/.local/bin
      3. Save the file and exit the text editor (e.g., in vim, type :wq to save and quit).

      4. Reload your .bashrc file by running the following command in your terminal:

        source ~/.bashrc
      5. Check that the conda-lock command can now be found by typing conda-lock --version in your terminal. You should see the version number of conda-lock printed out.

  3. Load the Python module by typing the following in your terminal:

    module load python
  4. If you have set up a conda lock file for your project, navigate to your project directory and restore the conda environment from that lock file:

    1. Navigate to your desired project directory, e.g.,

      cd /path/to/parallelization
    2. Re-create the conda environment by typing the following in your terminal:

      conda-lock install --name YOUR_ENV_NAME

If the conda environment was restored successfully, you should be able to conda activate YOUR_ENV_NAME without any errors and run your desired Python scripts!

Write a job submission script

Writing the job submission script is probably the “newest” part of this process. This script essentially tells the CRC what resources you need and how to run the job.

For our purposes, we will use the generic job submission scripts provided in the course GitHub repository. More specifically, we will start by using the submit_r_job.sh and submit_python_job.sh scripts in the parallelization/job_scripts directory to run R and Python scripts, respectively.

Typically, these job submission scripts will have the following structure:

  1. Specify the resources you need (e.g., number of cores).
  2. Load the necessary modules.
  3. Run the script you want to run.

Below is a generic job submission script (saved as job_scripts/submit_r_job.sh) for running an R script on the CRC.

#!/bin/bash

#$ -M netid@nd.edu   # Email address for job notification
#$ -m abe            # Send mail when job begins, ends and aborts
#$ -pe smp 24        # Specify parallel environment and legal core size
#$ -q long           # Specify queue
#$ -N job_name       # Specify job name

module load R

cd ../  # run the R script from the project root directory to activate renv
Rscript ${1}.R

Below is a generic job submission script (saved as job_scripts/submit_python_job.sh) for running a Python script on the CRC.

#!/bin/bash

#$ -M netid@nd.edu   # Email address for job notification
#$ -m abe            # Send mail when job begins, ends and aborts
#$ -pe smp 24        # Specify parallel environment and legal core size
#$ -q long           # Specify queue
#$ -N job_name       # Specify job name

module load python
conda activate dsip_parallel

cd ../
python ${1}.py

Notes:

  • Change the -M argument to your email address.
  • Change the -N argument to an informative name for your job.
  • ${1} is the first argument passed to the job submission script and serves as a placeholder for the name of the script you want to run (without its file extension).
    • This is so that you don’t have to write a new .sh file for every script you want to run.
  • Change dsip_parallel to the name of your desired conda environment.
  • -pe smp XX specifies the number of cores requested for your job. Change XX to the number of cores you want to use.
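
To make the role of ${1} concrete, here is a minimal sketch of how the positional argument expands into a command. The function name build_command is hypothetical and is not part of the course scripts. The ${1:?...} form is standard shell syntax that aborts with a message when no script name is given.

```shell
#!/bin/bash
# Sketch: how the first positional argument is turned into a script path.
build_command() {
  # ${1:?msg} exits with an error if the argument is missing or empty
  local script="${1:?usage: build_command <script path without extension>}"
  echo "Rscript ${script}.R"
}

build_command scripts/parallel_example
# prints: Rscript scripts/parallel_example.R
```

Guarding the argument in this way is optional, but it turns a forgotten script name into an immediate, readable error rather than a failed attempt to run a file named ".R".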


Submit the job to the CRC

Submitting a job

To submit your job, you will use the qsub command followed by the name of the job submission script:

qsub <job_submission_script.sh>

You can also add or overwrite the job submission options in the command line. For example,

qsub -N new_job_name -pe smp 2 <job_submission_script.sh>

would overwrite the job name to new_job_name and request 2 cores regardless of what was originally specified in <job_submission_script.sh>.

As an example,

  • To run the parallel_example.R script using the generic R job submission script submit_r_job.sh and 2 (instead of 24) cores, you would navigate to the parallelization/job_scripts directory and type the following in your terminal:

    qsub -N parallel_example -pe smp 2 submit_r_job.sh scripts/parallel_example
  • To run the parallel_example.py script using the generic Python job submission script submit_python_job.sh and 2 (instead of 24) cores, you would navigate to the parallelization/job_scripts directory and type the following in your terminal:

    qsub -N parallel_example -pe smp 2 submit_python_job.sh scripts/parallel_example

Monitor the job’s progress

To monitor the job submission status, you can use the qstat command:

qstat -u $USER

This will show all of your jobs that are currently queued or running.

After the job has finished running, you can check the output file to see the results. By default, the output file will be named <job_name>.o<job_id>, where <job_name> is the name of the job and <job_id> is the job’s ID.

If you want to see more detailed information about the behavior of your job processes while they are running, you can use the Xymon GUI Tool. All you need to do is click on the link to the CRC machine that your job is running on. To figure out which machine your job is running on, use the qstat command and look for the column with an entry like long@d32cepyc204.crc.nd.edu.
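
The machine name is the part of that entry after the @ sign. As a small illustration (the variable names are made up for this sketch), shell parameter expansion can split such an entry into its queue and host parts:

```shell
#!/bin/bash
# Sketch: split a qstat queue-instance string into queue name and host.
queue_instance="long@d32cepyc204.crc.nd.edu"   # example value from this guide

queue_name="${queue_instance%%@*}"   # drop everything from the first @ onward
host_name="${queue_instance#*@}"     # drop everything up to the first @

echo "queue: ${queue_name}"
echo "host:  ${host_name}"
```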

Summary

To summarize, the main steps to run a job on the CRC are:

Preliminary set up

  1. Write a script that you want to run.
    • See scripts/parallel_example.R (or .py) for examples
  2. Install the necessary dependencies.
  3. Write a job submission script.
    • See job_scripts/submit_r_job.sh (or _python_) for examples

Running the job

  1. Submit the job to the queue:

    qsub <job_submission_script.sh>
  2. Monitor the job’s progress:

    qstat -u $USER

    or check on the Xymon GUI Tool.

Most common mistakes

  1. Be sure that you are in the desired directory when you submit your job.
    • If you are not in the correct directory, the job will not be able to find the script you want to run, or the script may not be able to access the necessary data files.
  2. Don’t forget to save the job’s outputs/results to a file. If you don’t, these results will disappear once the job is finished running.
  3. Make sure you are requesting the correct number of cores.
    • If you request more cores than what your script uses, you are wasting resources.
    • If you request fewer cores than what your script uses, your job might run into weird resource errors.
  4. Double check that all of your dependencies are installed.
  5. Make sure you are submitting the job from a login node.

Before submitting a large job, it is a good idea to test your job submission script with a small test job to make sure everything is working as expected (e.g., use 2 cores instead of 24 cores).
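
Some of these mistakes can be caught before calling qsub. The following is a minimal sketch of a pre-submission check. It assumes the layout used in this guide: you submit from the job_scripts directory and give the script path relative to the project root, without its extension. The function name check_submission is hypothetical.

```shell
#!/bin/bash
# Sketch: verify that the script a job would run actually exists,
# mirroring the `cd ../` step in the generic submission scripts.
check_submission() {
  local script="$1" ext="$2"           # e.g. scripts/parallel_example and R
  local target="../${script}.${ext}"   # path as seen from job_scripts/
  if [ -f "$target" ]; then
    echo "OK: found $target"
  else
    echo "ERROR: $target not found (are you in job_scripts/?)" >&2
    return 1
  fi
}

# Usage (from the job_scripts directory), before running qsub:
#   check_submission scripts/parallel_example R && \
#     qsub -N parallel_example -pe smp 2 submit_r_job.sh scripts/parallel_example
```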

Other helpful commands and resources

  • Managing modules:

    • To load a module, you can use the module load command followed by the module name.
    • To unload a module, you can use the module unload command followed by the module name.
    • To see a list of all available modules, you can use the module avail command.
    • To see which modules you have loaded, you can use the module list command.
    • More information about modules on the CRC can be found in the CRC documentation.
  • If you need to delete a job from the queue, you can use the qdel command followed by the job ID:

    qdel <job_id>

    The job ID is the first column of the qstat output.

  • To check your current disk usage, you can use the quota command:

    quota

    You are initially allotted 100GB of storage space on the CRC. With appropriate justification, you can ask for more storage and/or request scratch space.

  • Helpful bash commands:

    • ls: list files in the current directory
    • ls path/to/directory: list files in a specific directory
    • ls -al: list all files in the current directory (including hidden files)
    • ls -al path/to/directory: list all files in a specific directory (including hidden files)
    • cd path/to/directory: change directory
    • pwd: print current working directory
    • rm path/to/file: remove a file
    • rm -r path/to/directory: remove a directory and all of its contents
    • rm -rf path/to/directory: force remove a directory and all of its contents without prompting
    • mv path/to/file path/to/new_location: move a file
    • cp path/to/file path/to/new_location: copy a file
    • man command: get help on a specific command
  • vim is a popular text editor that can be used directly on the command line.

    • To open a file in vim, type vim path/to/file.
    • To edit a file in vim, type i to enter insert mode.
    • To exit insert mode, press esc.
    • To save and exit, type :wq (w = write, q = quit)
    • To exit without saving, type :q! (q = quit, ! = force)

Interactive jobs

We have already seen that you can spawn an interactive job using the qlogin command, which gives you access to an interactive compute node. You can also request a specific number of cores with the -pe smp <num> flag (e.g., to request 2 cores, you would type qlogin -pe smp 2). Another way to request an interactive job is to use the qrsh command, e.g.,

qrsh -q long -pe smp 1

However, rather than working in the terminal, it can often be more convenient to work in an interactive R or Python session using VS Code, Positron, RStudio, or JupyterLab. We will walk through how to use each of these tools on the CRC next.

Open OnDemand is a web-based portal that provides access to the CRC’s high performance computing resources and a user-friendly interface for managing jobs, files, and interactive applications.

To access Open OnDemand, go to https://ondemand.crc.nd.edu/ and log in with your ND NetID and password. Using Open OnDemand, you can launch interactive applications using VS Code, RStudio, JupyterLab, and more. You can also manage your files and submit batch jobs through the Open OnDemand interface.

More information about Open OnDemand can be found in the CRC documentation: https://docs.crc.nd.edu/resources/ood.html.

To connect to a remote server (e.g., CRC) using VS Code or Positron:

  1. Open VS Code or Positron.

  2. On the left, go to “Extensions” (the tab with the square boxes), and install the Remote - SSH extension. This might be installed by default in Positron.

  3. If you are connecting to the CRC for the first time using VS Code or Positron, you will need to add a new SSH host. This step only needs to be done once. To add a new SSH host:

    1. Click on the >< icon on the bottom left of the screen.

    2. Click on the Connect to Host... option.

    3. Click on the Add New SSH Host... (or Add host to SSH config file...) option.

    4. If you are using VS Code, type in ssh <NetID>@crcfe01.crc.nd.edu or ssh <NetID>@crcfe02.crc.nd.edu in the Host field and hit enter. If you are using Positron, Positron will open up a file where you can add the following lines to add the CRC (1 and/or 2) as a new SSH host:

      Host crcfe01.crc.nd.edu
          HostName crcfe01.crc.nd.edu
          User <NetID>
      
      Host crcfe02.crc.nd.edu
          HostName crcfe02.crc.nd.edu
          User <NetID>

      Make sure to replace <NetID> with your actual ND NetID and save the file (Cmd+S on Mac or Ctrl+S on Windows/Linux).

    5. In VS Code, you may now be prompted to select the configuration file to use. If so, select the first option (e.g., /Users/<NetID>/.ssh/config) that appears in the dropdown menu. This file will be used to store the SSH configuration.

    We have just added the CRC as a new SSH host.

  4. To connect to the CRC, we can now:

    1. Click on the >< icon on the bottom left of the screen.
    2. Click on the Connect to Host... option.
    3. Click on the crcfe01.crc.nd.edu or crcfe02.crc.nd.edu option.
    4. Enter your ND NetID password when prompted.

Everything you do in VS Code or Positron will now be done on the CRC!

Best practice is to start an interactive job in your terminal before running your Python code interactively in VS Code or Positron. This way, you can ensure that you are not running heavy computations on the login node.

To start an interactive job, open a new terminal in VS Code or Positron and type qrsh -q long -pe smp 1 in that terminal. Everything you do in that particular terminal will now be run in that interactive job.

For RStudio, see the CRC documentation: https://docs.crc.nd.edu/general_pages/r/rstudio.html#rstudio

To set up JupyterLab on the CRC, you will need to complete the following steps (adapted from the CRC documentation):

  1. If you have used conda on the CRC before, skip to step 2. Otherwise, you first need to complete a one-time setup for conda. In your CRC terminal, run the following commands:

    module load conda
    conda init
    source ~/.bashrc
    module unload conda

    To verify that conda was successfully installed, type conda info in your terminal. It should print out information about the conda installation.

  2. If you already have an existing conda environment with jupyterlab and ipykernel installed, you can skip to step 3. Otherwise, follow these steps to create a new conda environment that is compatible with JupyterLab:

    module load conda
    conda create --name YOUR_ENV_NAME
    conda activate YOUR_ENV_NAME
    conda install jupyterlab
    conda install ipykernel
  3. Make your conda environment available in JupyterLab by running the following command. You only need to do this one time for each conda environment you want to use in JupyterLab:

    python -m ipykernel install --user --name=YOUR_ENV_NAME --display-name="YOUR_ENV_NAME"
  4. (Optional) If you would like to run R in JupyterLab, you can set up an R kernel by following these steps. You only need to do this once:

    1. Load the R module by typing the following in your terminal: module load R
    2. Open R by typing the following in your terminal: R
    3. Install the IRkernel package by typing the following in R: install.packages("IRkernel")
    4. Install the R kernel by typing the following in R: IRkernel::installspec()
    5. Exit R by typing q()
  5. Open a new terminal and ssh into the CRC with the -Y flag:

    ssh -Y <NetID>@crcfe01.crc.nd.edu

    The -Y flag in ssh enables trusted X11 forwarding. It works similarly to -X but bypasses some security restrictions.

  6. Access a compute node by running the following command:

    qrsh -q long -pe smp 1
  7. Inside the compute node, launch a Jupyter notebook:

    jupyter lab --no-browser --ip='0.0.0.0'

    You will see a lot of text, part of which looks like:

    To access the server, open this file in a browser:
        file:///afs/crc.nd.edu/user/t/ttang4/.local/share/jupyter/runtime/jpserver-2636692-open.html
    Or copy and paste one of these URLs:
        http://d32cepyc193.crc.nd.edu:8888/lab?token=XXXXX
        http://127.0.0.1:8888/lab?token=XXXXX

    Note the server name and its port number (e.g., d32cepyc193.crc.nd.edu:8888). Also note the token XXXXX (i.e., everything that comes after token=).

  8. Access the Jupyter notebook using SSH tunneling. On your local machine and in a separate terminal window, run the following command:

    ssh <NetID>@crcfe01.crc.nd.edu -L 8888:d32cepyc193.crc.nd.edu:8888 -N

    where 8888:d32cepyc193.crc.nd.edu:8888 is replaced by the <port>:<server>:<port> identifiers from step 7. Then enter your password. If there are no errors, the command line will hang. This is normal.

  9. Open a web browser on your local machine and navigate to http://localhost:8888. You should see the JupyterLab interface. Enter the token from step 7 when prompted.

JupyterLab should now be open and ready to use!
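
The server, port, and token needed for the tunnel can be pulled out of the Jupyter URL mechanically. Below is a minimal sketch using shell parameter expansion. The URL is the example shown in the Jupyter output above, and the variable names and the netid placeholder are illustrative. The script only prints the ssh command rather than running it.

```shell
#!/bin/bash
# Sketch: derive the SSH tunnel command from the URL printed by `jupyter lab`.
url="http://d32cepyc193.crc.nd.edu:8888/lab?token=XXXXX"   # example URL from this guide
netid="netid"                                              # assumption: your NetID

token="${url##*token=}"      # everything after "token="
hostport="${url#http://}"    # strip the scheme
hostport="${hostport%%/*}"   # keep only "<server>:<port>"
server="${hostport%%:*}"
port="${hostport##*:}"

echo "ssh ${netid}@crcfe01.crc.nd.edu -L ${port}:${server}:${port} -N"
echo "then open http://localhost:${port} and enter token ${token}"
```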

Job Arrays

So far, we have learned how to parallelize tasks (e.g., for loops) within your R/Python script using the future package in R or joblib in Python. In addition to parallelizing tasks within a script, you can also parallelize tasks by submitting multiple jobs to the CRC using job arrays. At a high level, a job array is a collection of jobs that are submitted to the CRC as a single batch.

As an example, we previously submitted one job to run leave-one-out cross-validation for the random forest model. Now suppose that we also wanted to run a second job to run leave-one-out cross-validation for a different model (e.g., k nearest neighbors). While we could submit two separate jobs to the CRC, it is also possible to submit a single job array with two jobs: one job running leave-one-out CV for the random forest model and the other job running leave-one-out CV for the k nearest neighbors model.

To demonstrate how to submit a simple job array, let’s implement the aforementioned example, where we want to submit a job array with two jobs:

  • Job 1: Leave-one-out cross-validation for the random forest model
  • Job 2: Leave-one-out cross-validation for the k nearest neighbors model

and each job will use C > 1 cores to parallelize the for loop computation.

  • The main R/Python scripts that we will be using are scripts/parallel_example_with_args.R (or .py).
    • Note that these scripts are slightly modified versions of the original parallel_example.R (or .py) scripts. The main difference is that they now accept a command line argument, --array_id (an integer) or --model (a character string), indicating whether to use the "rf" model (array_id=1) or the "knn" model (array_id=2).
  • The main job submission scripts that we will be using are job_scripts/submit_r_job_array.sh (or submit_python_job_array.sh).
    • The only addition to these scripts is the -t flag, which specifies the range of the job array. Setting -t 1-2 will submit two jobs to the CRC — one job with ${SGE_TASK_ID}=1 and a second job with ${SGE_TASK_ID}=2. (If we wanted to submit 10 jobs (indexed from 1 to 10), we would set -t 1-10.)
    • Note: the job array ID is stored in the environment variable SGE_TASK_ID.
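
To illustrate how SGE_TASK_ID can drive the choice of model, here is a minimal sketch of the mapping described above (1 for "rf", 2 for "knn"). It is not the contents of the course's job array scripts. The function name model_for_task is hypothetical, and SGE_TASK_ID falls back to 1 so that the snippet also runs outside the scheduler.

```shell
#!/bin/bash
# Sketch: map the job array index to a model name.
model_for_task() {
  case "$1" in
    1) echo "rf" ;;
    2) echo "knn" ;;
    *) echo "unknown task id: $1" >&2; return 1 ;;
  esac
}

# Inside an array job the scheduler sets SGE_TASK_ID; default to 1 otherwise.
task_id="${SGE_TASK_ID:-1}"
model="$(model_for_task "$task_id")" || exit 1
echo "task ${task_id} -> model ${model}"
```

Keeping the mapping in one place makes it easy to extend the array: adding a third model only requires a new case branch and a wider -t range.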

To finally submit the job array, we can run the following command in the terminal:

# for R users:
qsub -N parallel_array_example_r -pe smp 2 submit_r_job_array.sh scripts/parallel_example_with_args
# for Python users:
qsub -N parallel_array_example_py -pe smp 2 submit_python_job_array.sh scripts/parallel_example_with_args

or

# for R users:
qsub -N parallel_arrayname_example_r -pe smp 2 submit_r_job_arrayname.sh scripts/parallel_example_with_args
# for Python users:
qsub -N parallel_arrayname_example_py -pe smp 2 submit_python_job_arrayname.sh scripts/parallel_example_with_args

Note: the -pe smp 2 flag is not necessary and is only added to overwrite the original job submission script’s request for 24 cores since this is purely for demonstration.

If your tasks follow a regular stride rather than a contiguous integer range, you can add a step size to the -t flag using the form -t <first>-<last>:<step>. For example, -t 1-9:2 would run the jobs with SGE_TASK_ID=1, SGE_TASK_ID=3, SGE_TASK_ID=5, SGE_TASK_ID=7, and SGE_TASK_ID=9.

Why use job arrays?

  • The main advantage of using job arrays is that they allow you to submit multiple jobs at once, essentially parallelizing across possibly many different machines.
    • Due to how the job scheduler works, your jobs will receive higher priority if you submit them as a single job array rather than as multiple separate jobs.
  • From the CRC docs: “If you find that you need to frequently submit 50 or more different jobs, we request that you implement those tasks within a job array. Grid engine is able to handle arrays much more efficiently than tens or hundreds of individual scripts from a single user. Fewer individual tasks reduces load on the job scheduler and improves overall performance.”

Combining job arrays with parallelization within a script can be a very powerful way to speed up computations. This essentially gives you the ability to parallelize your code in two different “axes” or to do some type of nested parallelization strategy without having to modify your code too much.

CRC documentation on job arrays: https://docs.crc.nd.edu/new_user/quick_start.html#job-arrays

Job Dependencies

Sometimes, you may have a set of jobs that depend on each other. For example, you may have a job array that fits various models, and you want to run a final job that aggregates the results from all of the models. In this case, you would want to make sure that the final job only runs after all of the model-fitting jobs have completed.

To specify these job dependencies, use the -hold_jid flag, followed by the ID of the job to wait for, when submitting a job to the CRC. The driver_r.sh and driver_python.sh scripts show how to do this. To run a driver script, type sh driver_r.sh or sh driver_python.sh in your terminal.
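
The driver scripts themselves are not reproduced here, so the following is only a sketch of the general pattern, not their actual contents. It assumes qsub's -terse option, which prints just the job ID, and uses a hypothetical aggregation script name. For array jobs, the printed ID carries a task-range suffix such as .1-2:1, which is stripped before being passed to -hold_jid. The final command is echoed rather than executed.

```shell
#!/bin/bash
# Sketch: submit an aggregation job that waits for a job array to finish.
base_job_id() {
  # "12345.1-2:1" (array job) -> "12345"; "12345" stays unchanged
  echo "${1%%.*}"
}

# On the CRC, the array's ID could be captured with something like:
#   array_id=$(qsub -terse submit_r_job_array.sh scripts/parallel_example_with_args)
array_id="12345.1-2:1"                 # made-up ID for illustration
hold_id="$(base_job_id "$array_id")"

# aggregate_results is a hypothetical script name, not part of the course repo
echo "qsub -hold_jid ${hold_id} submit_r_job.sh scripts/aggregate_results"
```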

FAQs

If pandoc is not found but is needed (e.g., for rendering R Markdown files), you can try following these steps (note: this has not been tested in a while):

  1. Launch RStudio and find the pandoc location it uses via: Sys.getenv("RSTUDIO_PANDOC")

  2. Set the RSTUDIO_PANDOC environment variable to that location by adding the following line to your ~/.bashrc file:

    export RSTUDIO_PANDOC=/path/to/pandoc

    where /path/to/pandoc is whatever was returned by Sys.getenv("RSTUDIO_PANDOC").

  3. Source your ~/.bashrc file:

    source ~/.bashrc
  4. Open R (not RStudio) and check that R Markdown can find pandoc via:

    rmarkdown::find_pandoc()

    If it returns a non-empty string, you should be good to go!

This guide is adapted from the Quarto documentation: https://quarto.org/docs/download/tarball.html

In order to install quarto on the CRC, you can follow these steps:

  1. Change directories to where you want to install quarto, e.g.,

    mkdir -p ~/Software
    cd ~/Software
  2. Download the Quarto tarball (v1.6.42 in this example; check the Quarto website for the latest release) by typing the following in your CRC terminal:

    wget https://github.com/quarto-dev/quarto-cli/releases/download/v1.6.42/quarto-1.6.42-linux-amd64.tar.gz
  3. Unpack the tarball by typing the following in your CRC terminal:

    tar -xvzf quarto-1.6.42-linux-amd64.tar.gz

    You can now delete the tarball by typing the following in your CRC terminal:

    rm quarto-1.6.42-linux-amd64.tar.gz
  4. Create a symbolic link (symlink) to the quarto executable by typing the following in your CRC terminal:

    mkdir -p ~/.local/bin
    ln -s ~/Software/quarto-1.6.42/bin/quarto ~/.local/bin/quarto

    This will create a symlink to the quarto executable in your ~/.local/bin directory, which points to the actual quarto executable in the quarto-1.6.42 directory.

  5. Check whether or not the quarto installation is findable by typing the following in your CRC terminal:

    quarto --version

    If the installation was successful, you should see the version of quarto that you installed. Otherwise, you will see an error message, most likely saying that the command quarto was not found. If this is the case, we need to add the ~/.local/bin directory to your PATH environment variable. To do this:

    1. Open your ~/.bashrc file, e.g., by typing the following in your CRC terminal:

      vim ~/.bashrc
    2. Add the following line to the end of the file:

      export PATH=$PATH:~/.local/bin

      [If you are using vim, press i to enter insert mode, scroll down to the bottom of the file, type the line, and then press esc followed by :wq to save and exit.]

    3. Source your ~/.bashrc file by typing the following in your CRC terminal:

      source ~/.bashrc
    4. Check whether or not the quarto installation is findable by typing the following in your CRC terminal:

      quarto --version

      If the installation was successful, you should see the version of quarto that you installed.

You should now be able to render Quarto documents on the CRC!
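
One caveat with the export PATH line added to ~/.bashrc above is that re-sourcing the file appends the same directory again each time. A minimal sketch of an idempotent alternative, using only standard shell syntax (the function name add_to_path is hypothetical):

```shell
#!/bin/bash
# Sketch: append a directory to PATH only if it is not already present.
add_to_path() {
  case ":${PATH}:" in
    *":$1:"*) ;;                       # already on PATH: do nothing
    *) PATH="${PATH}:$1" ;;
  esac
  export PATH
}

add_to_path "$HOME/.local/bin"
add_to_path "$HOME/.local/bin"         # second call is a no-op
```

Placing the function and a single add_to_path call in ~/.bashrc would have the same effect as the export line, without growing PATH on every reload.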