We use a number of powerful computers within CEOAS and OSU for modeling, processing, analysis and teaching. Sometimes we use national supercomputing facilities like Derecho. This page explains how to use them.
These computers are a shared resource. Do not use more computational resources (RAM, CPUs) than you need or than are available. For example, if you are using the CEOAS JupyterHub, be careful loading large datasets, as this can quickly eat into the RAM available to other users. It is good practice to load data in chunks, which requires learning to write efficient code.
HPC clusters, like CQLS or Derecho, have a login node and various types of compute nodes. The login node, as the name suggests, is the place you (and everyone else) log on to (usually via ssh). Do not run computationally intensive tasks, or anything that requires significant memory, on the login node. Instead, request time on dedicated compute nodes using job scheduling software (e.g. SLURM). If you're just testing things out and don't know how much time or resources you might need, you can start an interactive job for this purpose.
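As a sketch of what requesting compute time looks like (the job name, resource values and script name are placeholders; check your cluster's documentation for the accepted options), a minimal SLURM batch script might read:

```shell
#!/bin/bash
#SBATCH --job-name=test_job    # name shown in the queue (placeholder)
#SBATCH --time=01:00:00        # wall-clock limit (placeholder)
#SBATCH --cpus-per-task=2      # CPUs requested (placeholder)
#SBATCH --mem=8G               # memory requested (placeholder)

# Commands below run on the compute node, not the login node
python my_analysis.py          # hypothetical analysis script
```

Submit it with `sbatch job.sh`; for an interactive session instead, something like `srun --time=01:00:00 --cpus-per-task=2 --mem=8G --pty bash` drops you into a shell on a compute node.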
On some HPC clusters, the compute nodes do not have internet access, so interacting with GitHub and downloading files has to be done via the login nodes. This is usually fine because these tasks are not typically resource intensive.
Data in your home directory is often backed up, so put code, analysis notebooks (including git repositories) and configuration files here. Unfortunately, your home directory may be space limited and unable to accommodate large datasets. It is usually necessary to create symbolic links from your home directory to large datasets, which will likely reside in a special workspace (might be backed up) or on scratch storage (not backed up). Ideally, everything needed to reproduce your work should be backed up and everything not backed up should be easily reproducible.
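For example (paths here are illustrative, not real cluster paths), linking a large dataset on scratch storage into your home directory looks like:

```shell
# Suppose large model output lives on scratch storage (placeholder path)
mkdir -p /tmp/scratch_demo/my_big_dataset

# Link it into the home directory so code can refer to a stable path
# without the data counting against the home-directory quota
ln -s /tmp/scratch_demo/my_big_dataset "$HOME/my_big_dataset"

# The link behaves like a directory; readlink shows where it points
readlink "$HOME/my_big_dataset"   # prints /tmp/scratch_demo/my_big_dataset
```

Your code and notebooks can then use `~/my_big_dataset` while the data itself stays on scratch.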
Logging into the HPC is easiest with ssh keys. Each cluster usually has its own set of instructions for setting up ssh key access and it is worth following them closely. If there are no specific instructions, these general ones tend to work.
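In the absence of cluster-specific instructions, the general recipe usually amounts to the following (hostname shown is the CQLS one from later in this page; adjust as needed):

```shell
# Generate an ed25519 key pair (stored under ~/.ssh by default;
# set a passphrase if desired)
ssh-keygen -t ed25519

# Install the public key on the cluster
# (you will enter your password one last time)
ssh-copy-id <YOUR_ONID_USERNAME>@hpc.cqls.oregonstate.edu

# Optional: add a shorthand entry to ~/.ssh/config so `ssh cqls` works:
#   Host cqls
#       HostName hpc.cqls.oregonstate.edu
#       User <YOUR_ONID_USERNAME>
```

After this, `ssh` should connect without prompting for a password.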
You must be on the OSU network to access these computers. This means that you must either physically be at OSU connected to the campus wifi or ethernet, or you may be remote and connected to the university VPN.
Go here to configure the OSU VPN on your personal computer.
Apply for an HPC account with CQLS if you don't have one already.
Access the hub at https://jupyter-hpc.ceoas.oregonstate.edu using your ONID credentials.
Select a number of CPUs and RAM appropriate for your needs. Note that you cannot ssh into this server.
This server has 52 CPUs and about 500GB of RAM (as of August 2025, although an upgrade is underway). It can be used for data analysis and visualization, but should not be used for running intensive numerical simulations.
Request an account from Thomas Olson. Log in using your ONID credentials. You can also log in using the terminal on your computer via ssh, e.g.
ssh <YOUR_ONID_USERNAME>@jupyter-research.ceoas.oregonstate.edu
Confirm that our group storage folder ocg/ is available in your home directory. If it is not, then email Thomas and Jesse to get this configured.
Default conda environments for Python 3, Julia and R come preinstalled on the Hub. These environments are managed by Thomas and do not contain all the packages that we need for our work. Rather than constantly bothering Thomas to install and update the environments, we can manage our own conda environments after a little setup.
If you want a prettier terminal you can install Oh My Bash. It is recommended to configure git and link an ssh-key to your GitHub account.
Log into JupyterHub, open a terminal, and follow the instructions to download miniforge and start the installation. During the installation you will be asked some questions, to which you can respond yes, e.g.
Do you accept the license terms? [yes|no]
>>> yes
Miniforge3 will now be installed into this location:
/home/<YOUR_HOME_FOLDER>/miniforge3
- Press ENTER to confirm the location
- Press CTRL-C to abort the installation
- Or specify a different location below
>>> <HIT_ENTER>
# Lots of text...
Do you wish to update your shell profile to automatically initialize conda?
# More text...
>>> yes
For the changes to take effect you need to run source ~/.bashrc or restart your shell.
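Concretely, the download-and-run step usually looks like this (the URL follows miniforge's documented naming pattern for the latest release; check the miniforge README for the current instructions):

```shell
# Download the installer matching this machine's OS and architecture
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"

# Run the installer and answer the prompts as shown above
bash "Miniforge3-$(uname)-$(uname -m).sh"
```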
Installing miniforge gives you immediate access to the base environment. You can now install useful tools into your base environment, such as rclone (e.g. mamba install rclone), and create new environments.
Add the bash functions to install and remove conda environments to your .bashrc file. This can be done with the command line text editor nano by typing nano .bashrc in the terminal (see the cheatsheet for help; pasting should work). If you are brave/familiar you can use vim instead.
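The actual functions are in the group's instructions; as a purely hypothetical sketch of what a `create_env`-style helper might do (the function body, name derivation and kernel registration here are assumptions, not the group's real code):

```shell
# Hypothetical sketch: create a conda environment from a YAML file and
# register its IPython kernel so it shows up in the JupyterHub launcher.
create_env () {
    local yml="$1"
    local name
    name="$(basename "$yml" .yml)"   # e.g. multitool.yml -> multitool
    mamba env create -f "$yml" \
        && mamba run -n "$name" python -m ipykernel install \
               --user --name "$name" --display-name "$name"
}
```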
For the changes to take effect you need to run source ~/.bashrc or restart your shell.
At this point we have installed miniforge and some convenience functions. Now we can create a new conda environment using our convenience function. All of our group's conda environment files are stored in conda_environments, with a copy also residing on the group storage. The first environment to install is our 'everything but the kitchen sink' environment, multitool.
cd ~/ocg/conda_environments
create_env multitool.yml
This might take a few seconds. After it has finished, try refreshing your JupyterHub page and accessing the launcher. You should see a new icon for the multitool environment.
The next environment to install is ARRR, our very own R environment managed using conda. In this case we do not need to install the IPython kernel and will take a slightly different approach.
mamba env create -f arrr.yml
mamba activate arrr
R
The last command, R, starts an R console. In the console, type:
IRkernel::installspec(name = 'arrr', displayname = 'ARRR')
q()
The last line quits the R console (enter n when prompted to save the workspace image). Now, after refreshing JupyterHub, you should have access to a new R environment called ARRR.
We use a low-powered virtual machine (VM) running Rocky Linux to perform various administrative tasks, including scheduled cloud backups using kopia and running this wiki. Configuration of the VM is documented on GitHub.
The CQLS manages a high performance computing (HPC) cluster with numerous machines accessible via the SLURM job submission system. See their documentation for more information (beware that this cluster is evolving rapidly and the documentation may be outdated).
Request an account here.
Access via ssh, e.g.
ssh <YOUR_ONID_USERNAME>@hpc.cqls.oregonstate.edu
Set up ssh keys to avoid password entry.