Pull Request to Update Farmshare Pathway to create Jupyter Notebooks (#39)

* Adding better support for farmshare
* Updated setup.sh to delete the params.sh file every time setup is run
* Added the .singularity option back to the singularity-jupyter and singularity-jupyterlab sbatch scripts
* Updated the default notebook and its environment variable check
* Updated setup.sh for different machine prefixes and added a new prompt for machine prefixes
* Updated setup.sh to address lsof and ssh tunneling
* Updated resume.sh to work in the general case
* Updated to generalise all scripts except for ssh tunneling
* Kept resume.sh hardcoded due to tunneling problems
* Added a boolean variable for the ssh tunneling
* Updated resume.sh
* Updated README.md to include the new SSH Port Forwarding Considerations subsection in the Setup section
* Formatting changes for the README port forwarding subsection
* Deleted untested original scripts from the sbatches in farmshare
* Updated setup.sh to remove the extraneous SHERLOCK variable
* Modification to use the container maintained by Soham

Co-authored-by: Vanessasaurus <[email protected]>
sohams-MASS and vsoch authored Jul 6, 2021
1 parent c7e728c commit eb91534
Showing 11 changed files with 149 additions and 66 deletions.
29 changes: 27 additions & 2 deletions README.md
@@ -1,7 +1,6 @@
# forward

## What is this?

Forward sets up an sbatch script on your cluster resource and port forwards it back to your local machine!
Useful for jupyter notebook and tensorboard, amongst other things.

@@ -23,7 +22,9 @@ are many possible use cases!

## Setup
For interested users, a few tutorials are provided on the [Research Computing Lessons](https://vsoch.github.io/lessons) site.
Brief instructions are also documented in this README.

For FarmShare, please see the README located in sbatches/farmshare/README.md.

### Clone the Repository
Clone this repository to your local machine.
@@ -78,6 +79,30 @@ One downside is that you will be foregoing sherlock's load
balancing since you need to be connecting to the same login machine at each
step.

### SSH Port Forwarding Considerations

Depending on your cluster, you will need to identify whether the compute nodes (not the login nodes) are isolated from the outside world (i.e., whether they can be SSH'd into directly). For Sherlock they are isolated; for FarmShare they are not. This is important when we set up the ssh command that port forwards from the local machine to the compute node.

For HPCs where the compute node is isolated from the outside world (as is the case with Sherlock), the ssh command establishes a tunnel to the login node, and then from the login node establishes another tunnel to the compute node.
In this case we write a command that port forwards to the login node, and then on to the compute node, which is only accessible from the login node. The entire command might look like this:

```bash
$ ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N "$MACHINE" &
```

In the command above, the first half, `ssh -L $PORT:localhost:$PORT ${RESOURCE}`, is executed on the local machine and establishes port forwarding to the login node. The second half, `ssh -L $PORT:localhost:$PORT -N "$MACHINE" &`, is run from the login node and forwards the port on to the compute node, since the compute node can only be reached from the login nodes.
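To make the two hops concrete, here is a hypothetical expansion with example values (PORT=56143, RESOURCE=sherlock, MACHINE=sh-02-21; your values will differ):

```bash
# Run on the local machine: local port 56143 is forwarded to the login node,
# and the login node in turn forwards its own port 56143 to the compute node sh-02-21.
ssh -L 56143:localhost:56143 sherlock ssh -L 56143:localhost:56143 -N sh-02-21 &
```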


For HPCs where the compute node is not isolated from the outside world (as is the case with FarmShare), the ssh command first establishes a connection to the login node, and then passes the login credentials on to the compute node to establish a tunnel between localhost and the port on the compute node.
The ssh command in this case uses the `-K` flag, which forwards the login credentials to the compute node:
```bash
$ ssh "$DOMAINNAME" -l $FORWARD_USERNAME -K -L $PORT:$MACHINE:$PORT -N &
```
The drawback of this method is that when the start.sh script is run, you will have to authenticate twice: once at the beginning, to check whether a job is already running on the HPC, and once when the port forwarding is set up. This is the case for FarmShare.
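For comparison, a hypothetical expansion of this FarmShare-style command (DOMAINNAME=rice.stanford.edu and the wheat machine prefix match the defaults in setup.sh; the username jdoe, node wheat01, and port 34567 are illustrative):

```bash
# Single ssh invocation: connect to the login node as jdoe, delegate credentials
# with -K, and forward local port 34567 directly to port 34567 on wheat01.
ssh rice.stanford.edu -l jdoe -K -L 34567:wheat01:34567 -N &
```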

In the setup.sh file, we have added an option, `ISOLATEDCOMPUTENODE`, which is a boolean variable. For users of FarmShare and Sherlock, this value is set automatically. For your own default cluster, you will be prompted as to whether the compute node is isolated; please answer true or false (case sensitive) depending on your resource's properties. You may have to consult the documentation or ask the HPC manager.
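After setup.sh completes, the generated params.sh holds these choices as plain variable assignments. A sketch of what it might contain for FarmShare (values are illustrative; the variable list matches the loop at the end of setup.sh):

```bash
FORWARD_USERNAME="jdoe"
PORT="34567"
PARTITION="normal"
RESOURCE="farmshare"
MEM="20G"
TIME="8:00:00"
CONTAINERSHARE="library://sohams/default/farmsharejupyter:latest"
MACHINEPREFIX="wheat"
DOMAINNAME="rice.stanford.edu"
USE_LSOF="lsof"
ISOLATEDCOMPUTENODE="false"
```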


# Notebooks

Notebooks have associated sbatch scripts that are intended to start a jupyter (or similar)
2 changes: 1 addition & 1 deletion end.sh
@@ -23,4 +23,4 @@ echo "Killing $NAME slurm job on ${RESOURCE}"
ssh ${RESOURCE} "squeue --name=$NAME --user=$FORWARD_USERNAME -o '%A' -h | xargs --no-run-if-empty /usr/bin/scancel"

echo "Killing listeners on ${RESOURCE}"
ssh ${RESOURCE} "/usr/sbin/lsof -i :$PORT -t | xargs --no-run-if-empty kill"
ssh ${RESOURCE} "${USE_LSOF} -i :$PORT -t | xargs --no-run-if-empty kill"
15 changes: 10 additions & 5 deletions helpers.sh
@@ -101,8 +101,8 @@ function get_machine() {
echo $MACHINE

# If we didn't get a node...
if [[ "$MACHINE" != "sh"* ]]
then
if [[ "$MACHINE" != "$MACHINEPREFIX"* ]]
then
echo "Tried ${ATTEMPTS} attempts!" 1>&2
exit 1
fi
@@ -137,7 +137,12 @@ function setup_port_forwarding() {
echo
echo "== Setting up port forwarding =="
sleep 5
echo "ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N $MACHINE &"
ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N "$MACHINE" &

if $ISOLATEDCOMPUTENODE
then
echo "ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N $MACHINE &"
ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N "$MACHINE" &
else
echo "ssh $DOMAINNAME -l $FORWARD_USERNAME -K -L $PORT:$MACHINE:$PORT -N &"
ssh "$DOMAINNAME" -l $FORWARD_USERNAME -K -L $PORT:$MACHINE:$PORT -N &
fi
}
10 changes: 9 additions & 1 deletion resume.sh
@@ -16,4 +16,12 @@ NAME="${1}"

echo "ssh ${RESOURCE} squeue --name=$NAME --user=$FORWARD_USERNAME -o "%N" -h"
MACHINE=`ssh ${RESOURCE} squeue --name=$NAME --user=$FORWARD_USERNAME -o "%N" -h`
ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N $MACHINE &

if $ISOLATEDCOMPUTENODE
then
echo "ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N $MACHINE &"
ssh -L $PORT:localhost:$PORT ${RESOURCE} ssh -L $PORT:localhost:$PORT -N $MACHINE &
else
echo "ssh $DOMAINNAME -l $FORWARD_USERNAME -K -L $PORT:$MACHINE:$PORT -N &"
ssh "$DOMAINNAME" -l $FORWARD_USERNAME -K -L $PORT:$MACHINE:$PORT -N &
fi
8 changes: 7 additions & 1 deletion sbatches/farmshare/README.md
@@ -1,3 +1,9 @@
# Farmshare

Hi friend! These haven't been tested fully yet. Do you want to help? Please work with @vsoch!
1. Go to your home directory on rice. Type `module load singularity`.
2. On rice, while still in your home directory, type `singularity exec library://sohams/default/farmsharejupyter:latest jupyter notebook --generate-config`.
3. Next, type `singularity exec library://sohams/default/farmsharejupyter:latest jupyter notebook password`. Choose a password and verify it. This will serve as the login password for the notebooks.
4. Follow the original tutorial to set up ssh, and fill out the params.sh file by running `bash setup.sh`. Choose a port higher than 32768 for the tunnel to work.
5. To start, type `bash start_farmshare.sh singularity-jupyter` for the classic notebook or `bash start_farmshare.sh singularity-jupyterlab` for Jupyter Lab.
6. While the tunnel to the compute node is being established, you will be prompted for your password and Duo two-factor authentication.
7. The notebook address is printed at the end of the output; open http://localhost:(your chosen port number) in a browser. The default notebook location is your scratch directory, /farmshare/scratch/users/yourusername. A consolidated sketch of steps 1-5 follows below.
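Taken together, the FarmShare setup amounts to roughly the following sequence (a sketch of steps 1-5 above; the port number is illustrative, choose your own above 32768):

```bash
# On rice, in your home directory:
module load singularity
singularity exec library://sohams/default/farmsharejupyter:latest jupyter notebook --generate-config
singularity exec library://sohams/default/farmsharejupyter:latest jupyter notebook password

# On your local machine, from the forward repository:
bash setup.sh                                  # choose farmshare and a port above 32768, e.g. 34567
bash start_farmshare.sh singularity-jupyter    # or singularity-jupyterlab for Jupyter Lab
# Then open http://localhost:34567 in a browser.
```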
20 changes: 0 additions & 20 deletions sbatches/farmshare/singularity-exec.sbatch

This file was deleted.

14 changes: 8 additions & 6 deletions sbatches/farmshare/singularity-jupyter.sbatch
@@ -1,24 +1,24 @@
#!/bin/bash

# Usage

# 1. Default Jupyter notebook (with your scratch to work in)
# $ bash start.sh singularity-jupyter

# 2. Default Jupyter notebook with custom working directory
# $ bash start.sh singularity-jupyter /scratch/users/<username>
# $ bash start.sh singularity-jupyter /farmshare/scratch/users/<username>

# 3. Select your own jupyter container on Farmshare!
# $ bash start.sh singularity-jupyter /scratch/users/<username> /path/to/container
# $ bash start.sh singularity-jupyter /farmshare/scratch/users/<username> /path/to/container

# 4. Or any singularity container...
# $ bash start.sh singularity /path/to/container <args>

PORT=$1
NOTEBOOK_DIR=${2:-${SCRATCH}}
CONTAINER=${3:-docker://vanessa/repo2docker}
NOTEBOOK_DIR=${2:-/farmshare/scratch/users/$USER}
CONTAINER=${3:-library://sohams/default/farmsharejupyter:latest}

export SINGULARITY_CACHEDIR=/farmshare/user_data/${USERNAME}/.singularity
export SINGULARITY_CACHEDIR=/farmshare/user_data/${USER}/.singularity
echo "Container is ${CONTAINER}"
echo "Notebook directory is ${NOTEBOOK_DIR}"
cd ${NOTEBOOK_DIR}
@@ -30,4 +30,6 @@ if [ ! -d "${HOME}/.local" ];
mkdir -p "${HOME}/.local";
fi

. /etc/profile
module load singularity/3.4.0
singularity exec --home ${HOME} --bind ${HOME}/.local:/home/username/.local ${CONTAINER} jupyter notebook --no-browser --port=$PORT --ip 0.0.0.0
35 changes: 35 additions & 0 deletions sbatches/farmshare/singularity-jupyterlab.sbatch
@@ -0,0 +1,35 @@
#!/bin/bash

# Usage

# 1. Default Jupyter notebook (with your scratch to work in)
# $ bash start.sh singularity-jupyter

# 2. Default Jupyter notebook with custom working directory
# $ bash start.sh singularity-jupyter /farmshare/scratch/users/<username>

# 3. Select your own jupyter container on Farmshare!
# $ bash start.sh singularity-jupyter /farmshare/scratch/users/<username> /path/to/container

# 4. Or any singularity container...
# $ bash start.sh singularity /path/to/container <args>

PORT=$1
NOTEBOOK_DIR=${2:-/farmshare/scratch/users/$USER}
CONTAINER=${3:-library://sohams/default/farmsharejupyter:latest}

export SINGULARITY_CACHEDIR=/farmshare/user_data/${USER}/.singularity
echo "Container is ${CONTAINER}"
echo "Notebook directory is ${NOTEBOOK_DIR}"
cd ${NOTEBOOK_DIR}

# Create .local folder for default modules, if it doesn't exist
if [ ! -d "${HOME}/.local" ];
then
echo "Creating local python modules folder to map at ${HOME}/.local";
mkdir -p "${HOME}/.local";
fi

. /etc/profile
module load singularity/3.4.0
singularity exec --home ${HOME} --bind ${HOME}/.local:/home/username/.local ${CONTAINER} jupyter-lab --no-browser --port=$PORT --ip 0.0.0.0
20 changes: 0 additions & 20 deletions sbatches/farmshare/singularity-run.sbatch

This file was deleted.

55 changes: 48 additions & 7 deletions setup.sh
@@ -1,21 +1,53 @@
#!/bin/bash
#
# Sets up parameters for use with other scripts. Should be run once.
# Sets up parameters for use with other scripts. Removes an existing params.sh if present.
# Sample usage: bash setup.sh

# Can be run any number of times
rm -f params.sh
echo "First, choose the resource identifier that specifies your cluster resoure. We
will set up this name in your ssh configuration, and use it to reference the resource (sherlock)."
echo
read -p "Resource identifier (default: sherlock) > " RESOURCE
RESOURCE=${RESOURCE:-sherlock}

if [[ "${RESOURCE}" == "sherlock" ]]
then
MACHINEPREFIX=${MACHINEPREFIX:-sh}
USE_LSOF=${USE_LSOF:-/usr/sbin/lsof}
DOMAINNAME=${DOMAINNAME:-login.sherlock.stanford.edu}
ISOLATEDCOMPUTENODE=${ISOLATEDCOMPUTENODE:-true}

elif [[ "${RESOURCE}" == "farmshare" ]]
then
MACHINEPREFIX=${MACHINEPREFIX:-wheat}
USE_LSOF=${USE_LSOF:-lsof}
DOMAINNAME=${DOMAINNAME:-rice.stanford.edu}
ISOLATEDCOMPUTENODE=${ISOLATEDCOMPUTENODE:-false}

else
echo "Since, you are not using farmshare or sherlock, please supply the domain name of your resource"
echo
read -p "Domain Name for Resource > " DOMAINNAME
echo
echo "Next, please supply the prefix of the compute nodes that are used in your cluster resource. We will use this to check for assignment of
compute node when we submit the sbatch script. If you are using sherlock or farmshare (without gpu capability), then prefixes are set for you."
echo
read -p "Compute Node Prefix identifier > " MACHINEPREFIX
echo
echo "Are the compute nodes in your HPC cluster isolated from the outside internet?"
echo
read -p "Isolation Status (type true or false) >" ISOLATEDCOMPUTENODE
echo
fi

echo
read -p "${RESOURCE} username > " FORWARD_USERNAME

echo
echo "Next, pick a port to use. If someone else is port forwarding using that
port already, this script will not work. If you pick a random number in the
range 49152-65535, you should be good."
range 49152-65535, you should be good. For farmshare, please use a port number higher than
32768."
echo
read -p "Port to use > " PORT

@@ -32,18 +64,27 @@ SHARE="/scratch/users/vsochat/share"
echo "A containershare (https://vsoch.github.io/containershare is a library of
containers that are prebuilt for you, and provided on your cluster resource. if you
are at Stanford, leave this to be the default. If not, ask your HPC administrator
about setting one up, and direct them to https://www.github.com/vsoch/containershare."
about setting one up, and direct them to https://www.github.com/vsoch/containershare.
For farmshare, leave this blank to use the default Singularity container maintained by Soham Sinha, which you will need to pull into your home directory. Check the README located in sbatches/farmshare/README.md"
echo
read -p "container shared folder (default for Stanford: ${SHARE}) > " CONTAINERSHARE
CONTAINERSHARE=${CONTAINERSHARE:-${SHARE}}

echo
if [[ "${RESOURCE}" == "sherlock" ]]
then
CONTAINERSHARE=${CONTAINERSHARE:-${SHARE}}
elif [[ "${RESOURCE}" == "farmshare" ]]
then
CONTAINERSHARE=${CONTAINERSHARE:-library://sohams/default/farmsharejupyter:latest}
fi



MEM=20G

TIME=8:00:00

for var in FORWARD_USERNAME PORT PARTITION RESOURCE MEM TIME CONTAINERSHARE

for var in FORWARD_USERNAME PORT PARTITION RESOURCE MEM TIME CONTAINERSHARE MACHINEPREFIX DOMAINNAME USE_LSOF ISOLATEDCOMPUTENODE
do
echo "$var="'"'"$(eval echo '$'"$var")"'"'
done >> params.sh
7 changes: 4 additions & 3 deletions start.sh
@@ -79,10 +79,11 @@ echo "== Connecting to notebook =="
print_logs

echo

instruction_get_logs
echo

echo
echo "== Instructions =="
echo "1. Password, output, and error printed to this terminal? Look at logs (see instruction above)"
echo "2. Browser: http://sh-02-21.int:$PORT/ -> http://localhost:$PORT/..."
echo "2. Browser: http://$MACHINE:$PORT/ -> http://localhost:$PORT/..."
echo "3. To end session: bash end.sh ${NAME}"
