Merge pull request #126 from johanherman/devel-root

WIP: Preparing for parallel deployments
NationalGenomicsInfrastructure · Sep 6, 2016 · 5e65da0
2 parents 5879612 + b32797b
commit 5e65da0
Showing 22 changed files with 262 additions and 100 deletions.
diff --git a/.gitignore b/.gitignore
@@ -2,3 +2,6 @@
 *.retry
 pontus.larsson_medsci.uu.se.key
 charon_credentials.yml
+statusdb_creds.yml
+tarzan_cert.pem
+tarzan_key.pem
diff --git a/README.md b/README.md
@@ -18,10 +18,6 @@ Activate the environment using source `/lupus/ngi/irma3/ansible-env/bin/activate
 
 Install ansible with `pip install ansible`
 
-Download Anaconda and install it to `/lupus/ngi/sw/anaconda`
-
-Manually set Anaconda's permissions with `chmod -R g+rwX,o+rX /lupus/ngi/sw/anaconda`
-
 Enable rsync functionality by using `pip install pexpect`
 
 Install cpanm into the Ansible environment (which is already in $PATH) so that we can install Perl packages locally: 
@@ -41,30 +37,35 @@ The following files need to be present on irma3 in order to successfully deploy
 
 - A valid GATK key placed under `/lupus/ngi/irma3/deploy/files`. The filename must be specified in the `gatk_key` variable in `host_vars/127.0.0.1/main.yml`. 
 
-- A `charon_credentials.yml` file placed under `/lupus/ngi/irma3/deploy/host_vars/127.0.0.1/` listing the variables `charon_base_url`, `charon_api_token_upps` and `charon_api_token_sthlm`
+- A `charon_credentials.yml` file placed under `/lupus/ngi/irma3/deploy/host_vars/127.0.0.1/` listing the variables `charon_base_url_{stage,prod}`, `charon_api_token_upps_{stage,prod}` and `charon_api_token_sthlm_{stage,prod}`
 
 - A valid `statusdb_creds.yml` access file placed under `/lupus/ngi/irma3/deploy/files`. Necessary layout is described at https://github.com/SciLifeLab/statusdb
 
+- Valid SSL certificates for the web proxy under `/lupus/ngi/irma3/deploy/files` (see `roles/tarzan/README.md` for details) 
+
 ###Typical deployment
 
-Clone the repository to `/lupus/ngi/irma3/devel` and develop your scripts.
+TODO: Come back and update these instructions later. 
 
-Create your own virtual environment for developing, i.e: `conda create -n myVenv python=2.7`
+Fork the repository https://github.com/NationalGenomicsInfrastructure/irma-provision to your private Github repo. Clone this private repository to `/lupus/ngi/irma3/devel` and develop your scripts in a new feature branch. 
 
-Alter `{{ ngi_pipeline_venv }}` under `host_vars/127.0.0.1/main.yml` to match your environment's name.
+Test deploy your roles/playbook changes with e.g. `ansible-playbook install.yml`. This will install your development run in `/lupus/ngi/irma3/devel-root/<username>-<branch_name>`. 
 
-Once the features have been approved, `git pull` them into `/lupus/ngi/irma3/deploy`
+When you are satisfied with your changes, create a pull request from your feature branch into the master branch of the upstream irma-provision repository. 
 
-Make sure the target is somewhere under `/lupus/ngi/`. Some folders (currently only `/lupus/ngi/irma3/` and 
-`/lupus/ngi/resources/piper/gatk_bundle`) should not be used as targets as they are set up to be ignored by the rsync.
+Once the feature has been approved, or after you have collected a number of merged features, make a new GitHub release and tag (https://github.com/NationalGenomicsInfrastructure/irma-provision/releases/new) for the master branch. Name the tag following the convention `v<MAJOR>.<MINOR>-beta.<PATCH>`. E.g. if the latest released version is `v1.2.5`, and we've fixed a bug that we now want to test, we'll create a staging version called `v1.2-beta.6`. Tick the box "This is a pre-release" and add an appropriate description to the pre-release. 
 
-Run the deployment script, for instance `ansible-playbook install.yml`
+Now go to `/lupus/ngi/irma3/deploy` and do a `git fetch --tags && git checkout tags/v1.2-beta.6` and deploy it to staging with `ansible-playbook install.yml -e deployment_environment=staging`. This will install your run under `/lupus/ngi/staging/v1.2-beta.6/` and symlink `/lupus/ngi/staging/latest` to it, for easier access. 
 
-Manually place any additional files that need to be synced over under `/lupus/ngi/`
+Run `python sync.py staging`  to rsync the staged environment from irma3 to irma1. 
 
-If you don't want your environment synced to irma1, remove it.
+Log in to the Irma cluster as your personal user and then run `source /lupus/ngi/staging/latest/conf/sourceme_<site>.sh && source activate NGI` (where `site` is `upps` or `sthlm` depending on location). For convenience, add this line to your personal bash init file `~/.bashrc`. This will load the staging environment for your user with the appropriate staging variables set. 
 
-Run `python sync.py <remote dest>` to rsync all files under `/lupus/ngi/` from irma3 to irma1. If no directory is given the default is `/lupus/ngi/`
+When the staged environment has been verified to work OK (TODO: add test protocol, manual or automated sanity checks) proceed with making a production release. In our case we would therefore now create the tag and release called `v1.2.6`. 
+
+Still in `/lupus/ngi/irma3/deploy`, we can now run `git fetch --tags && git checkout tags/v1.2.6 && ansible-playbook install.yml -e deployment_environment=production`. This will install everything under `/lupus/ngi/production/v1.2.6` and point the symlink `/lupus/ngi/production/latest` at it. 
+
+Run `python sync.py production` to rsync all files under `/lupus/ngi/production` from irma3 to irma1. 
 
 ###Manual initiations on irma1
 
@@ -74,6 +75,8 @@ Run `/lupus/ngi/resources/create_ngi_pipeline_dirs.sh <project_name>` once per p
 
 Run `/lupus/ngi/sw/piper/gen_GATK_ref.sh` one time ever to generate the required GATK indexes to run piper.
 
+Add `source /lupus/ngi/production/latest/conf/sourceme_<site>.sh && source activate NGI`, where `site` can be `upps` or `sthlm`, to each functional account's bash init file `~/.bashrc`. 
+
 ###Quick integrity verification
 
 Run `source /lupus/ngi/conf/sourceme_<SITE>.sh`, where `<SITE>` is `upps` to initialize the funk_004 variables or `sthlm` to initialize the funk_006 variables.
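
The tag naming convention in the README changes above (`v<MAJOR>.<MINOR>-beta.<PATCH>` for staging, followed by a matching `v<MAJOR>.<MINOR>.<PATCH>` production release) can be illustrated with a small sketch. The function names `next_staging_tag` and `next_production_tag` are hypothetical and not part of this repository; the sketch assumes the latest release tag always has the plain form `v<MAJOR>.<MINOR>.<PATCH>`.

```shell
# Hypothetical helpers (not part of irma-provision): derive the next
# staging and production tags from the latest production release tag.
# Assumes the input looks like v<MAJOR>.<MINOR>.<PATCH>, e.g. v1.2.5.

next_staging_tag() {
    nst_version="${1#v}"              # strip leading "v"  -> 1.2.5
    nst_major="${nst_version%%.*}"    # text before first dot -> 1
    nst_rest="${nst_version#*.}"      # text after first dot  -> 2.5
    nst_minor="${nst_rest%%.*}"       # -> 2
    nst_patch="${nst_rest#*.}"        # -> 5
    echo "v${nst_major}.${nst_minor}-beta.$((nst_patch + 1))"
}

next_production_tag() {
    npt_version="${1#v}"
    npt_major="${npt_version%%.*}"
    npt_rest="${npt_version#*.}"
    npt_minor="${npt_rest%%.*}"
    npt_patch="${npt_rest#*.}"
    echo "v${npt_major}.${npt_minor}.$((npt_patch + 1))"
}

next_staging_tag v1.2.5      # prints v1.2-beta.6
next_production_tag v1.2.5   # prints v1.2.6
```

With `v1.2.5` as the latest release this reproduces the README's example: staging is tagged `v1.2-beta.6` and the subsequent production release `v1.2.6`.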

diff --git a/bootstrap/bashrc b/bootstrap/bashrc
@@ -19,9 +19,10 @@ mkdir -p /lupus/ngi/irma3/log
 #alias ansibleenv='source /lupus/ngi/irma3/ansible-env/bin/activate'
 
 # We're using Anaconda when setting up the NGI pipeline environment. 
-# Perhaps not necessary. 
-export PATH="/lupus/ngi/sw/anaconda/bin:$PATH"
-#alias ngienv='source activate NGI'
+# Perhaps not necessary.
+# TODO: Remove this when we know that Anaconda install/env works
+# properly from within Ansible playbooks.
+#export PATH="/lupus/ngi/sw/anaconda/bin:$PATH"
 
 # Force the user to have ngi-sw as gid (inherits the parent environment)
 # This needs to be the last thing to run 

diff --git a/host_vars/127.0.0.1/main.yml b/host_vars/127.0.0.1/main.yml
@@ -3,12 +3,18 @@
 
 # root_path should be changed to suitable value when e.g. deploying locally
 # for a developer
-root_path: /lupus/ngi/
+#root_path: /lupus/ngi/
+anaconda_path: "{{ root_path }}/sw/anaconda"
+ansible_path: "/lupus/ngi/irma3/ansible-env/" 
+default_env: "/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sw/uppmax/bin"
+anaconda_env:
+  PATH: "{{ anaconda_path }}/bin:{{ ansible_path }}/bin:{{ default_env }}" # this hopefully gets updated by anaconda role
 
 # TODO: This variable has to be set to something more suitable when 
 # deploying as a real user with a real home directory. This is 
 # only used when building Piper atm, and will not be necessary 
 # when Piper soon is available in the module system. 
+# FIXME: Redo piper so that we use $HOME now when we've got pica mounted? 
 home_path: /tmp 
 
 # Name of the script that should be sourced from the user to get the 
@@ -28,11 +34,11 @@ ngi_resources: "{{ root_path }}/resources/"
 
 ngi_pipeline_sthlm_delivery: ngi2016003
 ngi_pipeline_upps_delivery: ngi2016001
-ngi_sthlm_softlinks: "/proj/{{ ngi_pipeline_sthlm_delivery }}/nobackup/NGI/softlinks"
-ngi_upps_softlinks: "/proj/{{ ngi_pipeline_upps_delivery }}/nobackup/NGI/softlinks"
+ngi_sthlm_softlinks: "{{ proj_root }}/{{ ngi_pipeline_sthlm_delivery }}/nobackup/NGI/softlinks"
+ngi_upps_softlinks: "{{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/nobackup/NGI/softlinks"
 
-ngi_pipeline_sthlm_path: "/proj/{{ ngi_pipeline_sthlm_delivery }}/private/ngi_pipeline/"
-ngi_pipeline_upps_path: "/proj/{{ ngi_pipeline_upps_delivery }}/private/ngi_pipeline/"
+ngi_pipeline_sthlm_path: "{{ proj_root }}/{{ ngi_pipeline_sthlm_delivery }}/private/ngi_pipeline/"
+ngi_pipeline_upps_path: "{{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/private/ngi_pipeline/"
 
 ngi_pipeline_log_sthlm: "{{ ngi_pipeline_sthlm_path }}/log/ngi_pipeline.log"
 ngi_pipeline_log_upps: "{{ ngi_pipeline_upps_path }}/log/ngi_pipeline.log"

diff --git a/install.yml b/install.yml
@@ -3,6 +3,28 @@
 - name: Deploy irma software 
   hosts: 127.0.0.1
   connection: local
+
+  # These three variables need to be overridden when calling ansible-playbook. 
+  #
+  # Call with e.g. "ansible-playbook install.yml -e deployment_environment=staging -e deployment_version=foo". 
+  # If devel then the version will always be set to USERNAME_BRANCHNAME. 
+  # If staging or production then the playbook will halt if the folder version already exists. 
+  # I.e. the user will then manually have to remove the directory, OR add "-e deployment_override=true"
+  # to deploy into an already existing folder.
+  # 
+  # /lupus/ngi/production/<latest|current> will link to /lupus/ngi/production/<github release>
+  # /lupus/ngi/staging/ will contain a folder wild-wild-west which will be world writeable ON THE RECEIVING end. I.e. the 
+  # sync script will change the permissions? This is to make sure that not everyone who is able to log in to irma3 can 
+  # upload data into the cluster. 
+  vars: 
+    deployment_environment: devel # should be: 1) production, 2) staging, 3) devel
+    deployment_version: default # should be: 1) autogenerated for devel, 2) commit-hash for staging, 3) repo release tag for production
+    deployment_override: false # Set to true if you want to deploy into existing production/staging environment
+
+  pre_tasks:
+    - include: tasks/pre-install.yml
+      tags: always
+
   roles: 
     - { role: ngi_pipeline, tags: ngi_pipeline }
     - { role: piper, tags: piper }
@@ -16,26 +38,8 @@
     - { role: ngi-rnaseq, tags: ngi-rnaseq }
     - { role: nougat, tags: nougat }
 
-# This is an ugly hack that sets the permissions correctly to g+rwX,o=rX
-# for all files that is owned by the current user that is running the playbook. 
-# We append permissions to g so that we do not override setgid flag set 
-# on some places (+s). We define permissions for others via o= as we
-# know that they shouldn't have any other permissions. 
-#
-# Note that this by purpose will miss changing the permissions of files
-# owned by an other user, as we do not want to have the playbook cluttered
-# with a lot of errors when the permission change fails.
-#
-# It is therefore expected that this script will always be run when someone is 
-# running the playbook.
-- hosts: 127.0.0.1
-  connection: local 
-  tasks:
-    - name: set correct file permission for everything owned by current user 
-      shell: "find /lupus/ngi/{conf,sw,resources} -user `whoami` -exec chmod g+rwX,o=rX {} \\;"
-      tags: [ 'ngi_pipeline', 'piper', 'func_accounts', 'tarzan', 'taca', 'ngi_reports', 
-              'multiqc', 'tarzan', 'arteria-checksum', 'arteria-siswrap', 'ngi-rnaseq', 'nougat' ]
-      # NB: Setting the tags explicitely is only necessary if we want to be able to 
-      # run any single role above explicetely. If we only want to run all roles every
-      # time then mentioning the tags here wouldn't be needed. 
+  environment: "{{ anaconda_env }}"
 
+  post_tasks:
+    - include: tasks/post-install.yml
+      tags: always
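
The comments in `install.yml` say that a staging or production deployment should halt when the version folder already exists, unless `deployment_override=true` is passed. The contents of `tasks/pre-install.yml` are not shown in this excerpt, so the following is only a minimal sketch of that rule as a shell function; `check_deploy_target` and its argument order are hypothetical.

```shell
# Hypothetical sketch of the guard described in the install.yml comments;
# the real check lives in tasks/pre-install.yml, which is not shown here.
# Usage: check_deploy_target <environment> <target_dir> [override]
check_deploy_target() {
    cdt_env="$1"
    cdt_target="$2"
    cdt_override="${3:-false}"

    case "$cdt_env" in
        staging|production)
            # Refuse to reuse an existing version folder unless overridden.
            if [ -d "$cdt_target" ] && [ "$cdt_override" != "true" ]; then
                echo "refusing to deploy into existing $cdt_target" >&2
                return 1
            fi
            ;;
        devel)
            # Development runs may be redeployed into the same folder.
            ;;
        *)
            echo "unknown deployment_environment: $cdt_env" >&2
            return 2
            ;;
    esac
    return 0
}
```

A caller would run the playbook only when the function succeeds, for instance `check_deploy_target staging /lupus/ngi/staging/v1.2-beta.6 && ansible-playbook install.yml -e deployment_environment=staging`.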
diff --git a/roles/arteria-checksum-ws/defaults/main.yml b/roles/arteria-checksum-ws/defaults/main.yml
@@ -12,7 +12,7 @@ arteria_checksum_environment: production
 # paths in the tasks.
 #
 # NB. The log dirs need to be created manually on destination cluster. 
-arteria_checksum_monitored_path: "/proj/{{ ngi_pipeline_upps_delivery }}/incoming"
+arteria_checksum_monitored_path: "{{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/incoming"
 arteria_checksum_env_root: "{{ root_path }}/sw/arteria/checksum_venv/"
 arteria_checksum_src_path: "{{ root_path }}/sw/arteria/checksum_src/"
 arteria_checksum_config_root: "{{ ngi_pipeline_conf }}/arteria/checksum/"

diff --git a/roles/arteria-siswrap-ws/defaults/main.yml b/roles/arteria-siswrap-ws/defaults/main.yml
@@ -20,7 +20,7 @@ arteria_siswrap_config_root: "{{ ngi_pipeline_conf }}/arteria/siswrap"
 arteria_siswrap_app_config: "{{ arteria_siswrap_config_root }}/app.config"
 arteria_siswrap_logger_config: "{{ arteria_siswrap_config_root }}/logger.config"
 arteria_siswrap_log: "{{ ngi_pipeline_upps_path }}/log/arteria/siswrap-ws"
-runfolder_path: "/proj/{{ ngi_pipeline_upps_delivery }}/incoming/"
+runfolder_path: "{{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/incoming/"
 
 arteria_siswrap_port_prod: 10430
 arteria_siswrap_port_stage: 10431

diff --git a/roles/arteria-siswrap-ws/tasks/sisyphus.yml b/roles/arteria-siswrap-ws/tasks/sisyphus.yml
@@ -36,6 +36,11 @@
 - name: install PerlIO::gzip
   cpanm: name=PerlIO::gzip locallib="{{ perllib_dest }}"
 
+  # TODO: this is a pdl dependency that fails 1/32 tests, 
+  # so ignoring this for now due to problems getting it to pass. 
+- name: install Inline::C
+  cpanm: name=Inline::C locallib="{{ perllib_dest }}" notest=yes
+
 - name: install PDL
   cpanm: name=PDL locallib="{{ perllib_dest }}"
 

diff --git a/roles/func_accounts/templates/crontab_upps.j2 b/roles/func_accounts/templates/crontab_upps.j2
@@ -7,9 +7,9 @@ SHELL=/bin/bash
 22 7,19 * * *   bash {{ ngi_resources }}/project_sizes.sh &> /dev/null
 
 # (SNP&SEQ) copy the aggregate reports 
-42 22 * * *     find /proj/{{ ngi_pipeline_upps_delivery }}/nobackup/NGI/analysis_ready/ANALYSIS/*/piper_ngi/delivery/reports/ -name "*_aggregate_report.csv" -exec cp {} /proj/{{ ngi_pipeline_upps_delivery }}/logs/ \;
+42 22 * * *     find {{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/nobackup/NGI/analysis_ready/ANALYSIS/*/piper_ngi/delivery/reports/ -name "*_aggregate_report.csv" -exec cp {} {{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/logs/ \;
 # (SNP&SEQ) copy the version reports
-45 22 * * *     find /proj/{{ ngi_pipeline_upps_delivery }}/nobackup/NGI/analysis_ready/ANALYSIS/*/piper_ngi/logs/ -name "version_report.txt" |while read f; do p=`echo "$f" |sed -re 's/^.*ANALYSIS\/([^\/]+)\/.*$/\1/'`; rsync -ac "$f" "/proj/{{ ngi_pipeline_upps_delivery }}/logs/${p}_version_report.txt"; done
+45 22 * * *     find {{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/nobackup/NGI/analysis_ready/ANALYSIS/*/piper_ngi/logs/ -name "version_report.txt" |while read f; do p=`echo "$f" |sed -re 's/^.*ANALYSIS\/([^\/]+)\/.*$/\1/'`; rsync -ac "$f" "{{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/logs/${p}_version_report.txt"; done
 
 # restart supervisord if it has died for some reason
 11 * * * *      bash {{ ngi_resources }}/start_supervisord_upps.sh &> /dev/null

diff --git a/roles/ngi_pipeline/defaults/main.yml b/roles/ngi_pipeline/defaults/main.yml
@@ -1,6 +1,11 @@
 --- 
 
-anaconda_path: /lupus/ngi/sw/anaconda
+anaconda_file: Anaconda2-4.1.1-Linux-x86_64.sh
+anaconda_checksum: a3586948f841f3ed9639375be6c0f932286156e31a45d1b418368294ad0b31f1
+anaconda_url: "http://repo.continuum.io/archive/{{ anaconda_file }}"
+anaconda_src: "/lupus/ngi/irma3/{{ anaconda_file }}"
+#anaconda_path: "{{ root_path }}/sw/anaconda"
+
 upps_config: irma_ngi_config_upps.yaml
 sthlm_config: irma_ngi_config_sthlm.yaml
 

diff --git a/roles/ngi_pipeline/tasks/main.yml b/roles/ngi_pipeline/tasks/main.yml
@@ -1,5 +1,13 @@
 ---
 
+- name: Download anaconda
+  get_url: url={{ anaconda_url }} dest={{ anaconda_src }} mode=ug+rwx checksum="sha256:{{ anaconda_checksum }}"
+
+- name: Install anaconda
+  shell: "{{ anaconda_src }} -b -p {{ anaconda_path }}"  
+  args: 
+    creates: "{{ anaconda_path }}/LICENSE.txt"
+
 - name: Fetch ngi_pipeline from github 
   git: repo={{ ngi_pipeline_repo }} 
        dest={{ ngi_pipeline_dest }}
@@ -24,6 +32,9 @@
 - name: Create ngi pipeline conf directory 
   file: path="{{ ngi_pipeline_conf }}" state=directory mode=g+s
 
+- name: Create ngi_resources folder 
+  file: name={{ ngi_resources }} state=directory mode=g+s
+
 # Set Uppsala specific variables
 - set_fact:
     ngi_pipeline_db: "{{ ngi_pipeline_db_upps }}"
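
For readers less familiar with Ansible, the two Anaconda tasks added above amount to "download, verify the SHA-256 checksum, and run the installer only if it has not been run before". The sketch below shows the same idea in plain shell. The helper names `sha256_of`, `verify_sha256` and `install_once` are hypothetical, and the fallback to `shasum` is an assumption for systems without `sha256sum`.

```shell
# Print the SHA-256 digest of a file.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# Succeed only if the file's digest equals the expected one
# (the role of checksum="sha256:..." in the get_url task).
verify_sha256() {
    [ "$(sha256_of "$1")" = "$2" ]
}

# Run the installer only when the marker file is missing, mirroring the
# "creates:" argument of the install task.
# Usage: install_once <installer> <prefix> <marker_file>
install_once() {
    io_installer="$1"
    io_prefix="$2"
    io_marker="$3"
    if [ -e "$io_marker" ]; then
        echo "already installed, skipping"
        return 0
    fi
    # -b (batch mode) and -p (prefix) are the flags used in the task above.
    "$io_installer" -b -p "$io_prefix"
}
```

Using a marker file such as `LICENSE.txt` keeps repeated playbook runs from reinstalling Anaconda on top of an existing installation.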

diff --git a/roles/ngi_pipeline/templates/create_ngi_pipeline_dirs.sh.j2 b/roles/ngi_pipeline/templates/create_ngi_pipeline_dirs.sh.j2
@@ -14,12 +14,11 @@ if [ $# -ne 1 ]; then
         exit 1
 fi
 
-mkdir -p /proj/$1/private/ngi_pipeline/log
-mkdir -p /proj/$1/private/ngi_pipeline/db
-mkdir -p /proj/$1/nobackup/NGI/softlinks
-mkdir -p /proj/$1/private/ngi_pipeline/log/supervisord
-ln -s {{ ngi_pipeline_dest }}/DELIVERY.README.txt /proj/$1/nobackup/NGI/softlinks/DELIVERY.README.txt
-ln -s {{ ngi_pipeline_dest }}/scripts/applyRecalibration.sh /proj/$1/nobackup/NGI/softlinks/applyRecalibration.sh
-ln -s {{ ngi_pipeline_dest }}/scripts/bam2fastq.sh /proj/$1/nobackup/NGI/softlinks/bam2fastq.sh
+mkdir -p {{ proj_root }}/$1/private/ngi_pipeline/log
+mkdir -p {{ proj_root }}/$1/private/ngi_pipeline/db
+mkdir -p {{ proj_root }}/$1/nobackup/NGI/softlinks
+mkdir -p {{ proj_root }}/$1/private/ngi_pipeline/log/supervisord
+ln -s {{ ngi_pipeline_dest }}/DELIVERY.README.txt {{ proj_root }}/$1/nobackup/NGI/softlinks/DELIVERY.README.txt
+ln -s {{ ngi_pipeline_dest }}/scripts/applyRecalibration.sh {{ proj_root }}/$1/nobackup/NGI/softlinks/applyRecalibration.sh
+ln -s {{ ngi_pipeline_dest }}/scripts/bam2fastq.sh {{ proj_root }}/$1/nobackup/NGI/softlinks/bam2fastq.sh
 
-find /proj/$1/private/ngi_pipeline -ls
diff --git a/roles/ngi_pipeline/templates/irma_ngi_config.yaml.j2 b/roles/ngi_pipeline/templates/irma_ngi_config.yaml.j2
@@ -10,8 +10,8 @@ environment:
     ngi_scripts_dir: {{ ngi_pipeline_dest }}/scripts 
     conda_env: NGI
     flowcell_inbox:
-            - /proj/{{ ngi_pipeline_sthlm_delivery }}/incoming
-            - /proj/{{ ngi_pipeline_upps_delivery }}/incoming
+            - {{ proj_root }}/{{ ngi_pipeline_sthlm_delivery }}/incoming
+            - {{ proj_root }}/{{ ngi_pipeline_upps_delivery }}/incoming
 # TODO: This QOS flag is probably not used/needed any longer. 
 #       Enable later in the future if required. 
 #slurm:
@@ -66,7 +66,7 @@ analysis:
     top_dir: nobackup/NGI
     sthlm_root: {{ ngi_pipeline_sthlm_delivery }}
     upps_root: {{ ngi_pipeline_upps_delivery }}
-    base_root: /proj
+    base_root: {{ proj_root }}
 
 qc:
     load_modules:

diff --git a/roles/ngi_pipeline/templates/sourceme_common.sh.j2 b/roles/ngi_pipeline/templates/sourceme_common.sh.j2
@@ -8,3 +8,9 @@ export CHARON_BASE_URL={{ charon_base_url }}
 # pipeline outputing data with "." as decimal output instead of Swedish ",". 
 export LC_ALL=en_US.UTF-8
 export LANG=en_US.UTF-8
+
+# Prefix the Bash prompt with the version of the Irma environment/provisioning
+# that we're running. 
+SCRIPT_PATH=`realpath $0`
+IRMA_ENV=`echo $SCRIPT_PATH | cut -d/ -f4,5`
+export PS1="(irma env: $IRMA_ENV) $PS1"
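
To see what the `cut -d/ -f4,5` step yields, here is a small sketch using paths laid out as the README describes (`/lupus/ngi/<environment>/<version>/...`). The helper name `irma_env_from_path` is hypothetical. Note also that in a file that is sourced, `$0` usually names the running shell rather than the file itself, so `${BASH_SOURCE[0]}` may be what is needed here; that is an assumption about how `sourceme_common.sh` is loaded, not something this diff confirms.

```shell
# Hypothetical helper: extract "<environment>/<version>" from an absolute
# path, using the same cut fields as the template above. Field 1 is the
# empty string before the leading "/", so fields 4 and 5 are the two
# components after /lupus/ngi/.
irma_env_from_path() {
    echo "$1" | cut -d/ -f4,5
}

irma_env_from_path /lupus/ngi/production/v1.2.6/conf/sourceme_common.sh
# prints production/v1.2.6
irma_env_from_path /lupus/ngi/staging/v1.2-beta.6/conf/sourceme_common.sh
# prints staging/v1.2-beta.6
```

Resolving the path with `realpath` first presumably matters because users source the file through the `latest` symlink; without it the prompt would show `production/latest` instead of the concrete version.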