diff --git a/.github/workflows/static.yml b/.github/workflows/static.yml
index b9c9160..b715553 100644
--- a/.github/workflows/static.yml
+++ b/.github/workflows/static.yml
@@ -4,7 +4,7 @@ name: Deploy static content to Pages
 on:
   # Runs on pushes targeting the default branch
   push:
-    branches: ["gh-pages"]
+    branches: ["main", "prod", "gh-pages"]
 
   # Allows you to run this workflow manually from the Actions tab
   workflow_dispatch:
diff --git a/README.md b/README.md
index 129088e..1f0eceb 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@ and the Rubin observatory.
 
 The S3DF infrastructure is optimized for data analytics and is characterized by
 large, massive throughput, high concurrency storage systems.
 
-**December 26th 8:00am PST: ALL S3DF services are currently DOWN/unavailable. We are investigating and will provide an update later today.**
+**January 6th 8:40am PST: All S3DF services are back UP. Users with k8s workloads should check for any lingering issues (stale file handles) and report to s3df-help@slac.stanford.edu. Thank you for your patience.**
 
 ## Quick Reference
diff --git a/changelog.md b/changelog.md
index 4976cd1..e357531 100644
--- a/changelog.md
+++ b/changelog.md
@@ -43,13 +43,16 @@ If critical issues are not responded to within 2 hours of reporting the issue pl
 
 ### Current
 
+**January 6th 8:40am PST: All S3DF services are back UP. Users with k8s workloads should check for any lingering issues (stale file handles) and report to s3df-help@slac.stanford.edu. Thank you for your patience.**
+
 ### Upcoming
 
 ### Past
 
 |When |Duration | What |
 | --- | --- | --- |
-|Dec 10 2024|Ongoing (unplanned)|StaaS GPFS disk array outage (partial /gpfs/slac/staas/fs1 unavailability)|
+|Dec 26 2024|1 day (unplanned)|One of our core networking switches in the data center failed and had to be replaced. The fall-out from this impacted other systems and services on S3DF. Staff worked through the night on stabilization of the network devices and connections as well as recovery of the storage subsystem.|
+|Dec 10 2024|(unplanned)|StaaS GPFS disk array outage (partial /gpfs/slac/staas/fs1 unavailability)|
 | Dec 3 2024 | 1 hr (planned) | Mandatory upgrade of the slurm controller, the database, and the client components on all batch nodes, kubernetes nodes, and interactive nodes.
 |Nov 18 2024|8 days (unplanned)|StaaS GPFS disk array outage (partial /gpfs/slac/staas/fs1 unavailability)|
 |Oct 21 2024 |10 hrs (planned)| Upgrade to all S3DF Weka clusters. We do NOT anticipate service interruptions.
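
For readers without the full workflow file at hand, below is a minimal sketch of the trigger section of `.github/workflows/static.yml` after this patch. It is assembled from the context and added lines in the diff above and assumes the rest of the file follows GitHub's stock "Deploy static content to Pages" template; only the `branches` list is changed by this patch.

```yaml
# Sketch of the resulting trigger section of .github/workflows/static.yml.
# Everything except the expanded branches list comes from the context lines
# in the diff; surrounding keys (permissions, jobs, etc.) are omitted.
name: Deploy static content to Pages

on:
  # Runs on pushes targeting any of the deployment branches
  push:
    branches: ["main", "prod", "gh-pages"]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
```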