Workspace: Add homefs #607

Merged
merged 4 commits into from
Oct 26, 2024

@ConnorNelson ConnorNelson commented Oct 25, 2024

This PR introduces homefs, our solution for improving the performance of persistent home directories. It favors btrfs (which is capable of migrating between nodes) over NFS.


Some TODOs (probably to be addressed largely by other PRs):

  • Migrate production homes from ext4 to btrfs
  • Remove nfs logic
  • Restore as_user workspace support
  • Support periodically synchronizing incremental snapshots back to the main node
  • Support home migration
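
For the incremental snapshot item, btrfs's own send/receive tooling is one plausible route. The following is only a sketch under stated assumptions, not the homefs implementation: the `active` and `snapshots` paths follow the layout used in the migration script in this thread, while the `plan_incremental_sync` helper, the snapshot names, and the `main-node` host name are hypothetical. The function only prints the commands it would run, so they can be reviewed first.

```bash
#!/bin/bash
# Hypothetical sketch: print the commands for an incremental btrfs sync.
# MAIN_NODE is an assumed host name, not something defined by this PR.
MAIN_NODE="${MAIN_NODE:-main-node}"

# Usage: plan_incremental_sync USER PREV_SNAPSHOT NEW_SNAPSHOT
plan_incremental_sync() {
    local user=$1 prev=$2 new=$3
    local home="/data/homes/$user"

    # btrfs send only accepts read-only snapshots, hence -r.
    printf 'btrfs subvolume snapshot -r %s %s\n' \
        "$home/active" "$home/snapshots/$new"

    # -p names the parent snapshot, so only the delta is streamed.
    # The parent must already exist on the receiving side.
    printf 'btrfs send -p %s %s | ssh %s btrfs receive %s\n' \
        "$home/snapshots/$prev" "$home/snapshots/$new" \
        "$MAIN_NODE" "$home/snapshots"
}
```

Calling `plan_incremental_sync 42 s1 s2` prints two command lines for a hypothetical user `42`. The very first snapshot of a home has no parent and would need a full `btrfs send` without `-p`.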

@ConnorNelson

Closes #572.

ConnorNelson commented Oct 25, 2024

This is how the migration was performed:

#!/bin/bash

set -e

log_file="/data/log"
exec > >(tee -a "$log_file") 2>&1

parallel_jobs=20

users=$(ls /tank/homes | sort -n)

# Record a failure by leaving a marker file next to the user's home.
log_error() {
    local user=$1
    touch "/data/homes/$user.error"
}

# Migrate a single user's home into btrfs subvolumes under /data/homes.
process_user() {
    local user=$1
    date
    echo "Processing user: $user"

    local tmp_mount="/data/homes/tmp_$user"

    mkdir -p "$tmp_mount"

    # Mount the user's existing home at a temporary mount point.
    mount "/tank/homes/$user" "$tmp_mount" || { echo "Failed to mount for $user"; log_error "$user"; return 1; }

    # Create the subvolume layout only if it does not exist yet, so reruns are safe.
    if [[ ! -d "/data/homes/$user" ]]; then
        btrfs subvolume create "/data/homes/$user" || { echo "Failed to create subvolume for $user"; log_error "$user"; return 1; }
    fi

    if [[ ! -d "/data/homes/$user/snapshots" ]]; then
        btrfs subvolume create "/data/homes/$user/snapshots" || { echo "Failed to create snapshots subvolume for $user"; log_error "$user"; return 1; }
    fi

    if [[ ! -d "/data/homes/$user/active" ]]; then
        btrfs subvolume create "/data/homes/$user/active" || { echo "Failed to create active subvolume for $user"; log_error "$user"; return 1; }
    fi

    # Cap the active home at 1G.
    btrfs qgroup limit 1G "/data/homes/$user/active" || { echo "Failed to set qgroup limit for $user"; log_error "$user"; return 1; }

    # Copy the old home into the active subvolume, keeping sparse files sparse.
    rsync -aH --sparse --delete --ignore-errors --info=progress2 "$tmp_mount/" "/data/homes/$user/active/" || { echo "Rsync failed for $user"; log_error "$user"; return 1; }

    umount "$tmp_mount" || { echo "Failed to unmount for $user"; log_error "$user"; return 1; }

    rmdir "$tmp_mount" || { echo "Failed to remove temp mount for $user"; log_error "$user"; return 1; }
}

export -f process_user
export -f log_error

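# -I {} already runs one command per input line; combining it with -n 1
# makes GNU xargs warn that the two options are mutually exclusive.
echo "$users" | xargs -P "$parallel_jobs" -I {} bash -c 'process_user "$@"' _ {}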

echo -e "\nAll users processed successfully."
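
Because the script marks each failure with a `/data/homes/$user.error` file, the failed users can be collected afterwards for a retry. The helper below is a minimal sketch, not part of the PR: the name `failed_users` is hypothetical, and it assumes the marker naming used by `log_error` above.

```bash
#!/bin/bash
# Hypothetical helper: print one user per line for every "<user>.error"
# marker found in the homes directory (default: /data/homes).
failed_users() {
    local homes_dir=${1:-/data/homes}
    local marker
    for marker in "$homes_dir"/*.error; do
        # If the glob matched nothing, the literal pattern is skipped here.
        [ -e "$marker" ] || continue
        basename "$marker" .error
    done
}
```

Running `failed_users | sort -n` would give the list to feed back into `process_user`. The script never deletes a marker, so stale `.error` files would need to be removed by hand before or after a successful retry.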

We pass --ignore-errors because some paths are too long (file system mazes for the race condition module), and we accept that some of these files are not migrated. Additionally, --sparse is critical: many files (like core dumps) are sparse, and transferring them without preserving sparseness would consume more space than necessary (and in some cases exceed the 1GB quota).
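
To illustrate why sparseness matters for the quota, the sketch below creates a sparse file and compares its apparent size with the space actually allocated. It is an illustration only, assuming GNU coreutils (`truncate`, `stat -c`) and a filesystem that supports sparse files; the file name `core` and the 64M size are arbitrary.

```bash
#!/bin/bash
# Create a file with a 64 MiB apparent size without writing any data.
demo_dir=$(mktemp -d)
truncate -s 64M "$demo_dir/core"

# %s is the apparent (logical) size in bytes.
apparent_bytes=$(stat -c '%s' "$demo_dir/core")

# %b is the number of allocated blocks, %B the size of each such block.
allocated_bytes=$(( $(stat -c '%b' "$demo_dir/core") * $(stat -c '%B' "$demo_dir/core") ))

echo "apparent:  $apparent_bytes bytes"
echo "allocated: $allocated_bytes bytes"

rm -rf "$demo_dir"
```

A copy that writes out the holes as real zeros would allocate the full apparent size at the destination, which is what counts against a quota; that is the case `--sparse` avoids.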

@ConnorNelson ConnorNelson merged commit 4ec1be8 into master Oct 26, 2024
1 check passed
@ConnorNelson ConnorNelson deleted the feat/homefs branch October 26, 2024 18:12