---
author: "JayVii"
title: "Encrypted Backup via rsync"
date: "2023-04-07"
summary: "How I create atomic encrypted backups via gocryptfs and rsync"
tags: ["backup", "encryption", "tech"]
---

> **Update 2024-04-28:**
>
> The resulting script from this blog post can be found on my git page:
> [rsync_encrypted_backup](https://src.jayvii.de/pub/rsync_encrypted_backup/)

Backups have been somewhat of a pain for me for quite a while, as I could never
find a suitable, easy to manage and easy to recover option for my private
computer.

My goal was to create a simple off-site backup routine (i.e. "the cloud"), which
would be easy to recover from, suitably fast (ideally with atomic/delta updates,
i.e. transferring only what has changed) and reasonably secure, i.e. strong
default encryption at AES-256 level.

I tried several options like `7z` with encryption and a low (or no) compression
rate, sending a whole archive to a remote storage or even updating existing
archives. However, this of course turned out to be rather cumbersome, prone to
write errors and connection issues, and extremely slow.

The next approach did work reasonably well, and is what I want to present here.
I am sure there is still room for improvement, so if you have any suggestions,
feel free to send me a DM in the Fediverse or an e-mail.

The well-known `rsync` tool is a natural candidate for atomic backups in the
Linux world. It can sync directories with remote end-points over SSH or an
rsync daemon, and with anything that can be mounted locally (e.g. SFTP or WebDAV
shares). It can keep ACLs, modes and ownership of files intact, is relatively
fast, light on system resources and can sync in both directions (i.e. it may
also be used to restore your files). However, `rsync` does not encrypt the files
it syncs: whatever it reads locally ends up readable on the remote storage.

So in order to encrypt backups made with `rsync`, the files have to be encrypted
**before** transmission, ideally in real-time and without impacting read-speed
all that much.

## File Encryption

A close to perfect solution for this task is `gocryptfs`, the spiritual
successor of `encfs`. It is an encrypted overlay file system that (crucially)
supports "reverse mode", is extremely fast and utilizes strong encryption
methods.

In the context of backups this means that we can mount the directory we want to
back up (e.g. our home directory) in an encrypted form that is updated in
real-time, and sync the encrypted versions of all files rather than the original
unencrypted versions. The aforementioned "reverse mode" is what makes this
possible: it presents a pre-existing, unencrypted directory as an encrypted
volume.

So first, let's create the setup for the encrypted file system. This has to be
done only once and creates the metadata and encryption headers for the volume.
Once this is done, we only need to mount the encrypted volume in the future:

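```{.sh}
#!/usr/bin/env bash

# Comments cannot follow a line-continuation backslash, so the options are
# collected in an array, where inline comments are allowed.
INIT_OPTS=(
    --init           # initialise the volume
    --reverse        # use "reverse mode"
    --plaintextnames # do not obfuscate names of files and directories
)

# target directory. Here: our home folder
gocryptfs "${INIT_OPTS[@]}" "$HOME"
```
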
This process will ask you to set an encryption passphrase and will print a
master key for recovery. **BACK THIS KEY UP SOMEWHERE SAFE AND IN SEVERAL
PLACES, BOTH DIGITALLY AS WELL AS PHYSICALLY!**

The metadata file will be stored in your unencrypted directory as
`.gocryptfs.reverse.conf` and will show up in the encrypted storage as
`gocryptfs.conf` (itself unencrypted). Make sure to store this file somewhere
secure too, as it is required to decrypt the storage in case you need to restore
your backups.

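Since the initialisation is a one-time step, a script can check for that
metadata file before deciding what to do. The following is only a minimal
sketch: the function name `needs_reverse_init` is made up for illustration, and
it assumes the file name `.gocryptfs.reverse.conf` described above.

```sh
#!/usr/bin/env bash

# Succeeds (returns 0) if DIR has no reverse-mode config yet,
# i.e. the one-time init step still has to be run.
# needs_reverse_init is a hypothetical helper name.
needs_reverse_init() {
    local dir="$1"
    [[ ! -f "$dir/.gocryptfs.reverse.conf" ]]
}

# usage: only print a hint; running the init itself stays a manual step
if needs_reverse_init "$HOME"; then
    echo "No reverse-mode config in $HOME yet, run the init step first."
else
    echo "Reverse-mode config found in $HOME, ready to mount."
fi
```
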
From now on, we may mount the directory in its encrypted form:

```{.sh}
#!/usr/bin/env bash

# create temporary directory as encrypted mountpoint
ENCRYPTED_DIR=$(mktemp -d)

MOUNT_OPTS=(
    --ro      # mount in read-only mode
    --reverse # use "reverse mode"
)

# mount home directory (unencrypted source) to the temporary mountpoint
if gocryptfs "${MOUNT_OPTS[@]}" "$HOME" "$ENCRYPTED_DIR"; then
    echo "$HOME has been mounted in encrypted form to $ENCRYPTED_DIR"
else
    echo "Mounting $HOME to $ENCRYPTED_DIR failed!"
fi
```

In your encrypted directory, you will now find your entire home directory in
encrypted form. The reason we used `--plaintextnames` before is that it makes
the recovery process a lot easier if you can actually identify the files and
folders by their names (of course, their contents are still encrypted). If you
do not need this feature, because you would recover the entire directory rather
than only parts of it, you may consider removing that parameter when creating
the volume.

The `--ro` parameter sets read-only permissions for the encrypted mount, meaning
that you cannot write new files to the encrypted volume. Importantly, writing
to the unencrypted directory is still possible, and doing so also updates the
encrypted directory in real-time. The parameter protects our directory from
technical or user mistakes, for example if we accidentally swap source and
target in `rsync`...

If we want to recover a backup later on, we of course do need write permissions
in the encrypted volume. This is covered later in this blog post.

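Another mistake worth guarding against is syncing the mountpoint while nothing
is mounted there, as an empty source combined with a deleting sync could empty
the backup. A small sketch of such a guard, assuming the `mountpoint` tool from
util-linux is installed; both function names are hypothetical:

```sh
#!/usr/bin/env bash

# Succeeds only if DIR is currently a mountpoint
# (relies on the `mountpoint` command from util-linux).
is_mounted() {
    mountpoint -q "$1"
}

# Hypothetical guard, meant to be called right before syncing.
ensure_mounted_or_abort() {
    local dir="$1"
    if ! is_mounted "$dir"; then
        echo "[ERROR] $dir is not a mountpoint, refusing to sync." >&2
        return 1
    fi
}

# usage: a fresh temporary directory is not a mountpoint, so the guard refuses
PLAIN_DIR=$(mktemp -d)
ensure_mounted_or_abort "$PLAIN_DIR" || echo "sync skipped"
rmdir "$PLAIN_DIR"
```
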
## File Transmission

Next, we can finally back up our encrypted directory via `rsync`. Let's first
talk about the parameters that might be useful for backups. Personally, I want
to exclude several directories from the backup, like "Downloads", ".cache" and
similar folders. `rsync` can even use wildcards here, so you can exclude every
`.git` folder or specific file types (if they have the appropriate file
extension). This is of course not possible (or a lot harder...) if you skipped
`--plaintextnames` when creating the encrypted volume, as all file and
directory names are obfuscated without it.

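If the list of exclusions grows, `rsync` can also read the patterns from a file
via its `--exclude-from` option, where lines starting with `#` or `;` are
treated as comments. This is just a sketch: the file location and the `*.iso`
pattern are made-up examples, not part of my setup.

```sh
#!/usr/bin/env bash

# Hypothetical location of the pattern file
EXCLUDE_FILE="${TMPDIR:-/tmp}/backup-excludes.txt"

# One rsync pattern per line
cat > "$EXCLUDE_FILE" <<'EOF'
# whole directories
Downloads/*
.cache/*
# contents of every git repository
*.git/*
# a specific file type (example only)
*.iso
EOF

# Option that would be handed to rsync instead of many --exclude parameters
EXCLUDE_OPTS=(--exclude-from="$EXCLUDE_FILE")

# usage: count the active (non-comment) patterns in the file
grep -c -v '^#' "$EXCLUDE_FILE"
```
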
We want to keep file permissions and file owners, so the `--archive` parameter
is handy here. Since we want to see what is happening during the procedure, the
`--verbose` and `--progress` parameters are useful as well. Additionally, files
that we have deleted from our system should also disappear from the backup the
next time we sync. Ideally, this should happen *after* new files have been
transferred, which is what the `--delete-delay` parameter does.

Because I back up multiple devices to the same network storage, it is a good
idea to name the target folder after the hostname of the device. Furthermore,
even though I do atomic backups, I want to keep several versions of my
backup files. So I back up to different folders on the remote storage, based on
time. To be more exact, I append the current month to the target directory's
name, such that I always have the past 12 versions of my backups (considering
that I run backups once every month): `${HOSTNAME}_$(date +%m)/`.

However, if you want to keep fewer past versions, there is a little trick via
the [modulo](https://en.wikipedia.org/wiki/Modulo) of the current month. Say
you want to keep only the past 3 versions, you can do
`$((10#$(date +%m) % 3))`, which divides the number of the current month (i.e.
5 for May) by 3 and gives you the remainder of 2. The `10#` prefix forces base
10; without it, bash reads the zero-padded `08` and `09` as invalid octal
numbers and fails with an error in August and September. Over the course of a
year, this calculation gives you 1 in January, 2 in February, 0 in March, 1 in
April, 2 in May, 0 in June and so on. This in turn means that you'll always keep
the past 3 months as different versions of your backup. Adjust this value to
your needs and the size of your remote storage.

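To see the rotation for a whole year without waiting for the calendar, the
calculation can be wrapped in a small function and fed with zero-padded month
numbers, just like `date +%m` prints them. The function name `version_slot` is
only an illustrative choice.

```sh
#!/usr/bin/env bash

# Map a (possibly zero-padded) month number to one of KEEP rotating slots.
# version_slot is a hypothetical helper name.
version_slot() {
    local month="$1" keep="$2"
    # 10# forces base 10, so "08" and "09" are not parsed as octal
    echo $(( 10#$month % keep ))
}

# usage: print the slot of every month when keeping 3 versions
for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
    printf '%s -> %s\n' "$month" "$(version_slot "$month" 3)"
done
```
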
The whole transfer procedure looks like this:

```{.sh}
#!/usr/bin/env bash

# How many versions should be kept? (10# forces base 10 for "08" and "09")
TARGET_VERSION=$((10#$(date +%m) % 3))

# define target backup storage (here: remote storage reached via SSH)
TARGET="user@my.remote.backup:${HOSTNAME}_${TARGET_VERSION}/"

# define directories to be excluded (quoted, so the shell leaves the * alone)
EXCL=(
    --exclude='Downloads/*' --exclude='.cache/*' --exclude='.var/*'
    --exclude='.local/share/Trash/*' --exclude='*.git/*' --exclude='.davfs2/*'
)

RSYNC_OPTS=(
    --archive      # keep permissions and ownership information intact
    --verbose      # print more information during transmission
    --progress     # show progress of the transmission
    --delete-delay # remove deleted files from target after sync
    "${EXCL[@]}"   # set exclusion parameters from above
)

# back up directory: source (encrypted mount volume), then remote target.
# The trailing slash on the source syncs its contents, not the directory itself.
rsync "${RSYNC_OPTS[@]}" "$ENCRYPTED_DIR/" "$TARGET"
```

## Final touches

For an easy, semi-automated backup routine, a few additional quality-of-life
improvements come in handy, such as mounting the reverse file system before the
backup and unmounting it afterward.

Additionally, I like to send desktop notifications whenever I am using the
script in a desktop environment. In order to detect this, I check the `$DISPLAY`
environment variable for X11 desktops and the `$WAYLAND_DISPLAY` variable for
Wayland environments. I typically use `gdbus` to send notifications, wrapped in
a shell function:

```{.sh}
#!/usr/bin/env bash

# notification function
## Args:
## 1. Headline (also used as application name)
## 2. Notification ID (0 for new)
## 3. icon-name
## 4. Notification text
## 5. additional information, i.e. the actions array ("[]" for none)
## 6. timeout in ms
function send_notify {
    gdbus call --session \
        --dest=org.freedesktop.Notifications \
        --object-path=/org/freedesktop/Notifications \
        --method=org.freedesktop.Notifications.Notify \
        "$1" "$2" "$3" "$1" "$4" "$5" \
        '{"category": <"im.received">}' "$6"
}
```

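The decision between a desktop notification and plain terminal output comes up
several times in a backup script, so it can be folded into one helper. This is
a sketch only: `has_desktop_session` and `report` are hypothetical names, and
`report` falls back to `echo` when no `send_notify` function is defined.

```sh
#!/usr/bin/env bash

# Succeeds if an X11 or Wayland session appears to be available.
has_desktop_session() {
    [[ -n "${WAYLAND_DISPLAY:-}" || -n "${DISPLAY:-}" ]]
}

# Hypothetical wrapper: desktop notification if possible, plain text otherwise.
# Uses send_notify only if such a function has been defined beforehand.
report() {
    local message="$1"
    if has_desktop_session && declare -F send_notify > /dev/null; then
        send_notify "BackUpr" 0 "document-send" "$message" "[]" 3000
    else
        echo "[INFO] $message"
    fi
}

# usage
report "Starting backup procedure"
```
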
All in all the final script looks like this:

```{.sh}
#!/usr/bin/env bash

# configs ------------------------------

# How many versions should be kept? (10# forces base 10 for "08" and "09")
TARGET_VERSION=$((10#$(date +%m) % 3))

# define target backup storage (here: remote storage reached via SSH)
TARGET="user@my.remote.backup:${HOSTNAME}_${TARGET_VERSION}/"

# define directories to be excluded (quoted, so the shell leaves the * alone)
EXCL=(
    --exclude='Downloads/*' --exclude='.cache/*' --exclude='.var/*'
    --exclude='.local/share/Trash/*' --exclude='*.git/*' --exclude='.davfs2/*'
)

# functions ----------------------------

# notification function
## Args:
## 1. Headline (also used as application name)
## 2. Notification ID (0 for new)
## 3. icon-name
## 4. Notification text
## 5. additional information, i.e. the actions array ("[]" for none)
## 6. timeout in ms
function send_notify {
    gdbus call --session \
        --dest=org.freedesktop.Notifications \
        --object-path=/org/freedesktop/Notifications \
        --method=org.freedesktop.Notifications.Notify \
        "$1" "$2" "$3" "$1" "$4" "$5" \
        '{"category": <"im.received">}' "$6"
}

# mounting encrypted filesystem --------
echo "[INFO] Attempting to mount source as encrypted dir."

# create temporary directory as encrypted mountpoint
ENCRYPTED_DIR=$(mktemp -d)

MOUNT_OPTS=(
    --ro      # mount in read-only mode
    --reverse # use "reverse mode"
)

# mount home directory (unencrypted source) to the temporary mountpoint;
# abort on failure, so rsync never syncs (and deletes from) an empty directory
if gocryptfs "${MOUNT_OPTS[@]}" "$HOME" "$ENCRYPTED_DIR"; then
    echo "$HOME has been mounted in encrypted form to $ENCRYPTED_DIR"
else
    echo "Mounting $HOME to $ENCRYPTED_DIR failed!"
    exit 1
fi

# Transfer data ------------------------

## send notification
if [[ -n $WAYLAND_DISPLAY || -n $DISPLAY ]]; then
    send_notify \
        "BackUpr" \
        0 \
        "document-send" \
        "Starting backup procedure to $TARGET" \
        "[]" \
        3000
else
    echo "[INFO] Starting backup procedure to $TARGET"
fi

RSYNC_OPTS=(
    --archive      # keep permissions and ownership information intact
    --verbose      # print more information during transmission
    --progress     # show progress of the transmission
    --delete-delay # remove deleted files from target after sync
    "${EXCL[@]}"   # set exclusion parameters from above
)

# back up directory: source (encrypted mount volume), then remote target.
# The trailing slash on the source syncs its contents, not the directory itself.
rsync "${RSYNC_OPTS[@]}" "$ENCRYPTED_DIR/" "$TARGET"
RSYNC_STATUS=$?

## send notifications about status
if [[ $RSYNC_STATUS == 0 ]]; then
    if [[ -n $WAYLAND_DISPLAY || -n $DISPLAY ]]; then
        send_notify "BackUpr" 0 "document-send" "Backup finished successfully." "[]" 0
    else
        echo "[INFO] Backup finished successfully"
    fi
else
    if [[ -n $WAYLAND_DISPLAY || -n $DISPLAY ]]; then
        send_notify "BackUpr" 0 "document-send" "Backup failed!" "[]" 0
    else
        echo "[ERROR] Backup failed!"
    fi
fi

## Unmount encrypted file-system
fusermount -u "$ENCRYPTED_DIR"

# EOF backup.sh
```

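If the script is interrupted halfway, the temporary mountpoint stays mounted.
One way to avoid that is a `trap` on `EXIT` that always unmounts and removes
it. The following is a sketch under a few assumptions: the function name
`with_temp_mountpoint` is invented for illustration, and `mountpoint` as well
as `fusermount` are expected to be installed.

```sh
#!/usr/bin/env bash

# Runs a job with a temporary mountpoint that is cleaned up in any case.
# "$@" is the job to run; it receives the mountpoint path as its last argument.
# The body is a subshell ( ... ), so the EXIT trap fires when the job is done.
with_temp_mountpoint() (
    mount_dir=$(mktemp -d)

    cleanup() {
        # only unmount if something is actually mounted there
        if mountpoint -q "$mount_dir"; then
            fusermount -u "$mount_dir"
        fi
        rmdir "$mount_dir"
    }
    trap cleanup EXIT

    "$@" "$mount_dir"
)

# usage: the job here merely prints the path it was given
with_temp_mountpoint echo "temporary mountpoint:"
```
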
## Restoring a backup

Restoring the backup is relatively easy as well. For simplicity, I'll assume
that the entire backup should be restored. Of course, you can also recover only
specific directories or files; take a look at `rsync`'s options if that is what
you want.

First off we need the `.gocryptfs.reverse.conf` that the encryption tool created
when we initialized the file system for the first time. That file contains meta
information about the encrypted storage, but crucially not the decryption
password. When the file system was mounted, it was placed *unencrypted in plain
text* into the encrypted storage as `gocryptfs.conf` and transferred to the
remote storage.

In case you lost your entire local file system and want to restore it from the
backup, we first need to fetch this configuration file:

```{.sh}
#!/usr/bin/env bash

OLD_HOSTNAME="myoldpc"
TARGET_VERSION=1

# fetch the config file from the backup and store it under its reverse-mode name
scp "user@my.remote.backup:${OLD_HOSTNAME}_${TARGET_VERSION}/gocryptfs.conf" \
  ~/.gocryptfs.reverse.conf
```

Once this is done, we can mount the encrypted file storage again, this time
with write permissions, so we can restore the files from the remote storage
into the encrypted file system. Note that the `gocryptfs` manual describes
reverse mode as a read-only view, so check that your version accepts writes
before relying on this:

```{.sh}
#!/usr/bin/env bash

# create temporary directory as encrypted mountpoint
ENCRYPTED_DIR=$(mktemp -d)

MOUNT_OPTS=(
    --rw                                     # mount in read-write mode
    --reverse                                # use "reverse mode"
    --config "$HOME/.gocryptfs.reverse.conf" # config file
)

# mount home directory (the restore target) to the temporary mountpoint
if gocryptfs "${MOUNT_OPTS[@]}" "$HOME" "$ENCRYPTED_DIR"; then
    echo "$HOME has been mounted in encrypted form to $ENCRYPTED_DIR"
else
    echo "Mounting $HOME to $ENCRYPTED_DIR failed!"
fi
```

Now we can finally start to transfer the files. They will simultaneously show up
as decrypted files in the home directory:

```{.sh}
#!/usr/bin/env bash

OLD_HOSTNAME="myoldpc"
TARGET_VERSION=1

# transfer files
rsync \
    --archive \
    --verbose \
    --update \
    "user@my.remote.backup:${OLD_HOSTNAME}_${TARGET_VERSION}/" \
    "${ENCRYPTED_DIR}/"

# unmount the encrypted storage
fusermount -u "$ENCRYPTED_DIR"
```
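
An alternative that may be worth trying is a restore via a regular (forward)
`gocryptfs` mount: download the encrypted backup as it is and mount that copy,
which then shows the decrypted files. The sketch below assumes the backup folder
still contains its `gocryptfs.conf`; the function name, host and paths are
placeholders, and the encrypted copy needs additional disk space.

```sh
#!/usr/bin/env bash

# Sketch: restore through a forward gocryptfs mount.
# All three arguments are placeholders chosen by the caller:
#   $1 remote backup folder
#   $2 local folder that receives the encrypted copy
#   $3 mountpoint at which the decrypted files appear
restore_via_forward_mount() {
    local remote="$1" cipher_dir="$2" plain_dir="$3"

    mkdir -p "$cipher_dir" "$plain_dir" || return 1

    # 1. download the encrypted backup, including its gocryptfs.conf
    rsync --archive --verbose "$remote" "$cipher_dir/" || return 1

    # 2. mount it; gocryptfs asks for the passphrase set during init
    gocryptfs "$cipher_dir" "$plain_dir" || return 1

    echo "Decrypted files are available in $plain_dir"
    echo "Copy what you need, then run: fusermount -u $plain_dir"
}

# usage (hypothetical host and paths, commented out on purpose):
# restore_via_forward_mount "user@my.remote.backup:myoldpc_1/" \
#     "$HOME/restore/encrypted" "$HOME/restore/plain"
```
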