Shell – Johannes Neubauer

Backup Strategy Overhaul

27. April 2025 Johannes Neubauer

In this post I describe how and why I radically overhauled my backup strategy over the past few years, and which alternatives I consciously ruled out. It is written as a journey (how it went), which hopefully helps you judge whether my decisions could be yours or not (what is OK for me 🤓). I added takeaway sections for more general insights.

TL;DR (too long; didn't read): the final setup:

external USB 3.2 Gen 2×2 RAID enclosure (20 Gb/s) with two WD Red NVMe SSDs in Hardware RAID 1, formatted as EXT4.
initial backup (~ 1.3 TB) taken directly on the Mac via ExtFS (Paragon).
continuous service on a Raspberry Pi 4 B (8 GB RAM, Ubuntu):
- enclosure attached over USB 3.0 (5 Gb/s).
- SuperSpeed USB 3.1 USB-C → USB 3.0 (Type-A) cable, since the Pi has no USB-C
- Mac → Pi is capped by Gigabit Ethernet (1 Gb/s).
BorgBackup:
- via local mount for initial backup, and
- via SSH for continuous service
- → incremental, chunk‑based deduplication before encryption, transfers only deltas.
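
A rough best-case estimate shows why it can pay off to take the initial backup locally instead of over the network. The sketch below assumes the nominal link rates from the list above, decimal units, and no protocol or disk overhead; transfer_seconds is a hypothetical helper name.

```shell
#!/bin/bash
# Best-case transfer time in seconds for SIZE_GB gigabytes over a RATE_GBPS link.
# Assumption: decimal units (1 GB = 8 Gb), no protocol overhead, no disk limits.
transfer_seconds() {
  local size_gb=$1 rate_gbps=$2
  echo $(( size_gb * 8 / rate_gbps ))
}

# ~1.3 TB initial backup over Gigabit Ethernet, USB 3.0 and USB 3.2 Gen 2x2.
for rate in 1 5 20; do
  printf '%2d Gb/s: %5d s\n' "$rate" "$(transfer_seconds 1300 "$rate")"
done
```

For 1.3 TB this gives 10400 s (just under three hours) at 1 Gb/s, 2080 s at 5 Gb/s and 520 s at 20 Gb/s. Real transfers will be slower, but the ratio between the links stays the same.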

Diffing and Patching Large Binary Files Part II

24. February 2022 jonny

After reading my last post a second time I realized that without further explanation the diff-binary.sh and patch-binary.sh scripts look just like a wrapper around a specific rsync call. But there is a little bit more to it. Therefore, this post describes the rationale behind these scripts and enhances them to some extent (with input handling & hashing).

Incremental Backup of Image Files (or: How to Diff and Patch Big Binary Files)

13. February 2022 jonny

More often than expected, there is a problem for which there should be an easy solution, but a short googling session leaves you behind with the hollow feeling that the world let you down… again. But then you pull out your Unix skills to find a solution for the problem on your own.

Update: There is a Part II to this post, which explains the idea behind the solution shown here.

Today is such a day… The problem is as follows: you back up a disk (e.g. the SD card of a Raspberry Pi) with dd like this:

$ sudo dd if=/dev/mmcblk0 of=/media/backup/yyyymmdd-raspi-homebridge.img bs=1M

A backup with dd is a bitwise copy, which takes exactly the space of the disk, no matter how empty the block device is. I.e., the dd image of an SD card with nominally 16 GB takes about 15 GB (the usable space of the disk). If the device is more or less empty, the image consists of a lot of zeros and can be compressed very well with tools like bzip2. In your (i.e., my) case 6 GB are used on the disk. After compressing the image it is less than 2 GB. Sounds great, right? Unfortunately, you are paranoid and want to store the last X backups. Even with a small X, this can get really hungry on your cloud storage. This is the moment when your inner voice says: Wouldn’t it be great to store only the delta between an old and a new backup?
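
The compression effect can be tried out without an SD card. This is a minimal sketch under a few assumptions: a 4 MiB file of zeros stands in for a mostly empty block device, gzip stands in for bzip2, and compress_image is a hypothetical helper name.

```shell
#!/bin/bash
# Stream an image through a compressor instead of storing it uncompressed.
# Assumption: gzip is used here in place of bzip2; GNU dd syntax (bs=1M).
compress_image() {  # usage: compress_image SOURCE TARGET.gz
  dd if="$1" bs=1M 2>/dev/null | gzip -c > "$2"
}

# Demo: a 4 MiB "device" that contains nothing but zeros.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/empty.img" bs=1M count=4 2>/dev/null
compress_image "$demo/empty.img" "$demo/empty.img.gz"
ls -l "$demo"
```

The compressed file is only a tiny fraction of the original size, because long runs of zeros compress extremely well. With a real device, SOURCE would be something like /dev/mmcblk0 and the call would need sudo.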

That means you store the most recent backup completely (compressed), as it is more likely that you will need it than one of the older ones. The older backups are just deltas to the next-newer backup. Each time a new backup is created, the predecessor image is replaced by a diff/delta between it and the new backup.

There must be a solution for this, right? Meh, at least I couldn’t find that solution. If you found it, please comment below. So, I started some experiments…
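
To make the idea of such a reverse delta concrete, here is a toy sketch. It is not the script from this post (which, according to Part II, builds on rsync); it only works for two images of identical size, and it is far too slow for multi-gigabyte images. make_delta and apply_delta are hypothetical names.

```shell
#!/bin/bash
# Toy reverse delta for two images of identical size (assumption!).

# Record, for every differing byte, its offset and the value it had in OLD.
make_delta() {  # usage: make_delta OLD NEW > DELTA
  # cmp -l prints: <1-based offset> <octal byte in OLD> <octal byte in NEW>.
  # cmp exits with 1 when the files differ, which is expected here.
  cmp -l "$1" "$2" | awk '{ print $1, $2 }' || true
}

# Rebuild OLD by copying NEW and writing the recorded bytes back.
apply_delta() {  # usage: apply_delta NEW DELTA RESTORED
  cp "$1" "$3" || return 1
  while read -r offset octal; do
    # printf turns the octal value into one raw byte, dd writes it in place.
    printf "\\$octal" |
      dd of="$3" bs=1 seek=$((offset - 1)) conv=notrunc 2>/dev/null
  done < "$2"
}
```

After make_delta old.img new.img > old.delta, the old image can be deleted; apply_delta new.img old.delta restored.img brings it back when needed. Only the newest image is kept in full, every older one shrinks to its delta.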

Find Missing Files in a Backup 2nd Iteration

2. December 2021 jonny

I already wrote a post about this topic. But just as I do at work, I work in an agile manner at home. So here is an update to the post Find Missing Files in a Backup. The script there was designed to be copied from the clipboard into the terminal. This time, I present a script which you may copy to a file, make executable and reuse easily. Furthermore, it fixes some minor issues with the original version (e.g. handling files with spaces and backup folders which are named differently than the original folder).

For the more general idea of this script, please have a look at the former blog post (see link above). So here it is:

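#!/bin/bash
# Usage: backup-check.sh <original-dir> <backup-dir>
# Prints every top-level entry of the original that is missing in the backup.

src=$1
tgt=$2

# Quote all expansions so that names containing spaces survive.
(cd "$src" && ls) | while IFS= read -r file; do
  # Count matches with the same name anywhere below the backup directory.
  found=$(find "$tgt" -name "$file" | wc -l)
  if [ "$found" -eq 0 ]; then
    echo "$file"
  fi
done

Copy this into a file, e.g. named backup-check.sh, and make it executable: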

$ chmod ugo+x backup-check.sh

Afterwards you can use it like this:

$ ./backup-check.sh original/ backup/

Exciting 🤓.

Azure DevOps Wiki Export

5. June 2021 jonny

Lately I needed to export an Azure DevOps wiki as one PDF. There is a plugin that claims it can do this, and of course you can export each page in the browser and concatenate them with tools like pdftk. Unfortunately, the plugin is at a very early stage and I did not have any control over the Azure DevOps instance. The latter felt like losing…

Hence, I searched for a “computer scientist”-solution. So I downloaded the repo and installed pandoc.

Repair a Damaged Package System after Ubuntu Dist-Upgrade

5. January 2019 jonny

Happy new year.

My blog runs on a VM at Hetzner with an Ubuntu LTS system. That means 5 years of support… I had been running trusty since 2014, so there should be support until 2019. But not every open source project gives you this promise, just the Ubuntanians. So, support for ownCloud ran out last year and I thought that the days between the years are a good time to switch to a new version.

Hence, I did two dist-upgrades one after another, from trusty to xenial and from xenial to the current LTS version bionic (a new LTS version comes out every 2 years). The first upgrade was “successful”, with a lot of configuration adaptation needed afterwards. Then, after everything worked again, I did the second upgrade, which failed because of this issue.

You do not want your system to show you such a message during `do-release-upgrade`.

That is, I had to fix a distro upgrade that failed in between… challenge accepted 🤓.

Change c-time on Unix-Based Systems Based on Filenames

5. January 2019 jonny

For quite some time I have had a paperless office (at home). I still physically file the papers I get, but in addition I scan all paper documents, tag them and put them in a folder. I use a very simple system. For the very recent documents (and the ones that are work in progress) I have a draft folder. Furthermore, there is exactly one document folder per year and I store everything in there (incoming and outgoing documents, scanned ones and ones that I receive by mail, even some emails printed to PDF if they are document-like). Each file follows a common naming scheme. One part of it is relevant for this post: at the beginning of each file name I put the date of the document in the format YYYYMMDD. This way, the documents of a year are ordered chronologically if I sort them by name. There is a lot more to my filing system, and if someone is interested, please leave a comment, but for this post this should be enough about my way of filing documents (digitally).

The issue I would like to address here is that the date when I scanned a file and the “real” date of the document diverge. Sometimes the documents behind two scanned files are in one order in “the real world”, while their scan/creation times are the other way around. I do not like this situation. Therefore, each year when I “finish the year”, I run a script (on macOS) which sets the ctime to the date part in the name of the file (a one-liner, which I spread over 5 lines for better readability):

find . -name "2017*" | while IFS= read -r file; \
  do thedate=$(echo "$file" | \
  sed -E 's/^[^0-9]*([0-9]+).*$/\1/'); \
  touch -t "${thedate}0000" "$file"; \
  done

If you have another Unix-based system with sed, you can use -r instead of -E. I am unsure why this option behaves differently on macOS, although I installed (and use) GNU sed via Homebrew.
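
The two steps of the one-liner (extract the date, then touch the file) can also be split into small functions, which makes the regular expression easier to try out on its own. This is a sketch with hypothetical helper names (date_from_name, set_date_from_name); unlike the one-liner it only looks at the file name, so digits in a directory name are ignored.

```shell
#!/bin/bash
# Extract the first run of digits from the file name (not from the whole path).
date_from_name() {
  basename "$1" | sed -E 's/^[^0-9]*([0-9]+).*$/\1/'
}

# Set access and modification time to midnight of the date in the name.
# Assumption: the name starts with a date in the format YYYYMMDD.
set_date_from_name() {
  local stamp
  stamp=$(date_from_name "$1")
  touch -t "${stamp}0000" "$1"
}

date_from_name "./scans/20170305 invoice.pdf"   # prints 20170305
```

Calling set_date_from_name on each result of find then does the same job as the loop above.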

Exciting 🤓.

Backup Rotation with Date

19. May 2011 jonny

Again (see ring buffer), this is a post that shows an approach I will not need, since my backup script will change encore une fois.

I began to write a backup script using hard links and rsync in order to have incremental backups, inspired by incremental backups with rsync. But after a little bit of hassle with bash scripting, a friend gave me the hint that rsnapshot is out there, and that this little Perl tool is also inspired by Mike Rubel's approach. So I will have to change my backup strategy from ‘push’ to ‘pull’ and make some other subtle changes, but I don’t have to bother with ring buffers and backup rotation, because rsnapshot has already worked that out for me.

Ring Buffers with Bash Script

13. May 2011 jonny

This post is a little bit infantile, since it presents a really, really simple thing, but I wrote the shell code yesterday for a backup script and it seems I will not need it. So I write it down here, so that either I do not forget it (who knows when it will come in handy?) or perhaps someone else says: “Hey, my script could be much easier using this”. In the next few days I will post the code that replaces the code shown in this post, so stay tuned.

Generate HTML from Colored Terminal for Sharing Diffs

4. April 2011 jonny

The version control system git has the nice feature git diff --color-words, which shows the changes on a word-by-word basis, coloring new words green and deleted ones red. The script ansi2html.sh converts colored xterm output to HTML. This way you are able to share your diffs with others: hello world example. For LaTeX you may use latexdiff, which highlights the changes in the generated PDF/DVI output.