Backup – Johannes Neubauer

Backup Final Boss: Apple Photos

2. June 20258. June 2025 Johannes Neubauer Leave a comment

In my last post I gave an introduction to my overall hardware and software setup for backup. But this was the easy part. Backing up several massive Apple Photos libraries—each overflowing with iPhone snapshots, RAW captures from my system camera, videos, panoramas, and AI-generated art—turns out to be a far more complex endeavor than initially anticipated.

For the impatient… You can find the resulting repository, with one iOS solution (which does not completely satisfy my requirements) as well as a macOS solution (which only runs if my MacBook is “awake”, but besides that is an A+), here: https://github.com/juangamnik/apple_photos_backup/

Still with me? Great, let’s dive into the story… Over months of testing, scripting, and wrestling with encrypted storage quirks, a reliable, fully automated solution has finally taken shape. In this post, I’ll walk you through the entire journey: from the early missteps to the final macOS-based workflow that ensures every original file (complete with sidecar and edit metadata) eventually lands safely in an encrypted, deduplicated long-term archive. Whether you’re juggling multiple family libraries or simply want bulletproof Apple Photos backups, I hope this detailed account of trials and triumphs helps you avoid the pitfalls and understand exactly how the tooling works.

Disclaimer: This issue vanishes, if you have a windows or macOS device running 24/7 with enough disk space to deactivate “optimize space”. If true: don’t read 🤓.

By the way: Apple encourages you to backup your iCloud Photos!

Backup Strategy Overhaul

27. April 202527. April 2025 Johannes Neubauer Leave a comment

In this post I describe how and why I radically overhauled my backup strategy over the past few years – and which alternatives I consciously ruled out. It is written as a journey (what it was), which hopefully gives you helpful insight, whether my decisions could be yours or not (what is ok for me 🤓). I added takeaway sections, for more general insights.

TLDR – Too long don’t read… the final setup:

external USB 3.2 Gen 2×2 RAID enclosure (20 Gb/s) with two WD Red NVMe SSDs in Hardware RAID 1, formatted as EXT4.
initial backup (~ 1.3 TB) taken directly on the Mac via ExtFS (Paragon).
continuous service on a Raspberry Pi 4 B (8 GB RAM, Ubuntu):
- enclosure attached over USB 3.0 (5 Gb/s).
- SuperSpeed 3.1 USB C → USB 3.0 (Typ A) cable since Pi has no USB-C
- Mac → Pi is capped by Gigabit Ethernet (1 Gb/s).
BorgBackup:
- via local mount for initial backup, and
- via SSH for continuous service
- → incremental, chunk‑based deduplication before encryption, transfers only deltas.

Diffing and Patching Large Binary Files Part II

24. February 202224. February 2022 jonny Leave a comment

After reading my last post a second time I realized that without further explanation the diff-binary.sh and patch-binary.sh scripts look just like a wrapper around a specific rsync call. But there is a little bit more to it. Therefore, this post describes the rationale behind these scripts and enhances them to some extent (with input handling & hashing).

Incremental Backup of Image Files (or: How to Diff and Patch Big Binary Files)

13. February 202213. February 2022 jonny Leave a comment

More often than expected, there is a problem for which there should be an easy solution, but a short googling session lets you behind with the hollow feeling that the world let you down… again. But then you put out your unix skills to find a solution for the problem on your own.

Update: There is a Part II to this post, which explains the idea behind the solution shown here

Today is such a day… The problem is as follows: you backup a disk (e.g. the sdcard of a raspberry pi) with dd like this:

$ sudo dd if=/dev/mmcblk0 of=/media/backup/yyyymmdd-raspi-homebridge.img bs=1M

A backup with dd is a bitwise copy, which takes exactly the space of the disk, no matter how empty the block device is. I.e., the dd-image of an sdcard with nominally 16GB takes about 15GB (the usable space of the disk). If the device is more or less empty, the image consists of a lot of zeros and can be compressed with tools like bzip2 very well. In your (i.e., my) case 6 GB are used on the disk. After compressing the image it is less than 2 GB. Sounds great, right? Unfortunately, you are paranoid and want to store the last X backups. Even with a small X, this can get really hungry on your cloud storage. This is the time where your inner voice says: Wouldn’t it be great to store the delta of an old to a new backup, only?

That means, you store the complete (compressed) backup of the most current backup, as it is most likely, that you need it than older ones. The older backups are just deltas to the next-newer backup. Each time a new backup is created, the predecessor image is replaced by a diff/delta between it and the new backup.

There must be a solution for this, right? Meh, at least I couldn’t find that solution. If you found it, please comment below. So, I started some experiments…

Find Missing Files in a Backup 2nd Iteration

2. December 20212. December 2021 jonny Leave a comment

I already wrote a post about this topic. But as I do at work, I work in an agile manner at home. So here is an update to the post Find Missing Files in a Backup. The script there has been designed to be copied from the clipboard into the terminal. This time, I present you a script, which you may copy to a file, make it executable and reuse it easily. Further on, it fixes some minor issues with the original version (e.g. handling files with spaces and backup folders which are named differently, than the original folder).

For the more general idea of this script, please have a look at the former blog post (see link above). So here it is:

#!/bin/bash

src=$1
tgt=$2

(cd ${src} && ls) | while read file; do
  found=$(find $tgt -name "$file" | wc -l)
  if [ $found -eq 0 ]; then
    echo $file
  fi
done

Copy this into a file e.g. named backup-check.sh and give it exec rights:

$ chmod ugo+x backup-check.sh

Afterwards you can use it like this:

$ ./backup-check.sh original/ backup/

Exciting 🤓.

Find Missing Files in a Backup

22. February 202122. February 2021 jonny Leave a comment

Long time no write… I had many ideas for blog posts, but no time to write them. Hopefully, this will change soon. Here just a small update… As you know (if you read my blog), I have a “special” way of storing and “backupping” my files. All my documents are stored in a folder named YYYY/ (e.g. 2021) and have the format YYYYMMDD-<some-name>.<some-ending>. I further categorize my files by using macOS tags.

Every year I prepare a folder tax-YYYY/, where I copy all the files with relevance to my tax declaration. But sometimes I am not completely sure whether all the files in there are in my main YYYY/ folders, too. This is, because I copy files from my e-mail (like invoices) and some files, which relate to year YYYY are from YYYY+1, e.g. the proof of social security. So, after finishing the tax declaration I double check, whether every file that belongs to a year made its way into the corresponding folder with a dedicated unix command (here exemplarily for 2019). For sure this can be used to check the completeness of backup folders, too. A recursive variant may even be used to search complete backups for missing duplicates/files. So here is the command.

ls 2019* | while read file; do; \
found=$(find ../2019 -name $file | wc -l); \
if [ $found -eq 0 ]; then; echo $file; fi; \
done

Be aware, that this does only search for a file with the same name. It does not check whether it is not the same file (e.g. due to file changes). This can be done by using a hashing algorithm

Exciting 🤓.